Modify ci_ep_alloc_request() to return a dynamically allocated request
object, rather than a singleton that's part of the endpoint. This
requires moving various state from the endpoint structure to the request
structure, since we need one copy per request.
The "fast bounce buffer" b_fast is removed by this change rather than
moved to the request object. Instead, we enhance the bounce buffer logic
in ci_bounce()/ci_debounce() to keep the bounce buffer around between
request submissions. This avoids the need to allocate an arbitrarily-
sized bounce buffer up-front, yet avoids incurring the allocation
overhead each time a request is submitted.
A future enhancement would be to actually submit multiple requests to HW
at once. The Linux driver shows that this is possible. That might improve
throughput (depending on the USB protocol in use), since USB could be
performing a transfer to one HW buffer in parallel with whatever SW
actions U-Boot performs on another buffer. However, I have not made this
change as part of this patch, in order to keep SW changes related to
buffer management separate from any change in the way the HW is
programmed.
Signed-off-by: Stephen Warren <swarren@nvidia.com>