By saving a pointer to the tail of the list,
we don't need to traverse the entire command
queue before we're able to append an item to
it.
With this patch, I see a 10% improvement when
using the embedded XDS100v2 on AM437x IDK board
to load a 4MiB binary (linux zImage) to DDR
with load_image.