Stack must be aligned to 8 bytes on PXA (possibly all armv5te) for LDRD/STRD
instructions. In case LDRD/STRD is issued on an unaligned address, the behaviour
is undefined.
The issue was observed when working with the NAND code, which was rendered
disfunctional. Also, the vsprintf() function had serious problems with printing
64bit wide long longs. After aligning the stack, this wrong behaviour is no
longer present.