Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
>> So just to be sure I understand: we still want to align (say) an array
>> of 4 chars to 32 bits so that the LDR & STR are aligned, and an array of
>> 3 chars to 32 bits so that the LDRH & STRH for the leading two bytes are
>> aligned? Is that right? We don't seem to take advantage of the padding
>> and do an LDR & STR for the 3-byte case, either for globals or on the stack.
>
> Taking advantage of padding is possible within the compilation unit for
> data that is defined locally (and not interposable), and always with LTO.
>
>> If so, what's the advantage of aligning (say) a 6-byte array to 64 bits
>> rather than 32 bits, given that we don't use a 64-bit LDR & STR?
>> Could we save more with size < 64 instead of size <= 32?
>
> A common case is a constant string which is compared against some
> argument. Most string functions work on 8 or 16-byte quantities. If we
> ensure the whole array fits in one aligned load, we save time in the
> string function.
>
> Runtime data collected for strlen calls shows 97+% has 8-byte alignment
> or higher - this kind of overalignment helps achieving that.

Ah, ok. But aren't we then losing that advantage for 4-byte arrays?
Or are you assuming a 4-byte path too? Or is strlen just very unlikely
for such small data?

> There are likely some further tweaks we could do in the future: 1/2-byte
> objects are unlikely to benefit even from 4-byte alignment.

Yeah, was wondering about that too (but realised it was outside the
intended scope of the patch).

Thanks,
Richard

> And large objects may benefit from higher alignment (allowing 16-byte
> aligned LDP for loading values or faster memcpy of whole structs).
>
> Cheers,
> Wilco
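
To make the "whole array fits in one aligned load" point concrete, here is
a minimal C sketch. It is not taken from the patch: the helper
aligned_words_touched, the object key and the function
key_fits_in_one_load are hypothetical names used only for illustration,
and the 8-byte alignment is requested explicitly with alignas rather than
chosen by the compiler. It counts how many naturally aligned 8-byte words
an object covers, which is what a string routine working on 8-byte
quantities would have to load.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): number of naturally aligned
   8-byte words covered by an object of SIZE bytes starting at ADDR.
   SIZE must be nonzero.  */
size_t aligned_words_touched (uintptr_t addr, size_t size)
{
  uintptr_t first = addr / 8;
  uintptr_t last = (addr + size - 1) / 8;
  return (size_t) (last - first + 1);
}

/* A 6-byte constant string, explicitly over-aligned to 8 bytes to mimic
   the effect under discussion.  */
alignas (8) const char key[6] = "hello";
static_assert (sizeof key <= 8, "key must fit in one 8-byte word");

/* Returns 1 if KEY lies entirely within one aligned 8-byte word.  */
int key_fits_in_one_load (void)
{
  return aligned_words_touched ((uintptr_t) key, sizeof key) == 1;
}
```

With these definitions, a 6-byte object at an 8-byte-aligned address such
as 0x1000 covers one aligned word, while the same object at 0x1004 (only
4-byte aligned) straddles two, so a word-at-a-time string function would
need a second load.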