Re: [PATCH] AArch64: Cleanup alignment macros

Wilco Dijkstra Fri, 06 Dec 2024 09:47:39 -0800

Hi Richard,

> So just to be sure I understand: we still want to align (say) an array
> of 4 chars to 32 bits so that the LDR & STR are aligned, and an array of
> 3 chars to 32 bits so that the LDRH & STRH for the leading two bytes are
> aligned?  Is that right?  We don't seem to take advantage of the padding
> and do an LDR & STR for the 3-byte case, either for globals or on the stack.


Taking advantage of padding is possible within the compilation unit for
data that is defined locally (and not interposable), and always with LTO.

> If so, what's the advantage of aligning (say) a 6-byte array to 64 bits
> rather than 32 bits, given that we don't use a 64-bit LDR & STR?
> Could we save more with size < 64 instead of size <= 32?

A common case is a constant string which is compared against some
argument. Most string functions work on 8 or 16-byte quantities. If we
ensure the whole array fits in one aligned load, we save time in the
string function.
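
To make the "fits in one aligned load" point concrete, here is a minimal
sketch in C. The helper name fits_in_aligned_window and the example
addresses are hypothetical, for illustration only, and not part of the
patch. It checks whether an object lies entirely inside one naturally
aligned window, e.g. the 8 bytes a string function would load at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the `size` bytes starting at `addr`
   lie inside a single naturally aligned window of `window` bytes.
   `window` must be a power of two (8 or 16 for typical string loads). */
static bool fits_in_aligned_window(uintptr_t addr, size_t size, size_t window)
{
    assert(window != 0 && (window & (window - 1)) == 0);

    if (size == 0)
        return true;    /* an empty object trivially fits */
    if (size > window)
        return false;   /* too large for one load at any alignment */

    /* Compare the window containing the first byte with the one
       containing the last byte. */
    uintptr_t mask = ~(uintptr_t)(window - 1);
    return (addr & mask) == ((addr + size - 1) & mask);
}
```

For a 6-byte string such as "hello" plus its terminator, an 8-byte
aligned address like 0x1000 keeps all bytes in one 8-byte window, while
an address that is only 4-byte aligned, like 0x1004, covers 0x1004 to
0x1009 and so straddles two windows.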

Runtime data collected for strlen calls shows that 97+% have 8-byte
alignment or higher; this kind of overalignment helps achieve that.

There are likely some further tweaks we could make in the future: 1- or
2-byte objects are unlikely to benefit even from 4-byte alignment, and
large objects may benefit from higher alignment (allowing 16-byte aligned
LDP for loading values or faster memcpy of whole structs).

Cheers,
Wilco
