Re: [PATCH] AArch64: Cleanup alignment macros

Richard Sandiford Fri, 06 Dec 2024 09:52:31 -0800

Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
>> So just to be sure I understand: we still want to align (say) an array
>> of 4 chars to 32 bits so that the LDR & STR are aligned, and an array of
>> 3 chars to 32 bits so that the LDRH & STRH for the leading two bytes are
>> aligned?  Is that right?  We don't seem to take advantage of the padding
>> by doing an LDR & STR for the 3-byte case, either for globals or on the stack.
>
> Taking advantage of padding is possible within the compilation unit for
> data that is defined locally (and not interposable), and always with LTO.
>
>> If so, what's the advantage of aligning (say) a 6-byte array to 64 bits
>> rather than 32 bits, given that we don't use a 64-bit LDR & STR?
>> Could we save more with size < 64 instead of size <= 32?
>
> A common case is a constant string which is compared against some
> argument. Most string functions work on 8 or 16-byte quantities. If we
> ensure the whole array fits in one aligned load, we save time in the
> string function.
>
> Runtime data collected for strlen calls shows that 97+% have 8-byte
> alignment or higher - this kind of overalignment helps achieve that.


Ah, ok.  But aren't we then losing that advantage for 4-byte arrays?
Or are you assuming a 4-byte path too?  Or is strlen just very unlikely
for such small data?

> There are likely some further tweaks we could do in the future: 1/2-byte
> objects are unlikely to benefit even from 4-byte alignment.

Yeah, was wondering about that too (but realised it was outside the
intended scope of the patch).

Thanks,
Richard

> And large objects may benefit from higher alignment (allowing 16-byte
> aligned LDP for loading values or faster memcpy of whole structs).
>
> Cheers,
> Wilco
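
As an illustration of the point above about fitting a whole array in one
aligned load, here is a minimal C sketch.  It is not code from the patch or
from any string routine: the helper name fits_in_one_aligned_load and its
parameters are hypothetical, and the model only checks whether an object's
bytes stay inside one naturally aligned block of a given width.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Returns true if the bytes [addr, addr + size) all lie inside a single
   naturally aligned block of `width` bytes, i.e. one aligned load of that
   width would cover the whole object.  Assumes addr + size does not wrap.  */
static bool
fits_in_one_aligned_load (uintptr_t addr, size_t size, size_t width)
{
  if (width == 0 || size == 0 || size > width)
    return false;

  /* First and last byte must fall in the same width-sized block.  */
  return (addr / width) == ((addr + size - 1) / width);
}
```

Under this model a 6-byte array at address 0 (8-byte aligned) fits in one
8-byte block, while the same array at address 4 (only 4-byte aligned) covers
bytes 4..9 and crosses the boundary at 8.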
