On Wed, Feb 14, 2024 at 05:09:39PM +0100, Richard Biener wrote:
>
> > On 14.02.2024 at 16:22, Jakub Jelinek <ja...@redhat.com> wrote:
> >
> > On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote:
> >> The following removes the TBAA violation present in iterative_hash.
> >> As we eventually LTO that, it's important to fix.  This also improves
> >> code generation for the >= 12 bytes loop by using | to compose the
> >> 4 byte words, as at least GCC 7 and up can recognize that pattern
> >> and perform a 4 byte load, while the variant with + is not
> >> recognized (not on trunk either); I think we have an enhancement bug
> >> for this somewhere.
> >>
> >> Given we reliably merge, and the bogus "optimized" path might be
> >> only relevant for archs that cannot do misaligned loads efficiently,
> >> I've chosen to keep a specialization for aligned accesses.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> >>
> >> Thanks,
> >> Richard.
> >>
> >> libiberty/
> >> 	* hashtab.c (iterative_hash): Remove TBAA violating handling
> >> 	of aligned little-endian case in favor of just keeping the
> >> 	aligned case special-cased.  Use | for composing a larger word.
> >
> > Have you tried using memcpy into a hashval_t temporary?
> > Just wonder whether you get better or worse code with that compared to
> > the shifts.
>
> I didn't, but I verified I get a single movd on x86-64 when using | instead
> of + with GCC 7 and trunk.
Ok then.

	Jakub