On Wed, Oct 2, 2024 at 8:26 PM Jonathan Wakely <jwak...@redhat.com> wrote:
>
> On Wed, 2 Oct 2024 at 19:16, Jonathan Wakely <jwak...@redhat.com> wrote:
> >
> > On Wed, 2 Oct 2024 at 19:15, Dmitry Ilvokhin <d...@ilvokhin.com> wrote:
> > >
> > > > Instead of looping over every byte of the tail, unroll the loop manually
> > > > using a switch statement; compilers (at least GCC and Clang) will then
> > > > generate a jump table [1], which is faster on a microbenchmark [2].
> > >
> > > [1]: https://godbolt.org/z/aE8Mq3j5G
> > > [2]: https://quick-bench.com/q/ylYLW2R22AZKRvameYYtbYxag24
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >         * libstdc++-v3/libsupc++/hash_bytes.cc (load_bytes): unroll
> > >           loop using switch statement.
> > >
> > > Signed-off-by: Dmitry Ilvokhin <d...@ilvokhin.com>
> > > ---
> > >  libstdc++-v3/libsupc++/hash_bytes.cc | 27 +++++++++++++++++++++++----
> > >  1 file changed, 23 insertions(+), 4 deletions(-)
> > >
> > > > diff --git a/libstdc++-v3/libsupc++/hash_bytes.cc b/libstdc++-v3/libsupc++/hash_bytes.cc
> > > index 3665375096a..294a7323dd0 100644
> > > --- a/libstdc++-v3/libsupc++/hash_bytes.cc
> > > +++ b/libstdc++-v3/libsupc++/hash_bytes.cc
> > > @@ -50,10 +50,29 @@ namespace
> > >    load_bytes(const char* p, int n)
> > >    {
> > >      std::size_t result = 0;
> > > -    --n;
> > > -    do
> > > -      result = (result << 8) + static_cast<unsigned char>(p[n]);
> > > -    while (--n >= 0);
> >
> > Don't we still need to loop, for the case where n >= 8? Otherwise we
> > only hash the first 8 bytes.
>
> Ah, but it's only ever called with load_bytes(end, len & 0x7)

The compiler should do such transforms itself. You probably want to tell
it that n < 8, though; it likely doesn't (always) know that.
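
A minimal sketch of what such a hint could look like, keeping a byte loop
and only adding the range information. The name load_bytes_hinted is
hypothetical and this is not the code in hash_bytes.cc; it assumes every
caller passes len & 0x7, and it uses __builtin_unreachable(), which is a
GCC and Clang builtin. Whether the optimizer then unrolls the loop or
emits a jump table would still need checking on the generated code.

```cpp
#include <cstddef>

// Hypothetical variant of load_bytes, for illustration only.
inline std::size_t
load_bytes_hinted(const char* p, int n)
{
  // Assumption: callers only pass len & 0x7, so n is in [0, 7].
  // Reaching this branch is undefined behaviour, which lets the
  // optimizer treat the range as a known fact.
  if (n < 0 || n > 7)
    __builtin_unreachable();

  std::size_t result = 0;
  // Same byte order as the original loop: p[0] ends up in the lowest
  // byte. The cast through unsigned char avoids sign extension on
  // targets where plain char is signed.
  for (int i = n - 1; i >= 0; --i)
    result = (result << 8) + static_cast<unsigned char>(p[i]);
  return result;
}
```

With the hint in place the trip count is bounded by 7, which is the
information the loop version otherwise lacks.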

>
>
> >
> > > +    switch(n & 7)
> > > +      {
> > > +      case 7:
> > > +       result |= std::size_t(p[6]) << 48;
> > > +       [[gnu::fallthrough]];
> > > +      case 6:
> > > +       result |= std::size_t(p[5]) << 40;
> > > +       [[gnu::fallthrough]];
> > > +      case 5:
> > > +       result |= std::size_t(p[4]) << 32;
> > > +       [[gnu::fallthrough]];
> > > +      case 4:
> > > +       result |= std::size_t(p[3]) << 24;
> > > +       [[gnu::fallthrough]];
> > > +      case 3:
> > > +       result |= std::size_t(p[2]) << 16;
> > > +       [[gnu::fallthrough]];
> > > +      case 2:
> > > +       result |= std::size_t(p[1]) << 8;
> > > +       [[gnu::fallthrough]];
> > > +      case 1:
> > > +       result |= std::size_t(p[0]);
> > > +      };
> > >      return result;
> > >    }
> > >
> > > --
> > > 2.43.5
> > >
>
