On Wed, Aug 7, 2024 at 7:41 AM Alexander Monakov <amona...@ispras.ru> wrote:
>
>
> On Tue, 6 Aug 2024, Alexander Monakov wrote:
>
> > --- a/libcpp/files.cc
> > +++ b/libcpp/files.cc
> [...]
> > +  pad = HAVE_AVX2 ? 32 : 16;
>
> This should have been
>
> #ifdef HAVE_AVX2
>   pad = 32;
> #else
>   pad = 16;
> #endif

OK with that change.
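
For readers of the archive, a minimal sketch of why the preprocessor form is needed, assuming HAVE_AVX2 is a configure-style macro that is defined when the feature is available and left undefined otherwise. In that case `HAVE_AVX2 ? 32 : 16` refers to an undeclared identifier and does not compile. The helper name `search_pad` is hypothetical and not part of the patch.

```cpp
#include <cstddef>

// Assumption: HAVE_AVX2 is either defined (e.g. via -DHAVE_AVX2=1) or
// not defined at all, as is typical for configure-generated macros.
//
// The original line
//   pad = HAVE_AVX2 ? 32 : 16;
// fails to compile when the macro is undefined, because HAVE_AVX2 is
// then just an undeclared identifier in an ordinary expression.

// Hypothetical helper (not from the patch): selects the pad value
// according to the vector width available at build time.
static std::size_t
search_pad ()
{
  std::size_t pad;
#ifdef HAVE_AVX2
  pad = 32;  // 32-byte (256-bit) vectors
#else
  pad = 16;  // 16-byte (128-bit) vectors
#endif
  return pad;
}
```

Built without any flags this yields 16; built with -DHAVE_AVX2 it yields 32.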

Did you think about an AVX512 version (possibly with 32-byte vectors)?
Perhaps there's a more efficient variant of pshufb/pmovmskb available
there; possibly the load on the branch unit could be lessened by using
masking.

Thanks,
Richard.

> Alexander
