On Thu, Nov 23, 2023 at 1:49 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> On Wed, Nov 22, 2023 at 02:54:13PM +0200, Ants Aasma wrote:
> > For reference, executing the page checksum 10M times on an AMD 3900X CPU:
> >
> > clang-14 -O2                 4.292s (17.8 GiB/s)
> > clang-14 -O2 -msse4.1        2.859s (26.7 GiB/s)
> > clang-14 -O2 -msse4.1 -mavx2 1.378s (55.4 GiB/s)
>
> Nice.  I've noticed similar improvements with AVX2 intrinsics in simd.h.

If you're thinking of supporting AVX2 anywhere, I'd start with the
checksum. There's much less code to review, and less risk.
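
For anyone checking the quoted figures, here is a rough sketch of how the
GiB/s numbers appear to follow from the timings. It assumes each checksum
call covers one 8 KiB page (the default PostgreSQL block size); that
assumption is not stated in the quoted message, but it reproduces the
reported throughput. The function and variable names are only illustrative.

```python
# Assumption: one checksum call processes one 8 KiB page (default BLCKSZ).
PAGE_BYTES = 8192
ITERATIONS = 10_000_000  # "10M times" from the quoted benchmark
GIB = 2**30


def throughput_gib_per_s(seconds: float) -> float:
    """Bytes checksummed per second, expressed in GiB/s."""
    return PAGE_BYTES * ITERATIONS / GIB / seconds


# Wall-clock timings copied from the quoted message.
timings = {
    "clang-14 -O2": 4.292,
    "clang-14 -O2 -msse4.1": 2.859,
    "clang-14 -O2 -msse4.1 -mavx2": 1.378,
}

baseline = timings["clang-14 -O2"]
for flags, seconds in timings.items():
    rate = throughput_gib_per_s(seconds)
    speedup = baseline / seconds
    print(f"{flags:<30} {rate:5.1f} GiB/s  ({speedup:.2f}x vs. -O2)")
```

Under that assumption, the AVX2 build comes out at roughly 3.1x the plain
-O2 build, and the SSE4.1 build at roughly 1.5x.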

