On Mon, Oct 14, 2024 at 2:26 PM Simon Josefsson via Gnulib discussion
list <bug-gnulib@gnu.org> wrote:
>
> Sam Russell <sam.h.russ...@gmail.com> writes:
>
> > I've noticed that GZIP trails behind zlib in performance and part of this
> > is down to the fact that zlib is using a more efficient CRC32
> > implementation. I've written an implementation of this for gnulib based off
> > the Intel paper at
> > https://static.aminer.org/pdf/PDF/000/432/446/a_systematic_approach_to_building_high_performance_software_based_crc.pdf
> > (the code is mine, written based on the paper, the tables are generated by
> > extending the code from RFC 1952 to generate the lookups for partial
> > bitfields, this can be provided on request but it's not my finest work).
> >
> > The code:
> > https://github.com/samrussell/gnulib/commit/2d5f5d0e131feea6e04cb48d56589537506f91a8
> > (yes, I am aware you don't take contributions via github. I have an open
> > ticket with GNU to get my SSH access fixed so I can have non-anonymous
> > access to the repository on Savannah).
>
> Thanks!  This looks nice, but please add code that generated the tables
> which is important for reproduction.
>
> I am worried about the size increase with the new tables, what do you
> think about making the new approach optional with some #ifdef, which may
> be off by default to prefer your new optimized variant?
>
> You make several changes to the test vectors: please make them as
> ADDITIONS instead.  We need some confidence that the old test vectors
> work.  Since the new code works via alignment, please add one test
> vector per string size: 0, 1, 2, 3, ... up to say 20.  Did you test this
> on 32-bit and 64-bit platforms?  Use valgrind to QA further.

Yes, +1. Changing the existing test was like poking me in the eye with
a finger. Please add additional tests!
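
One way the per-length tests could be sketched: compare the implementation
under test against a slow bit-at-a-time reference for every length from 0
to 20, at several starting offsets so different alignments get exercised.
The names ref_crc32 and count_mismatches below are made up for this
illustration and are not part of gnulib; the function pointer stands in
for whatever entry point the new code exposes.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time reference for the gzip CRC-32 (RFC 1952).
   Slow, but simple enough to serve as an oracle in tests. */
uint32_t
ref_crc32 (const char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= (unsigned char) buf[i];
      for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
  return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical harness: count the (offset, length) pairs for which
   IMPL disagrees with the reference.  Lengths run 0..20 and offsets
   0..7, so unaligned heads and short tails are both covered. */
int
count_mismatches (uint32_t (*impl) (const char *, size_t))
{
  static const char data[] = "The quick brown fox jumps over the lazy dog";
  int mismatches = 0;
  for (size_t off = 0; off < 8; off++)
    for (size_t len = 0; len <= 20; len++)
      if (impl (data + off, len) != ref_crc32 (data + off, len))
        mismatches++;
  return mismatches;
}
```

A test would pass the optimized function to count_mismatches and expect 0.
The existing vectors stay as they are; this only adds coverage.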

> Maybe there is more room for optimization... SSE4.2+ has a hardware
> instruction for CRC.  Support for other CRC-32 would be nice too.  A
> reasonable specification for it would be nice too, I can find plenty of
> definitions in various RFC's but they are all duplicated.
> https://en.wikipedia.org/wiki/Cyclic_redundancy_check is informative.
> If you want to write an IETF RFC on this, maybe we can collaborate :)

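If I recall correctly, the SSE4.2 CRC32 instruction uses polynomial
0x82F63B78, which is CRC-32C (Castagnoli). That is not the polynomial
gzip uses, so the instruction cannot compute gzip's CRC directly. (The
two big ones I am aware of are CRC-32 using polynomial 0xEDB88320 and
CRC-32C using polynomial 0x82F63B78, both written in bit-reflected form.)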
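
To make the difference concrete, here is a minimal sketch of a
bit-at-a-time reflected CRC with the polynomial as a parameter. The
function name crc32_reflected is made up for this example; it assumes the
usual 0xFFFFFFFF initial value and final XOR, which both CRC-32 and
CRC-32C use.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected (LSB-first) 32-bit CRC with a caller-supplied polynomial.
   Initial value and final XOR are both 0xFFFFFFFF. */
uint32_t
crc32_reflected (uint32_t poly, const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= p[i];
      for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1u) ? (crc >> 1) ^ poly : crc >> 1;
    }
  return crc ^ 0xFFFFFFFFu;
}
```

For the standard check input "123456789", polynomial 0xEDB88320 gives
0xCBF43926 and polynomial 0x82F63B78 gives 0xE3069283, so the two are not
interchangeable.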

Jeff
