On 14/10/2024 11:16, Sam Russell wrote:
Hi,

First off, this is my first GNU contribution, so I have *no idea* what I'm 
doing; feedback is appreciated.

I've noticed that gzip trails behind zlib in performance, and part of this is down to 
zlib using a more efficient CRC32 implementation. I've written an 
implementation of this for gnulib, based on the Intel paper at 
https://static.aminer.org/pdf/PDF/000/432/446/a_systematic_approach_to_building_high_performance_software_based_crc.pdf
(the code is mine, written from the paper; the tables are generated by extending 
the code from RFC 1952 to produce the lookups for partial bitfields. I can 
provide the generator on request, but it's not my finest work).
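
For readers unfamiliar with the technique, here is a minimal sketch of how 
slice-by-8 tables can be derived from the RFC 1952 table and then used. It is 
not the code from the commit below; the names crc_table, crc32_build_tables and 
crc32_slice8 are made up for illustration, and the input words are assembled 
byte by byte so that the sketch does not depend on host endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the gnulib patch.  */
static uint32_t crc_table[8][256];

/* Table 0 is the byte-wise table from RFC 1952 (reflected polynomial
   0xEDB88320).  Table k holds the CRC of a byte followed by k zero bytes,
   so each table is derived from the previous one.  */
static void
crc32_build_tables (void)
{
  for (uint32_t n = 0; n < 256; n++)
    {
      uint32_t c = n;
      for (int bit = 0; bit < 8; bit++)
        c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
      crc_table[0][n] = c;
    }
  for (uint32_t n = 0; n < 256; n++)
    for (int k = 1; k < 8; k++)
      {
        uint32_t c = crc_table[k - 1][n];
        crc_table[k][n] = crc_table[0][c & 0xFF] ^ (c >> 8);
      }
}

/* Read four bytes as a little-endian 32-bit word.  */
static uint32_t
load_le32 (const unsigned char *p)
{
  return ((uint32_t) p[0] | (uint32_t) p[1] << 8
          | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24);
}

/* Update CRC over LEN bytes at BUF, eight bytes per iteration.
   Call crc32_build_tables first.  Pass 0 as the initial CRC.  */
static uint32_t
crc32_slice8 (uint32_t crc, const unsigned char *buf, size_t len)
{
  crc = ~crc;
  while (len >= 8)
    {
      uint32_t lo = crc ^ load_le32 (buf);
      uint32_t hi = load_le32 (buf + 4);
      crc = (crc_table[7][lo & 0xFF] ^ crc_table[6][(lo >> 8) & 0xFF]
             ^ crc_table[5][(lo >> 16) & 0xFF] ^ crc_table[4][lo >> 24]
             ^ crc_table[3][hi & 0xFF] ^ crc_table[2][(hi >> 8) & 0xFF]
             ^ crc_table[1][(hi >> 16) & 0xFF] ^ crc_table[0][hi >> 24]);
      buf += 8;
      len -= 8;
    }
  /* Remaining bytes use the ordinary byte-wise step.  */
  while (len--)
    crc = crc_table[0][(crc ^ *buf++) & 0xFF] ^ (crc >> 8);
  return ~crc;
}
```

The eight lookups in the main loop are independent of each other, which is 
where the speedup over the byte-at-a-time loop comes from.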

The code:
https://github.com/samrussell/gnulib/commit/2d5f5d0e131feea6e04cb48d56589537506f91a8 
(Yes, I am aware you don't take contributions via GitHub. I have an open ticket 
with GNU to get my SSH access fixed so I can have non-anonymous access to the 
repository on Savannah.)

Considerations:
- robustness/appropriateness: both the zlib and Linux kernel CRC32 
implementations use these slice-by-4 and slice-by-8 algorithms for the 
performance improvements
- free software/patents: the Intel paper is from 2008; zlib had an 
independently discovered version of slice-by-4 from 2002, so it seems unlikely 
that any patents encumber this. The software is my own design and 
I am happy for it to be transferred to GNU and licensed accordingly

Requests for help:
- How do I get someone to review my code and then get it added to the codebase? 
Is this done via Savannah or the mailing list? Do we email around patch files, 
etc.?
- I assume we'll want to gate this functionality based on CPU ability (32-bit 
arithmetic is presumably fine, but 64-bit makes no sense unless running on a 
system with a native 64-bit bus). What is the convention here for enforcing 
this at compile time? I couldn't find the right flag to test for, and I'm 
assuming there's a correct answer that someone knows off the top of their head.
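
One possible shape for such a compile-time gate, as a rough sketch only: it 
uses pointer width from <stdint.h> as a proxy for the native word size, which 
is an assumption and misjudges ABIs such as x32 (64-bit registers, 32-bit 
pointers). CRC_SLICE_WIDTH and crc_slice_width are hypothetical names, not an 
existing gnulib convention.

```c
#include <stdint.h>

/* Hypothetical macro: pick slice-by-8 where pointers are at least 64 bits
   wide, slice-by-4 otherwise.  Pointer width is only a proxy for the
   native word size.  */
#if defined UINT64_MAX && UINTPTR_MAX >= UINT64_MAX
# define CRC_SLICE_WIDTH 8
#else
# define CRC_SLICE_WIDTH 4
#endif

/* Report which variant was selected at compile time.  */
static int
crc_slice_width (void)
{
  return CRC_SLICE_WIDTH;
}
```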

Thanks for your time,

Sam

(PS I sent this before I subscribed to the list so I'm assuming the old one 
just bounced, apologies if I end up creating 2 threads by mistake)

For reference, coreutils' cksum uses slice-by-8 unconditionally since:
https://github.com/coreutils/coreutils/commit/a7533917e

Note we don't use it on systems with pclmul, since:
https://github.com/coreutils/coreutils/commit/4b9118cdb
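
As a hedged illustration of what a run-time check for pclmul can look like 
with GCC or Clang on x86 (a sketch, not a copy of the commit above; 
have_pclmul is a made-up name):

```c
#include <stdbool.h>

/* Hypothetical helper: true if the running CPU advertises PCLMUL.
   __builtin_cpu_supports is a GCC/Clang builtin available on x86 only,
   so other compilers and architectures fall back to false.  */
static bool
have_pclmul (void)
{
#if defined __GNUC__ && (defined __x86_64__ || defined __i386__)
  return __builtin_cpu_supports ("pclmul") != 0;
#else
  return false;
#endif
}
```

A caller would choose the carry-less multiply path when this returns true and 
fall back to the table-driven code otherwise.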

cheers,
Pádraig
