Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
> Can't you do the CRC byte-by-byte until you reach an address that is properly aligned? > See lib/memchr.c for an example. That would be perfect, I'll look over lib/memchr.c. My background is in network engineering and reverse engineering so I am in my element with how everything looks on the
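A minimal sketch of the "byte-by-byte until aligned" idea suggested above. This is not the code from lib/memchr.c or from the proposed patch. The function names `crc32_step` and `crc32_head_body_tail` are hypothetical, and the body loop is only a stand-in for the real slice-by-8 step.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Feed one byte into the CRC, bit at a time
   (reflected polynomial 0xEDB88320, as used by gzip).  */
static uint32_t
crc32_step (uint32_t crc, unsigned char b)
{
  crc ^= b;
  for (int k = 0; k < 8; k++)
    crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
  return crc;
}

/* Hypothetical function, not gnulib's API: split the buffer into an
   unaligned head, an 8-byte-aligned body, and a short tail.  */
uint32_t
crc32_head_body_tail (const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint32_t crc = 0xFFFFFFFFu;

  /* Head: one byte at a time until p is 8-byte aligned.  */
  while (len > 0 && (uintptr_t) p % sizeof (uint64_t) != 0)
    {
      crc = crc32_step (crc, *p++);
      len--;
    }

  /* Body: p is aligned here, so a whole-word load is safe even on
     strict-alignment targets.  memcpy keeps the load valid C.
     A real slice-by-8 would replace the inner loop with eight
     table lookups on this word.  */
  while (len >= sizeof (uint64_t))
    {
      uint64_t word;
      memcpy (&word, p, sizeof word);
      const unsigned char *w = (const unsigned char *) &word;
      for (size_t i = 0; i < sizeof word; i++)
        crc = crc32_step (crc, w[i]);
      p += sizeof word;
      len -= sizeof word;
    }

  /* Tail: fewer than 8 bytes remain.  */
  while (len > 0)
    {
      crc = crc32_step (crc, *p++);
      len--;
    }

  return crc ^ 0xFFFFFFFFu;
}
```

With this split the public function can keep taking a byte pointer, so no second entry point with a wider pointer type is needed.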

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Jeffrey Walton
On Mon, Oct 14, 2024 at 5:11 PM Sam Russell wrote: > > One issue I've noticed is that the crc functions take a char* and while this > is ok in other implementations, gnulib appears to enforce strict alignment > (presumably for portability purposes). > > Because of this, we might need to introduc

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Collin Funk
Sam Russell writes: > One issue I've noticed is that the crc functions take a char* and while > this is ok in other implementations, gnulib appears to enforce strict > alignment (presumably for portability purposes). > > Because of this, we might need to introduce a new function that takes an > u

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
One issue I've noticed is that the crc functions take a char* and while this is ok in other implementations, gnulib appears to enforce strict alignment (presumably for portability purposes). Because of this, we might need to introduce a new function that takes an unsigned long long* (and then _m12

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
> Thanks! This looks nice, but please add code that generated the tables > which is important for reproduction. Of course. > I am worried about the size increase with the new tables, what do you > think about making the new approach optional with some #ifdef, which may > be off by default to pre
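For illustration, one way the table generator requested above could look. This is a sketch under assumptions, not the generator from the patch: it uses the reflected gzip polynomial 0xEDB88320, and the names `slice8_table`, `crc32_slice8_init` and `crc32_slice8` are hypothetical. The eight tables hold 8 × 256 × 4 bytes = 8 KiB, which is the size increase the question above is about.

```c
#include <stddef.h>
#include <stdint.h>

/* 8 tables of 256 entries each: 8 KiB in total.  */
static uint32_t slice8_table[8][256];

/* Build the tables.  Table 0 is the ordinary byte-wise table;
   table s maps a byte to its CRC after s further zero bytes.  */
void
crc32_slice8_init (void)
{
  for (uint32_t n = 0; n < 256; n++)
    {
      uint32_t c = n;
      for (int k = 0; k < 8; k++)
        c = (c >> 1) ^ (0xEDB88320u & -(c & 1u));
      slice8_table[0][n] = c;
    }
  for (uint32_t n = 0; n < 256; n++)
    for (int s = 1; s < 8; s++)
      slice8_table[s][n] = (slice8_table[s - 1][n] >> 8)
                           ^ slice8_table[0][slice8_table[s - 1][n] & 0xFF];
}

/* Slice-by-8 update.  The two 32-bit halves are assembled from
   individual bytes, so the result does not depend on host
   endianness or on the alignment of buf.  */
uint32_t
crc32_slice8 (const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint32_t crc = 0xFFFFFFFFu;

  while (len >= 8)
    {
      uint32_t lo = crc ^ ((uint32_t) p[0] | (uint32_t) p[1] << 8
                           | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24);
      uint32_t hi = ((uint32_t) p[4] | (uint32_t) p[5] << 8
                     | (uint32_t) p[6] << 16 | (uint32_t) p[7] << 24);
      crc = slice8_table[7][lo & 0xFF]
            ^ slice8_table[6][(lo >> 8) & 0xFF]
            ^ slice8_table[5][(lo >> 16) & 0xFF]
            ^ slice8_table[4][lo >> 24]
            ^ slice8_table[3][hi & 0xFF]
            ^ slice8_table[2][(hi >> 8) & 0xFF]
            ^ slice8_table[1][(hi >> 16) & 0xFF]
            ^ slice8_table[0][hi >> 24];
      p += 8;
      len -= 8;
    }

  /* Remaining 0..7 bytes use the byte-wise table only.  */
  while (len-- > 0)
    crc = (crc >> 8) ^ slice8_table[0][(crc ^ *p++) & 0xFF];

  return crc ^ 0xFFFFFFFFu;
}
```

After calling `crc32_slice8_init ()`, `crc32_slice8 ("123456789", 9)` should return the standard CRC-32 check value 0xCBF43926. An `#ifdef` around tables 1 to 7 and the 8-byte loop would leave only the 1 KiB byte-wise path when the option is off.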

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Jeffrey Walton
On Mon, Oct 14, 2024 at 2:26 PM Simon Josefsson via Gnulib discussion list wrote: > > Sam Russell writes: > > > I've noticed that GZIP trails behind zlib in performance and part of this > > is down to the fact that zlib is using a more efficient CRC32 > > implementation. I've written an implement

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Simon Josefsson via Gnulib discussion list
Sam Russell writes: > I've noticed that GZIP trails behind zlib in performance and part of this > is down to the fact that zlib is using a more efficient CRC32 > implementation. I've written an implementation of this for gnulib based off > the Intel paper at > https://static.aminer.org/pdf/PDF/00

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
That sounds good. It looks like there are some subtle differences anyway: the gzip version does everything bit-reversed, and while the Intel paper has constants for that, there are some logic things that would have to change (take hiword then loword instead of loword then hiword, etc.) On Mon, Oct 14, 20
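A small sketch of the "bit reversed" difference mentioned above, using plain bit-at-a-time loops rather than either project's real code. Both function names are hypothetical. Both variants start from 0xFFFFFFFF and invert the result so that only the bit order differs. That is an assumption made for comparison: POSIX cksum starts from zero and appends the length, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* LSB-first ("reflected") form, as gzip and zlib use:
   bytes enter at the low end, the register shifts right,
   and the polynomial constant is 0xEDB88320.  */
uint32_t
crc32_lsb_first (const unsigned char *p, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  while (len-- > 0)
    {
      crc ^= *p++;
      for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
    }
  return crc ^ 0xFFFFFFFFu;
}

/* MSB-first form: bytes enter at the high end, the register
   shifts left, and the polynomial constant is 0x04C11DB7,
   which is 0xEDB88320 with its 32 bits reversed.  */
uint32_t
crc32_msb_first (const unsigned char *p, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  while (len-- > 0)
    {
      crc ^= (uint32_t) *p++ << 24;
      for (int k = 0; k < 8; k++)
        crc = (crc << 1) ^ (0x04C11DB7u & -(crc >> 31));
    }
  return crc ^ 0xFFFFFFFFu;
}
```

The two functions give different results for the same input, so tables and word handling written for one bit order cannot be dropped into the other unchanged. In a sliced implementation, the half of the input word that is XORed with the running CRC swaps along with the shift direction.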

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Pádraig Brady
On 14/10/2024 15:53, Sam Russell wrote: > For reference, coreutils' cksum uses slice by 8 unconditionally since: > https://github.com/coreutils/coreutils/commit/a7533917e perfect, we could just copy this across then? is there a reason

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Bruno Haible via Gnulib discussion list
Sam Russell wrote: > I built from HEAD, named it gzip_vanilla, rebuilt with my CRC code and > named it gzip_8_slice. Input file is a 115MB file gzipped (default > settings) to 61M Thanks; that's a comparison from which one can draw conclusions. > sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
I built from HEAD, named it gzip_vanilla, rebuilt with my CRC code and named it gzip_8_slice. Input file is a 115MB file gzipped (default settings) to 61M sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/gzip_8_slice -k -d -c large_file.gz > /dev/null real 0m0.307s user 0m0.298s sys 0m

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Bruno Haible via Gnulib discussion list
Sam Russell wrote: > As for your question on speed, I noticed between zstd (which uses zlib as a > backend) and gzip there seems to be an improvement of maybe 30-40% for > decompressing a 100MB file (part of this is due to multithreading though), If you compare two scenarios which differ in 4 aspe

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Jim Meyering
On Mon, Oct 14, 2024 at 6:53 AM Bruno Haible via Gnulib discussion list wrote: > Hi Sam, > > Thanks for the contribution offer! > > > I've noticed that GZIP trails behind zlib in performance and part of this > > is down to the fact that zlib is using a more efficient CRC32 > > implementation. > >

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
Hi Bruno, Presumably you've read Pádraig's comment in the other thread that I mistakenly created, there are two interesting things from this: - coreutils is already GNU so no copyright review required, although the code appears to be inside the cksum utility so it's not in a position to be includ

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
> For reference, coreutils' cksum uses slice by 8 unconditionally since: > https://github.com/coreutils/coreutils/commit/a7533917e perfect, we could just copy this across then? is there a reason gnulib wouldn't just include coreutils as a dependency? > Note we don't use it on systems with pclmul,

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Bruno Haible via Gnulib discussion list
Hi Sam, Thanks for the contribution offer! > I've noticed that GZIP trails behind zlib in performance and part of this > is down to the fact that zlib is using a more efficient CRC32 > implementation. How much of a speedup do you obtain in gzip overall (not in CRC32 alone) for large files, throu

Re: Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Pádraig Brady
On 14/10/2024 11:16, Sam Russell wrote: Hi, First off, this is my first GNU contribution so I have *no idea* what I'm doing, feedback is appreciated. I've noticed that GZIP trails behind zlib in performance and part of this is down to the fact that zlib is using a more efficient CRC32 impleme

Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
Hi, First off, this is my first GNU contribution so I have *no idea* what I'm doing, feedback is appreciated. I've noticed that GZIP trails behind zlib in performance and part of this is down to the fact that zlib is using a more efficient CRC32 implementation. I've written an implementation of t

Adding slice-by-4 and slice-by-8 to CRC32

2024-10-14 Thread Sam Russell
Hi, First off, this is my first GNU contribution so I have *no idea* what I'm doing, feedback is appreciated. I've noticed that GZIP trails behind zlib in performance and part of this is down to the fact that zlib is using a more efficient CRC32 implementation. I've written an implementation of t