One issue I've noticed is that the CRC functions take a char*. While this is OK in other implementations, gnulib appears to enforce strict alignment (presumably for portability purposes).
Because of this, we might need to introduce a new function that takes an unsigned long long* (and then an __m128i* when we do the SSE4.1 option), because casting a char* to a wider pointer type doesn't work when the buffer isn't guaranteed to be suitably aligned. Do you know of any workarounds here?

Using memcpy() to copy into an unsigned long long solves the alignment problem but adds a performance overhead. If the worst case is still better than the existing algorithm, it might be worth it though. It would also make sense to offer a second function that lets callers pass in a block of guaranteed-aligned memory.

On Mon, Oct 14, 2024, 18:30 Bruno Haible <br...@clisp.org> wrote:

> Sam Russell wrote:
> > I built from HEAD, named it gzip_vanilla, rebuilt with my CRC code and
> > named it gzip_8_slice. Input file is a 115MB file gzipped (default
> > settings) to 61M
>
> Thanks; that's a comparison from which one can draw conclusions.
>
> > sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/gzip_8_slice -k -d -c
> > large_file.gz > /dev/null
> >
> > real 0m0.319s
> > user 0m0.316s
> > sys 0m0.000s
> > sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/gzip_vanilla -k -d -c
> > large_file.gz > /dev/null
> >
> > real 0m0.485s
> > user 0m0.484s
> > sys 0m0.000s
> >
> > Looks to be about a ~35% reduction in time
>
> Wow! That's impressive.
>
> Definitely worth pursuing.
>
> Bruno
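
P.S. For reference, here is a rough sketch of the memcpy() idea from the top of this mail. It is not the gnulib code: load_u64, crc32_byte, crc32_bytewise and crc32_wordwise are hypothetical names made up for illustration, and the inner loop is a plain bitwise CRC-32 standing in for the real slice-by-8 table lookups. The only point is the load pattern: 8 bytes are copied out of the char* buffer into a local uint64_t, so no misaligned pointer is ever dereferenced.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes from a buffer of any alignment.
   memcpy() into a local avoids dereferencing a misaligned pointer. */
static uint64_t load_u64(const char *p)
{
  uint64_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* One step of a bitwise CRC-32 (reflected polynomial 0xEDB88320).
   Stand-in for the table lookups a slice-by-8 loop would do. */
static uint32_t crc32_byte(uint32_t crc, unsigned char b)
{
  crc ^= b;
  for (int k = 0; k < 8; k++)
    crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
  return crc;
}

/* Reference version: one byte at a time. */
static uint32_t crc32_bytewise(const char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    crc = crc32_byte(crc, (unsigned char) buf[i]);
  return crc ^ 0xFFFFFFFFu;
}

/* Same CRC, but the main loop fetches 8 bytes per iteration through
   load_u64(), so buf may have any alignment. */
static uint32_t crc32_wordwise(const char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  while (len >= 8)
    {
      uint64_t w = load_u64(buf);
      /* Walk the bytes of w in memory order; inspecting an object
         through unsigned char* is allowed and endian-independent. */
      const unsigned char *wb = (const unsigned char *) &w;
      for (int k = 0; k < 8; k++)
        crc = crc32_byte(crc, wb[k]);
      buf += 8;
      len -= 8;
    }
  /* Tail: fewer than 8 bytes left, handle them one by one. */
  while (len--)
    crc = crc32_byte(crc, (unsigned char) *buf++);
  return crc ^ 0xFFFFFFFFu;
}
```

Optimizing compilers typically turn a fixed-size memcpy() like this into a single load on targets that permit unaligned access, so the overhead may be smaller than expected, but that would need measuring on the platforms we care about.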