One issue I've noticed is that the crc functions take a char*. That is
fine in other implementations, but gnulib appears to enforce strict
alignment (presumably for portability).

Because of this, we might need to introduce a new function that takes an
unsigned long long* (and then an __m128i* when we do the SSE4.1 option),
because casting a char* to a wider pointer type is not alignment-safe.
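
To illustrate the alignment concern: a caller holding an arbitrary char*
can work out how many leading bytes need byte-at-a-time handling before
the pointer is suitably aligned for word reads. This is only a sketch;
bytes_until_aligned is a hypothetical name, not an existing gnulib
function, and it assumes a C11 compiler for _Alignof.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading bytes to process one at a time
   before P is aligned for unsigned long long.  Never returns more than
   LEN, so short buffers are handled entirely bytewise.  */
static size_t
bytes_until_aligned (const void *p, size_t len)
{
  size_t align = _Alignof (unsigned long long);
  size_t misalign = (uintptr_t) p % align;
  size_t head = misalign ? align - misalign : 0;
  return head < len ? head : len;
}
```

After the head bytes are consumed, the remaining pointer can be handed to
a word-based routine, with any trailing bytes again handled singly.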

Do you know of any workarounds here? Using memcpy() to copy each chunk
into an unsigned long long solves the alignment problem but may add some
performance overhead. If the worst case is still better than the existing
algorithm then it might be worth it, though. It would also make sense to
offer a second function that lets callers pass in a block of
guaranteed-aligned memory.
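
For reference, here is a rough sketch of the memcpy() approach applied to
a slice-by-8 CRC-32. It is not the code from my patch: crc32_init and
crc32_update_sketch are hypothetical names, and the word layout assumes a
little-endian machine (as on x86-64). Compilers typically turn a
fixed-size 8-byte memcpy() into a single load on such targets, so the
overhead may be small, but that would need measuring.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t table[8][256];

/* Build the slice-by-8 tables for the reflected CRC-32 polynomial.  */
static void
crc32_init (void)
{
  for (uint32_t i = 0; i < 256; i++)
    {
      uint32_t c = i;
      for (int k = 0; k < 8; k++)
        c = (c >> 1) ^ (c & 1 ? 0xEDB88320u : 0);
      table[0][i] = c;
    }
  for (uint32_t i = 0; i < 256; i++)
    for (int t = 1; t < 8; t++)
      table[t][i] = (table[t - 1][i] >> 8)
                    ^ table[0][table[t - 1][i] & 0xFF];
}

/* Accepts a buffer of any alignment: memcpy performs each 8-byte load,
   so no char* is ever cast to a wider pointer type.
   Assumption: little-endian byte order.  */
static uint32_t
crc32_update_sketch (uint32_t crc, const char *buf, size_t len)
{
  const unsigned char *p = (const unsigned char *) buf;
  crc = ~crc;
  while (len >= 8)
    {
      uint64_t w;
      memcpy (&w, p, 8);
      w ^= crc;                 /* fold the CRC into the low 4 bytes */
      crc = table[7][w & 0xFF] ^ table[6][(w >> 8) & 0xFF]
            ^ table[5][(w >> 16) & 0xFF] ^ table[4][(w >> 24) & 0xFF]
            ^ table[3][(w >> 32) & 0xFF] ^ table[2][(w >> 40) & 0xFF]
            ^ table[1][(w >> 48) & 0xFF] ^ table[0][w >> 56];
      p += 8;
      len -= 8;
    }
  while (len--)                 /* remaining tail bytes, one at a time */
    crc = (crc >> 8) ^ table[0][(crc ^ *p++) & 0xFF];
  return ~crc;
}
```

A guaranteed-aligned variant would have the same loop body with the
memcpy() replaced by a direct read through an unsigned long long*.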

On Mon, Oct 14, 2024, 18:30 Bruno Haible <br...@clisp.org> wrote:

> Sam Russell wrote:
> > I built from HEAD, named it gzip_vanilla, rebuilt with my CRC code and
> > named it gzip_8_slice. Input file is a 115MB file gzipped (default
> > settings) to 61M
>
> Thanks; that's a comparison from which one can draw conclusions.
>
> > sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/gzip_8_slice -k -d -c
> > large_file.gz > /dev/null
> >
> > real    0m0.319s
> > user    0m0.316s
> > sys     0m0.000s
> > sam@DESKTOP-R64B0KJ:~/gziptest$ time ../code/gzip/gzip_vanilla -k -d -c
> > large_file.gz > /dev/null
> >
> > real    0m0.485s
> > user    0m0.484s
> > sys     0m0.000s
> >
> > Looks to be about a ~35% reduction in time
>
> Wow! That's impressive.
>
> Definitely worth pursuing.
>
> Bruno
