Thanks for working on this.

Looks like my TODO list is reaping rewards :)

A few general points.
You essentially used Linus' code (while very helpfully
isolating the significant differences).
Might it be easier, or even required, to just include it in gnulib?
There are a few files in gnulib that are not copyright of the FSF,
so would Nicolas and Linus need to assign copyright?

For performance testing I've found that gcc gives
much more deterministic results with a -march
as close to native as possible; otherwise
the code is very susceptible to alignment issues etc.
Your compiler supports -march=native.
Note also that gcc 4.6 has much better support for your Sandy Bridge CPU,
with either -march=native or -march=corei7-avx.

As for the SSE version, I would also like to see that included,
given the proportion of hardware supporting that these days.
I previously noticed a coreutils SSE2 patch here:
http://www.arctic.org/~dean/crypto/sha1.html
Though we'd probably need some runtime SSE detection to include that.
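
To illustrate the kind of runtime detection I mean, here is a minimal
sketch, assuming gcc (or a compatible compiler) on x86 and its <cpuid.h>
header. The names sha1_impl, cpu_has_sse2 and sha1_pick_impl are made up
for this example and are not existing gnulib interfaces; on any other
compiler or architecture the sketch just falls back to the generic code.

```c
#include <stdio.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# include <cpuid.h>   /* gcc/clang helper for the CPUID instruction */
# define HAVE_X86_CPUID 1
#endif

/* Hypothetical selector type, for illustration only.  */
typedef enum { SHA1_IMPL_GENERIC, SHA1_IMPL_SSE2 } sha1_impl;

/* Return 1 if the running CPU reports SSE2, else 0.
   SSE2 is bit 26 of EDX for CPUID leaf 1.  */
static int
cpu_has_sse2 (void)
{
#ifdef HAVE_X86_CPUID
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return (edx >> 26) & 1;
#endif
  return 0;   /* unknown platform: assume no SSE2 */
}

/* Choose an implementation once, at startup.  */
static sha1_impl
sha1_pick_impl (void)
{
  return cpu_has_sse2 () ? SHA1_IMPL_SSE2 : SHA1_IMPL_GENERIC;
}

/* Human-readable name, handy when reporting benchmark results.  */
static const char *
sha1_impl_name (sha1_impl impl)
{
  return impl == SHA1_IMPL_SSE2 ? "sse2" : "generic";
}
```

A caller would run sha1_pick_impl () once and dispatch through the result,
so the SSE2 routine is only ever reached on hardware that reports support.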

cheers,
Pádraig.
