On 09/06/2011 02:25 PM, Loïc Le Loarer wrote:
> Hi Pádraig,
>
> Thank you for your answer.
>
> 2011/9/6 Pádraig Brady <p...@draigbrady.com>
>
>     A few general points.
>     You essentially used Linus' code (albeit by
>     very helpfully isolating the significant differences).
>     It might be easier/required to just include it in gnulib?
>     There are a few files in gnulib that are not copyright of the FSF,
>     so would Nicolas and Linus need to assign copyright?
>
>
> Yes, this is what I did. I don't think that including Linus' code is easier,
> as the functions have a different prototype. Also, sha1, sha256 and sha512
> share the same structure in gnulib, so changing one without changing the
> others would be weird. But if you think it is required, I have no problem
> with that.

Ok, let's just use your patches to gnulib so.
The techniques were fairly generic anyway.

>
> By the way, I have done a test on sha512 and I have improved the speed on the
> same 1GB zero file from 4.5s to 3.9s. Please find the patch attached. So I
> think that, using the same techniques, we could improve the speed of all the
> SHAs.
>
>     For performance testing I've found gcc generates
>     much more deterministic results with a -march
>     as close to native as possible or otherwise
>     the code is very susceptible to alignment issues etc.
>     Your compiler supports -march=native.
>     Note also gcc 4.6 has much better support for your Sandy Bridge CPU,
>     either with -march=native or -march=corei7-avx
>
>
> I tried using gcc-4.6.1 (I recompiled it under my Ubuntu 10.10) but I
> couldn't see any difference. For me, using any combination of -march=native
> or not and gcc 4.4.5 or 4.6.1 doesn't make a difference; all the times are
> within the measurement margin.

OK, that at least confirms the improvement is fairly deterministic.

>
>     As for the SSE version, I would also like to see that included,
>     given the proportion of hardware supporting that these days.
>     I previously noticed a coreutils SSE2 patch here:
>     http://www.arctic.org/~dean/crypto/sha1.html
>     Though we'd probably need some runtime SSE detection to include that.
>
>
> Ok, I could try to work on this. The real problem is to test that compilation
> and SSE detection are done correctly on several platforms. I only have access
> to a few x86 machines; what is the usual way to test more platforms?

It would probably be best to get an account on the GCC compile farm.
http://gcc.gnu.org/wiki/CompileFarm

cheers,
Pádraig.
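
For reference, the runtime SSE detection discussed above could look roughly
like the following. This is only a sketch, assuming GCC or Clang and their
<cpuid.h> helper __get_cpuid on x86. The names have_sse2, sha1_impl_name and
CPUID_EDX_SSE2 are hypothetical and are not taken from gnulib or from any of
the patches in this thread.

```c
#if defined(__i386__) || defined(__x86_64__)
# include <cpuid.h>   /* GCC/Clang helper header for the CPUID instruction */
#endif

/* CPUID leaf 1 reports SSE2 support in bit 26 of EDX. */
#define CPUID_EDX_SSE2 (1u << 26)

/* Hypothetical helper: return 1 if the running CPU reports SSE2, else 0. */
static int
have_sse2 (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;
  /* __get_cpuid returns 0 when the requested leaf is not available. */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  return (edx & CPUID_EDX_SSE2) != 0;
#else
  return 0;   /* not an x86 build: always fall back to the portable C code */
#endif
}

/* Hypothetical dispatcher: name the implementation that would be chosen. */
static const char *
sha1_impl_name (void)
{
  return have_sse2 () ? "sse2" : "generic";
}
```

The check would be done once at startup (or on first use) and the result
cached, so that the hashing loop itself only pays for an indirect call. On
non-x86 builds the helper compiles to a constant 0, so the generic code is
always selected there.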