On Mon, May 7, 2018 at 4:07 AM, Paul Eggert <egg...@cs.ucla.edu> wrote:
> Bruno Haible wrote:
>>
>> Oops, I goofed with "git diff". Here's the correct patch to test.
>
>
> I tried those bench-md5 benchmarks on two platforms, with somewhat more
> disappointing results.
>
> I observed a real-time slowdown ranging from 11% (large buffers) to 22x
> (small buffers) on Intel Xeon E3-1225 V2 (circa 2012 CPU), Ubuntu 16.04,
> Linux 4.4.0, glibc 2.23. See attached file ubuntu1604.txt.
>
> I observed a real-time slowdown ranging from 8% (large buffers) to 43x
> (small buffers) on AMD Phenom II X4 910e (circa 2010 CPU), Fedora 28, Linux
> 4.16.5, glibc 2.27. See attached file fedora28.txt.
>
> These numbers compare somewhat unfavorably with your report, where the
> real-time slowdown ranged from 1.5% (large buffers) to 25x (small buffers),
> as reported in <https://lists.gnu.org/r/bug-gnulib/2018-05/msg00035.html>.

Hi all,

I tried all the above, and I can confirm the disappointing results with md5
or with small buffers. This is what happens on my machine, a Lenovo laptop
with an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz running Fedora 27.

With large buffers, all the algorithms except md5 are faster with af_alg:

$ without/gltests/bench-md5 1000000000 1
real 1.520719 user 1.520 sys 0.000
$ with/gltests/bench-md5 1000000000 1
real 1.684162 user 0.000 sys 1.684

$ without/gltests/bench-sha1 1000000000 1
real 1.696258 user 1.696 sys 0.000
$ with/gltests/bench-sha1 1000000000 1
real 1.072500 user 0.000 sys 1.072

$ without/gltests/bench-sha256 1000000000 1
real 4.467676 user 4.468 sys 0.000
$ with/gltests/bench-sha256 1000000000 1
real 2.527936 user 0.009 sys 2.519

$ without/gltests/bench-sha512 1000000000 1
real 2.684985 user 2.685 sys 0.000
$ with/gltests/bench-sha256 1000000000 1
real 2.546133 user 0.004 sys 2.542

For sha1, af_alg becomes faster with buffers of 100k and larger:

$ without/gltests/bench-sha1 100 1000000
real 0.292869 user 0.293 sys 0.000
$ with/gltests/bench-sha1 100 1000000
real 9.153545 user 0.698 sys 8.421

$ without/gltests/bench-sha1 1000 100000
real 0.190652 user 0.191 sys 0.000
$ with/gltests/bench-sha1 1000 100000
real 1.033346 user 0.071 sys 0.963

$ without/gltests/bench-sha1 10000 10000
real 0.183897 user 0.184 sys 0.000
$ with/gltests/bench-sha1 10000 10000
real 0.214090 user 0.003 sys 0.212

$ without/gltests/bench-sha1 100000 1000
real 0.181184 user 0.181 sys 0.000
$ with/gltests/bench-sha1 100000 1000
real 0.131482 user 0.002 sys 0.130

$ without/gltests/bench-sha1 1000000 100
real 0.178751 user 0.179 sys 0.000
$ with/gltests/bench-sha1 1000000 100
real 0.122498 user 0.000

sha256, instead, becomes faster with af_alg with buffers of 10k and larger:

$ without/gltests/bench-sha256 100 1000000
real 0.617181 user 0.617 sys 0.000
$ with/gltests/bench-sha256 100 1000000
real 9.655386 user 0.703 sys 8.950

$ without/gltests/bench-sha256 1000 100000
real 0.470694 user 0.471 sys 0.000
$ with/gltests/bench-sha256 1000 100000
real 1.203199 user 0.091 sys 1.112

$ without/gltests/bench-sha256 10000 10000
real 0.459542 user 0.460 sys 0.000
$ with/gltests/bench-sha256 10000 10000
real 0.360933 user 0.003 sys 0.358

$ without/gltests/bench-sha256 100000 1000
real 0.454326 user 0.454 sys 0.000
$ with/gltests/bench-sha256 100000 1000
real 0.279998 user 0.000 sys 0.280

$ without/gltests/bench-sha256 1000000 100
real 0.451635 user 0.452 sys 0.000
$ with/gltests/bench-sha256 1000000 100
real 0.266343 user 0.001 sys 0.265

$ without/gltests/bench-sha256 10000000 10
real 0.443723 user 0.444 sys 0.000
$ with/gltests/bench-sha256 10000000 10
real 0.260270 user 0.000 sys 0.260

Keep in mind that I have the infamous patch to mitigate the Intel CPU bug,
which adds a big overhead to syscalls, but it will hopefully disappear on
future CPUs:

$ dmesg |grep isolation
[    0.000000] Kernel/User page tables isolation: enabled

-- 
Matteo Croce
per aspera ad upstream
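
P.S. To make the crossover points easier to see, here is a small Python sketch that takes the "real" times from the sha1 and sha256 runs above and reports the with/without ratio per buffer size. It assumes the first argument to the bench programs is the buffer size in bytes; the `crossover` helper and the dictionary names are made up for this illustration and are not part of gnulib.

```python
# "real" times in seconds, copied from the runs in this message.
# Key: buffer size in bytes (assumed to be the first bench argument).
# Value: (without af_alg, with af_alg).
sha1 = {
    100: (0.292869, 9.153545),
    1000: (0.190652, 1.033346),
    10000: (0.183897, 0.214090),
    100000: (0.181184, 0.131482),
    1000000: (0.178751, 0.122498),
}
sha256 = {
    100: (0.617181, 9.655386),
    1000: (0.470694, 1.203199),
    10000: (0.459542, 0.360933),
    100000: (0.454326, 0.279998),
    1000000: (0.451635, 0.266343),
    10000000: (0.443723, 0.260270),
}


def crossover(results):
    """Return the smallest tested buffer size at which af_alg is faster.

    Hypothetical helper for this sketch; returns None if af_alg never wins.
    """
    for size in sorted(results):
        without, with_af_alg = results[size]
        if with_af_alg < without:
            return size
    return None


for name, results in (("sha1", sha1), ("sha256", sha256)):
    for size in sorted(results):
        without, with_af_alg = results[size]
        # A ratio above 1 means af_alg is slower at this buffer size.
        print(f"{name} {size:>9}: with/without = {with_af_alg / without:.2f}")
    print(f"{name}: af_alg first wins at {crossover(results)} bytes")
```

On these figures the first tested size where af_alg wins is 100000 bytes for sha1 and 10000 bytes for sha256. The sizes were only sampled in powers of ten, so the real break-even point lies somewhere between each of those values and the next smaller tested size.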