https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #1 from Arnd Bergmann <arnd at linaro dot org> ---
Before posting a new workaround for PR83356 (the workaround is to use -Os
instead of -O2 for this file), I retested the performance numbers as well, and
got slightly different numbers this time. I don't know what caused that
difference, but this is what I see now:


                      -O2     -Os
      gcc-6.3.1       14.9    15.1
      gcc-7.0.1       14.7    15.3
      gcc-7.1.1       15.3    14.7
      gcc-7.2.1       16.8    15.9
      gcc-8.0.0       15.5    15.6

In particular, the gcc-7.1.1 results are a bit worse than they were, leading to
a less significant regression from 7.1.1 to 7.2.1, and the numbers are now
closer to what I saw with libressl. In both cases, we still have a 5% to 9%
regression between gcc-7.1.1 (20170717) and gcc-7.2.1 (20180102), and a 14% to
23% regression between 6.3.1 and 7.2.1.

I also found my mistake in the libressl numbers I showed in comment #1: they
were listed exactly a factor of 3 higher than they should have been, and the
actual results are close to the kernel implementation. I've measured these again
now as well and got the following results, using the same compilers as above:

                      -O2     -Os
      gcc-6.3.1       16.7    16.7
      gcc-7.0.1       17.5    16.0
      gcc-7.1.1       17.5    16.0
      gcc-7.2.1       17.6    16.0
      gcc-8.0.0       16.8    15.5

To reproduce with libressl, one could use the following steps:

$ git clone https://github.com/libressl-portable/portable.git
$ cd portable
$ ./autogen.sh
$ sed -i 's/undef FULL_UNROLL/define FULL_UNROLL/' crypto/aes/aes_locl.h
$ CC=x86_64-linux-gcc-7.2.1 ./configure --disable-asm
$ make -sj8
$ ./apps/openssl/openssl speed aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     168004.61k   174024.74k   174855.76k   176270.13k   176608.14k
$ CC=x86_64-linux-gcc-6.3.1 ./configure --disable-asm
$ touch crypto/aes/aes_core.c 
$ make -sj8
$ ./apps/openssl/openssl speed aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     175366.81k   182261.29k   183131.80k   184369.21k   184611.37k
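
To turn two such result lines into a relative difference, something along the
following lines could be used. This is only a sketch: rel_diff is a
hypothetical helper name, not part of libressl or openssl, and it looks only at
the last column (8192-byte blocks) of a single run each.

```shell
# Hypothetical helper: relative throughput difference in percent between two
# `openssl speed` result lines, based on the last column (8192-byte blocks).
rel_diff() {
    # $1 = baseline result line, $2 = result line to compare against it
    printf '%s\n%s\n' "$1" "$2" | awk '
        { v = $NF; sub(/k$/, "", v); val[NR] = v + 0 }   # strip trailing "k"
        END { printf "%.1f\n", (val[2] - val[1]) / val[1] * 100 }'
}

# Result lines copied from the two runs above.
gcc721='aes-256 cbc     168004.61k   174024.74k   174855.76k   176270.13k   176608.14k'
gcc631='aes-256 cbc     175366.81k   182261.29k   183131.80k   184369.21k   184611.37k'

rel_diff "$gcc721" "$gcc631"
```

For the two runs shown, this prints 4.5, i.e. the gcc-6.3.1 build processed
about 4.5% more bytes per second than the gcc-7.2.1 build at 8192-byte blocks.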
