http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57315
--- Comment #4 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Zack, thanks for reporting this. Crypto algorithms are very interesting cases
for RA.

A lot of performance improvements were made to RA during gcc-4.9 development.
Now on Intel Haswell I have

bash-4.2$ /home/cygnus/vmakarov/build/comparison/4.7-64/bin/gcc -std=c99 -O2 -march=native salsa-test.c && ./a.out
779.132 keys/s
bash-4.2$ /home/cygnus/vmakarov/build/comparison/4.8-64/bin/gcc -std=c99 -O2 -march=native salsa-test.c && ./a.out
778.976 keys/s
bash-4.2$ /home/cygnus/vmakarov/build1/trunk5/64r/bin/gcc -std=c99 -O2 -march=native salsa-test.c && ./a.out
1392.555 keys/s
bash-4.2$ /home/cygnus/vmakarov/build/comparison/4.7-64/bin/gcc -std=c99 -O3 -fwhole-program -march=native salsa-test.c && ./a.out
1375.610 keys/s
bash-4.2$ /home/cygnus/vmakarov/build/comparison/4.8-64/bin/gcc -std=c99 -O3 -fwhole-program -march=native salsa-test.c && ./a.out
1224.177 keys/s
bash-4.2$ /home/cygnus/vmakarov/build1/trunk5/64r/bin/gcc -std=c99 -O3 -fwhole-program -march=native salsa-test.c && ./a.out
1436.539 keys/s

Here, trunk5 is today's GCC trunk.

Unfortunately, the changes in RA are too big and cannot be ported to gcc-4.8.