https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

--- Comment #32 from Andrew Roberts <andrewm.roberts at sky dot com> ---
For what it's worth, here's what the latest and greatest from the competition
has to offer:

/usr/local/llvm-5.0.1-rc2/bin/clang -march=znver1 -mtune=znver1 -O3 matrix.c -o
matrix
        mult took     887141 clocks

/usr/local/llvm-5.0.1-rc2/bin/clang -march=znver1 -mtune=znver1 -O3
mt19937ar.c -o mt19937ar
mt19937ar took 402282 clocks

/usr/local/llvm-5.0.1-rc2/bin/clang -march=znver1 -mtune=znver1 -Ofast matrix.c
-o matrix
        mult took     760913 clocks

/usr/local/llvm-5.0.1-rc2/bin/clang -march=znver1 -mtune=znver1 -Ofast
mt19937ar.c -o mt19937ar
mt19937ar took 392527 clocks


current gcc-8 snapshot:
/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1  -Ofast matrix.c -o matrix
        mult took     364775 clocks

/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1  -Ofast -o mt19937ar
mt19937ar.c
mt19937ar took 430804 clocks

current gcc-8 snapshot + extra opts to improve znver1 performance:
/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1 -mprefer-vector-width=none
-mno-fma -Ofast matrix.c -o matrix
        mult took     130329 clocks

/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1 -mno-avx2 -Ofast -o
mt19937ar mt19937ar.c
mt19937ar took 387728 clocks

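So gcc loses on mt19937ar.c without -mno-avx2 (430804 clocks vs. 392527 for
clang at -Ofast), but with -mno-avx2 it comes out slightly ahead (387728).
gcc wins big on matrix.c (364775 clocks vs. 760913 for clang at -Ofast),
especially with -mprefer-vector-width=none -mno-fma (130329 clocks).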
