On Thu, May 3, 2018 at 8:43 PM, Toon Moene <t...@moene.org> wrote:
> Consider the attached Fortran code (the most expensive routine,
> computation-wise, in our weather forecasting model).
>
> verint.s.7.3 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using release 7.3.
>
> verint.s.8.1 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using the recently released GCC 8.1.
>
> $ wc -l verint.s.7.3 verint.s.8.1
>   7818 verint.s.7.3
>   6087 verint.s.8.1
>
> $ grep vfma verint.s.7.3 | wc -l
> 381
> $ grep vfma verint.s.8.1 | wc -l
> 254
>
> but:
>
> $ grep vfma verint.s.7.3 | grep -v ss | wc -l
> 127
> $ grep vfma verint.s.8.1 | grep -v ss | wc -l
> 127
>
> and:
>
> $ grep movaps verint.s.7.3 | wc -l
> 306
> $ grep movaps verint.s.8.3 | wc -l
> 270
>
> Finally:
>
> $ grep zmm verint.s.7.3 | wc -l
> 1494
> $ grep zmm verint.s.8.1 | wc -l
> 0
> $ grep ymm verint.s.7.3 | wc -l
> 379
> $ grep ymm verint.s.8.1 | wc -l
> 1464
>
> I haven't had the opportunity to test this for speed (is quite complicated,
> as I have to build several support libraries with 8.1, like openmpi, netcdf,
> hdf{4|5}, fftw ...)

GCC 8 has changes to prefer AVX256 by default for Skylake-avx512, even
with AVX512 available. You can change that with -mprefer-vector-width=512
or by changing the avx256_optimal tune via -mtune-ctrl=^avx256_optimal.

There are now also measures in place to avoid fma in certain situations
where it doesn't help performance.

So - performance measurements would be nice to have ;)

Richard.

>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
> Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
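
As a rough sketch of how the -mprefer-vector-width=512 option mentioned above could be checked with GCC 8, the script below builds the routine twice and repeats the grep counts from the original message on both outputs. The output file names (verint.s.default, verint.s.w512) and the summarize helper are made up for illustration, and it assumes verint.f is in the current directory. Whether zmm registers come back also depends on what -march=native resolves to on the build machine.

```shell
#!/bin/sh
# Sketch only: file names and the helper name are illustrative assumptions.

# Print line counts for a few patterns in one assembly file,
# mirroring the "grep ... | wc -l" commands from the original message.
summarize() {
    f=$1
    zmm=$(grep -c zmm "$f")
    ymm=$(grep -c ymm "$f")
    vfma=$(grep -c vfma "$f")
    vfma_non_ss=$(grep vfma "$f" | grep -vc ss)
    echo "$f: zmm=$zmm ymm=$ymm vfma=$vfma vfma_non_ss=$vfma_non_ss"
}

# Only attempt the builds when gfortran and the source file are present.
if command -v gfortran >/dev/null 2>&1 && [ -f verint.f ]; then
    flags="-g -O3 -S -march=native -mtune=native"
    # GCC 8 default (prefers 256-bit vectors on Skylake-avx512).
    gfortran $flags verint.f -o verint.s.default
    # Same flags, but asking for 512-bit vectors.
    gfortran $flags -mprefer-vector-width=512 verint.f -o verint.s.w512
    summarize verint.s.default
    summarize verint.s.w512
fi
```

This only compares the generated assembly; it says nothing about speed, which still needs a timed run of the model.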