On Thu, May 3, 2018 at 8:43 PM, Toon Moene <t...@moene.org> wrote:
> Consider the attached Fortran code (the most expensive routine,
> computation-wise, in our weather forecasting model).
>
> verint.s.7.3 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using release 7.3.
>
> verint.s.8.1 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using the recently released GCC 8.1.
>
> $ wc -l verint.s.7.3 verint.s.8.1
>   7818 verint.s.7.3
>   6087 verint.s.8.1
>
> $ grep vfma verint.s.7.3 | wc -l
> 381
> $ grep vfma verint.s.8.1 | wc -l
> 254
>
> but:
>
> $ grep vfma verint.s.7.3 | grep -v ss |  wc -l
> 127
> $ grep vfma verint.s.8.1 | grep -v ss |  wc -l
> 127
>
> and:
>
> $ grep movaps verint.s.7.3 | wc -l
> 306
> $ grep movaps verint.s.8.1 | wc -l
> 270
>
> Finally:
>
> $ grep zmm verint.s.7.3 | wc -l
> 1494
> $ grep zmm verint.s.8.1 | wc -l
> 0
> $ grep ymm verint.s.7.3 | wc -l
> 379
> $ grep ymm verint.s.8.1 | wc -l
> 1464
>
> I haven't had the opportunity to test this for speed (it is quite
> complicated, as I have to build several support libraries with 8.1, like
> openmpi, netcdf, hdf{4|5}, fftw ...)

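GCC 8 prefers 256-bit AVX vectors by default when tuning for
Skylake-avx512, even though AVX512 is available. That would explain why
the zmm count drops to 0 and the ymm count goes up in your 8.1 output.
You can change that with -mprefer-vector-width=512, or by turning off the
avx256_optimal tuning via -mtune-ctrl=^avx256_optimal.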
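
To check whether zmm usage comes back, something along these lines
should do. This is an untested sketch: the output name verint.s.8.1.w512
and the count_regs helper are made up for illustration, and the rebuild
command is left as a comment because it assumes the same verint.f on the
same Skylake-avx512 host.

```shell
# Sketch only: verint.s.8.1.w512 is a hypothetical output file name.
# Rebuild with 512-bit vectors preferred (needs GCC 8 or later):
#   gfortran -g -O3 -S -march=native -mtune=native \
#       -mprefer-vector-width=512 verint.f -o verint.s.8.1.w512

# count_regs FILE: print the number of lines mentioning ymm and zmm
# registers, i.e. the same counts as "grep ymm FILE | wc -l".
count_regs() {
    printf '%s ymm=%s zmm=%s\n' "$1" "$(grep -c ymm "$1")" "$(grep -c zmm "$1")"
}

# Usage:
#   count_regs verint.s.8.1
#   count_regs verint.s.8.1.w512
```

If the zmm count is non-zero again in the second file, the vector width
preference is what changed between 7.3 and 8.1.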

There are now also measures in place to avoid FMA in certain situations
where it doesn't help performance.

So - performance measurements would be nice to have ;)

Richard.

>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
> Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
