https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
--- Comment #40 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
(In reply to Joost VandeVondele from comment #37)
> (In reply to Joost VandeVondele from comment #36)
> > #pragma GCC optimize ( "-Ofast -fvariable-expansion-in-unroller
> > -funroll-loops" )
>

Using: (I found it necessary to split into separate lines)

#pragma GCC optimize ( "-Ofast" )
#pragma GCC optimize ( "-funroll-loops" )
#pragma GCC optimize ( "-fvariable-expansion-in-unroller" )

$ gfc -static -Ofast -finline-matmul-limit=0 compare.f90
[jerry@quasar pr51119]$ ./a.out
 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  2000      0.055      0.048      0.042      0.055
    4  2000      0.366      0.236      0.299      0.368
    8  2000      0.628      0.673      1.610      1.833
   16  2000      2.876      2.765      2.821      2.930
   32  2000      4.681      3.382      4.812      4.763
   64  2000      6.742      2.817      6.760      6.764
  128  2000      8.532      3.194      7.852      8.539
  256   477      9.420      3.319      9.053      9.420
  512    59      8.435      2.358      8.319      8.390
 1024     7      8.493      1.368      8.379      8.444
 2048     1      8.499      1.666      8.385      8.448