https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78379
Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #3 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
I did apply your second patch. I do not get any improvement, and results are
diminished from current trunk, so I am missing something.

This is the same machine I used for the results shown in 51119. It does have
AVX.

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp
lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni
pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c
lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch
osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core
perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock nrip_save
tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

$ gfc -static-libgfortran -finline-matmul-limit=0 -Ofast -o compare_mavx compare.f90
$ ./a.out
 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  2000      5.043      0.045      0.091      0.150
    4  2000      1.417      0.235      0.353      0.325
    8  2000      2.016      0.634      0.862      2.021
   16  2000      5.332      2.834      2.239      2.929
   32  2000      6.169      3.496      1.931      3.289
   64  2000      2.656      2.836      2.655      2.657
  128  2000      2.898      3.286      2.901      2.901
  256   477      3.157      3.429      3.156      3.157
  512    59      3.082      2.356      3.133      3.126
 1024     7      3.102      1.363      3.144      3.136
 2048     1      3.099      1.685      3.144      3.140