[Bug libfortran/78379] Processor-specific versions for matmul

jvdelisle at gcc dot gnu.org Thu, 17 Nov 2016 11:58:02 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78379


Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #3 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
I applied your second patch.

I do not see any improvement; the results are actually worse than with current
trunk, so I must be missing something. This is the same machine I used for the
results shown in 51119. It does have AVX.

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf
eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave
avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext
perfctr_core perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock
nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter
pfthreshold
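
As a quick cross-check of the flags listed above, here is a minimal Python sketch that splits the pasted `flags` entry into a set and tests a few SIMD-related names. The helper name `cpu_features` and the list of names to check are illustrative choices, not part of the patch or of libgfortran.

```python
def cpu_features(flags_text):
    """Return the feature names from a /proc/cpuinfo-style 'flags' entry."""
    # Drop the "flags :" label when present, then split on any whitespace.
    _, _, rest = flags_text.partition(":")
    body = rest if rest else flags_text
    return set(body.split())


# Flags entry copied from the message above.
FLAGS = """
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf
eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave
avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext
perfctr_core perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock
nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter
pfthreshold
"""

features = cpu_features(FLAGS)

# Names chosen for illustration; extend the tuple as needed.
for name in ("avx", "avx2", "fma", "fma4", "xop"):
    print(f"{name:5s} {'yes' if name in features else 'no'}")
```

For this machine the set contains avx, fma, fma4 and xop, but not avx2.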

$ gfc -static-libgfortran -finline-matmul-limit=0 -Ofast -o compare_mavx compare.f90
$ ./a.out 
 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  2000      5.043      0.045      0.091      0.150
    4  2000      1.417      0.235      0.353      0.325
    8  2000      2.016      0.634      0.862      2.021
   16  2000      5.332      2.834      2.239      2.929
   32  2000      6.169      3.496      1.931      3.289
   64  2000      2.656      2.836      2.655      2.657
  128  2000      2.898      3.286      2.901      2.901
  256   477      3.157      3.429      3.156      3.157
  512    59      3.082      2.356      3.133      3.126
 1024     7      3.102      1.363      3.144      3.136
 2048     1      3.099      1.685      3.144      3.140
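
For readers who want to relate the Size and Loops columns to the GFLOPS figures, the following Python sketch shows one conventional way such a number is derived. It assumes the usual count of 2*n^3 floating-point operations per n-by-n matrix product; the formula actually used in compare.f90 is not shown in this message, so treat both the operation count and the function name `gigaflops` as assumptions.

```python
def gigaflops(n, loops, seconds):
    """Estimate GFLOPS for `loops` products of two n-by-n matrices.

    Assumes the conventional 2*n**3 operation count (n**3 multiplies plus
    n**3 adds); the benchmark in the message may count differently.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2.0 * n**3 * loops
    return flops / seconds / 1e9


def implied_seconds(n, loops, gflops):
    """Invert the estimate: elapsed time implied by a reported GFLOPS value."""
    if gflops <= 0:
        raise ValueError("GFLOPS value must be positive")
    return 2.0 * n**3 * loops / (gflops * 1e9)
```

Under that assumption, `implied_seconds(1024, 7, 3.102)` gives the run time implied by the size-1024 row of the "Matmul fixed explicit" column.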
