Dear all,

I have rerun the SPEC2006 CPU FP tests after adding "-march=native". Unfortunately, the results are not very good for the new if-converter. I believe this is because the CPU in question [details below] "only" has first-generation AVX and, from what I've been told, at least AVX2 is needed for scatter/gather and/or masked loads/stores, and possibly even AVX512 [the 3rd generation]. As I have written before, in my opinion the new converter would be better than the old one if enough time and effort were spent on it, especially on making it not add unneeded indirections.
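For anyone following along without the earlier context, here is a minimal sketch of the kind of loop at issue. It is only an illustration: the function names are hypothetical, and the second function is a hand-written equivalent of a conditional store, not the output of either converter. As I understand the GCC documentation, "-ftree-loop-if-convert-stores" turns conditional writes into unconditional ones in roughly this spirit, which is only safe when every a[i] is writable and no other thread touches it.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: a[i] is written only when the condition holds.
   The conditional store is what blocks straightforward vectorization. */
void cond_store_branchy(float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (b[i] > 0.0f)
            a[i] = b[i] * 2.0f;
    }
}

/* Hand if-converted form: load the old value, select, and store
   unconditionally.  Every iteration now has the same straight-line
   body, at the price of reading and writing a[i] even when the
   condition is false. */
void cond_store_if_converted(float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        float old_val = a[i];
        float new_val = b[i] * 2.0f;
        a[i] = (b[i] > 0.0f) ? new_val : old_val;
    }
}
```

Both functions leave the array in the same state for single-threaded callers with writable memory. A target with masked stores can keep the store conditional per lane instead of doing the read-select-write shown here.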
First, I will give the totals. Then, I'll give the CPU details for better understanding what "-march=native" did [or at least should have done]. Then, I'll give the per-subtest numbers that Richard requested.

For concision, I will use "Richard's check-in" to refer to the GCC I built from Richard's check-in dated July 10 2015 with Git SHA "cb791e75379bc0c8b10bd13bcb24305c36fd504b" and "git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@225652". [my reason for rebasing the relevant Git check-out to that point: quoting Richard's check-in message: "PR tree-optimization/66823 * tree-if-conv.c (memrefs_read_or_written_unconditionally): Fix inverted predicate."]

All the compilations were done with "-Ofast". The results, all integers, are the number of loops that were vectorized.

Regards,

Abe



Richard's check-in [i.e. *_old_* converter]
no if-conversion-specific flags
-------------------------------
8374

Richard's check-in [i.e. *_old_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
8374

Richard's check-in [i.e. *_old_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
8388

----

patched version of Richard's check-in [i.e. *_new_* converter]
no if-conversion-specific flags
-------------------------------
8275

patched version of Richard's check-in [i.e. *_new_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
8275

patched version of Richard's check-in [i.e. *_new_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
8275



CPU [from "/proc/cpuinfo"]
--------------------------
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
stepping        : 7
microcode       : 0x710
cpu MHz         : 2499.902
cache size      : 15360 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4999.80
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

[similarly for the cores numbered 1...23]

kernel: 3.13.0-57-generic #95-Ubuntu SMP Fri Jun 19 09:28:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux



Richard's check-in [i.e. *_old_* converter]
no if-conversion-specific flags
-------------------------------
410.bwaves: 13
416.gamess: 3837
433.milc: 7
434.zeusmp: 138
435.gromacs: 172
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0

Richard's check-in [i.e. *_old_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3837
433.milc: 7
434.zeusmp: 138
435.gromacs: 172
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0

Richard's check-in [i.e. *_old_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3850
433.milc: 7
434.zeusmp: 138
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0

----

patched version of Richard's check-in [i.e. *_new_* converter]
no if-conversion-specific flags
-------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0

patched version of Richard's check-in [i.e. *_new_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0

patched version of Richard's check-in [i.e. *_new_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0
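
As a cross-check, the per-subtest counts above can be summed and compared with the totals given earlier. The following is a small sketch of that arithmetic; the array and function names are my own and purely illustrative, and the three arrays cover the three distinct result sets [the old converter's first two configurations are identical to each other, as are all three of the new converter's].

```c
#include <assert.h>
#include <stddef.h>

#define N_BENCH 16

/* Order: 410.bwaves, 416.gamess, 433.milc, 434.zeusmp, 435.gromacs,
   436.cactusADM, 437.leslie3d, 444.namd, 450.soplex, 454.calculix,
   459.GemsFDTD, 465.tonto, 470.lbm, 481.wrf, 482.sphinx3, 998.specrand */

/* Old converter, without "-ftree-loop-if-convert-stores". */
const int old_without_stores[N_BENCH] = {
    13, 3837, 7, 138, 172, 261, 92, 0, 1, 436, 275, 943, 0, 2141, 58, 0
};

/* Old converter, with "-ftree-loop-if-convert-stores". */
const int old_with_stores[N_BENCH] = {
    13, 3850, 7, 138, 173, 261, 92, 0, 1, 436, 275, 943, 0, 2141, 58, 0
};

/* New converter, any of the three flag configurations. */
const int new_any_flags[N_BENCH] = {
    13, 3804, 7, 136, 173, 261, 92, 0, 1, 436, 275, 943, 0, 2079, 55, 0
};

/* Sum the vectorized-loop counts of one configuration. */
int sum_counts(const int *counts, size_t n)
{
    assert(counts != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += counts[i];
    return total;
}
```

Summing each array reproduces the totals 8374, 8388 and 8275 respectively. The differences between the old and new converters are confined to 416.gamess, 434.zeusmp, 435.gromacs, 481.wrf and 482.sphinx3.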