https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72861

            Bug ID: 72861
           Summary: [7 Regression] 25% tramp3d-v4 performance regression on ppc64le
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: trippels at gcc dot gnu.org
  Target Milestone: ---
              Host: powerpc64le-unknown-linux-gnu
            Target: powerpc64le-unknown-linux-gnu
             Build: powerpc64le-unknown-linux-gnu

Performance of tramp3d-v4 regressed more than 25% compared to gcc-6
on ppc64le (gcc112):

gcc-6:

trippels@gcc2-power8 ~ % ~/gcc_6/usr/local/bin/g++ -w -Ofast -mlra -mcpu=power8 tramp3d-v4.cpp

 Performance counter stats for './a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20' (5 runs):

       1972.550946      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.22% )
               159      context-switches          #    0.081 K/sec                    ( +-  0.90% )
                 0      cpu-migrations            #    0.000 K/sec
             1,224      page-faults               #    0.621 K/sec                    ( +-  0.02% )
     6,748,308,064      cycles                    #    3.421 GHz                      ( +-  0.22% ) [66.46%]
       102,294,018      stalled-cycles-frontend   #    1.52% frontend cycles idle     ( +-  3.23% ) [49.91%]
     4,241,962,795      stalled-cycles-backend    #   62.86% backend  cycles idle     ( +-  0.42% ) [50.41%]
     7,902,269,951      instructions              #    1.17  insns per cycle
                                                  #    0.54  stalled cycles per insn  ( +-  0.17% ) [67.10%]
       740,198,353      branches                  #  375.249 M/sec                    ( +-  0.12% ) [50.14%]
        12,209,406      branch-misses             #    1.65% of all branches          ( +-  0.25% ) [49.82%]

       1.973964281 seconds time elapsed                                          ( +-  0.22% )

gcc-7:

trippels@gcc2-power8 ~ % ~/gcc_7/usr/local/bin/g++ -w -Ofast -mlra -mcpu=power8 tramp3d-v4.cpp

 Performance counter stats for './a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20' (5 runs):

       2677.865248      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.84% )
               163      context-switches          #    0.061 K/sec                    ( +-  1.77% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
             2,092      page-faults               #    0.781 K/sec                    ( +-  0.03% )
     9,149,015,944      cycles                    #    3.417 GHz                      ( +-  0.92% ) [66.65%]
       105,804,553      stalled-cycles-frontend   #    1.16% frontend cycles idle     ( +-  5.21% ) [50.12%]
     6,383,265,282      stalled-cycles-backend    #   69.77% backend  cycles idle     ( +-  1.30% ) [50.31%]
     8,980,496,614      instructions              #    0.98  insns per cycle
                                                  #    0.71  stalled cycles per insn  ( +-  0.32% ) [66.96%]
       682,369,238      branches                  #  254.818 M/sec                    ( +-  0.25% ) [49.93%]
        10,159,864      branch-misses             #    1.49% of all branches          ( +-  0.61% ) [49.82%]

       2.679415575 seconds time elapsed                                          ( +-  0.84% )
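
For readers who want to see how the two runs compare, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of the original report: the figures are copied from the perf output above, and the dictionary keys and helper names (`ratio`, `ipc`) are made up here.

```python
# Figures copied from the two `perf stat` runs in this report.
# The dictionary keys are illustrative names, not perf identifiers.
gcc6 = {
    "task_clock_ms": 1972.550946,
    "cycles": 6_748_308_064,
    "instructions": 7_902_269_951,
}
gcc7 = {
    "task_clock_ms": 2677.865248,
    "cycles": 9_149_015_944,
    "instructions": 8_980_496_614,
}


def ratio(key):
    """Return the gcc-7 value divided by the gcc-6 value for one counter."""
    return gcc7[key] / gcc6[key]


def ipc(run):
    """Instructions per cycle for a single run."""
    return run["instructions"] / run["cycles"]


slowdown = ratio("task_clock_ms")      # how much longer the gcc-7 binary runs
time_saved_by_gcc6 = 1.0 - 1.0 / slowdown  # same gap, seen from the gcc-6 side

print(f"task-clock ratio (gcc-7 / gcc-6): {slowdown:.3f}")
print(f"gcc-6 runtime advantage:          {time_saved_by_gcc6:.1%}")
print(f"cycles ratio:                     {ratio('cycles'):.3f}")
print(f"instructions ratio:               {ratio('instructions'):.3f}")
print(f"IPC gcc-6: {ipc(gcc6):.2f}   IPC gcc-7: {ipc(gcc7):.2f}")
```

Read this way, the gcc-7 binary executes somewhat more instructions, but its runtime grows by a larger factor. That suggests most of the loss comes from lower instructions per cycle, which would be consistent with the higher backend stall percentage reported above.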