https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #54 from Rogério de Souza Moraes <rogerio.souza at gmail dot com> 
---
Hi everyone,

we continue to see compile-time performance degrade with each newer compiler
version when building the source file 'testcase-v1.c', which is available in
the attachment above.

Could you please check whether the test case that I attached reproduces the
same issue or is a different one?

Below are the results that I got:

# GCC 4.4.7
-bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign
-mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Execution times (seconds)
 callgraph construction:   0.02 ( 0%) usr   0.01 ( 4%) sys   0.03 ( 0%) wall   
2535 kB ( 1%) ggc
 callgraph optimization:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
  11 kB ( 0%) ggc
 cfg cleanup           :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.07 ( 1%) wall   
 151 kB ( 0%) ggc
 trivially dead code   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
   0 kB ( 0%) ggc
 df reaching defs      :   0.40 ( 4%) usr   0.01 ( 4%) sys   0.40 ( 4%) wall   
   0 kB ( 0%) ggc
 df live regs          :   0.33 ( 3%) usr   0.00 ( 0%) sys   0.37 ( 4%) wall   
   0 kB ( 0%) ggc
 df live&initialized regs:   0.08 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 1%) wall 
     0 kB ( 0%) ggc
 df use-def / def-use chains:   0.17 ( 2%) usr   0.00 ( 0%) sys   0.16 ( 2%)
wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.25 ( 3%) usr   0.00 ( 0%) sys   0.22 ( 2%) wall 
  2519 kB ( 1%) ggc
 register information  :   0.13 ( 1%) usr   0.00 ( 0%) sys   0.11 ( 1%) wall   
   0 kB ( 0%) ggc
 alias analysis        :   0.06 ( 1%) usr   0.00 ( 0%) sys   0.08 ( 1%) wall   
1952 kB ( 1%) ggc
 register scan         :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 rebuild jump labels   :   0.06 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 preprocessing         :   0.02 ( 0%) usr   0.02 ( 7%) sys   0.02 ( 0%) wall   
 466 kB ( 0%) ggc
 lexical analysis      :   0.03 ( 0%) usr   0.02 ( 7%) sys   0.04 ( 0%) wall   
   0 kB ( 0%) ggc
 parser                :   0.06 ( 1%) usr   0.07 (26%) sys   0.19 ( 2%) wall  
18164 kB ( 9%) ggc
 inline heuristics     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree gimplify         :   0.07 ( 1%) usr   0.00 ( 0%) sys   0.06 ( 1%) wall  
19793 kB (10%) ggc
 tree CFG construction :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
4445 kB ( 2%) ggc
 tree CFG cleanup      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 1%) wall   
1209 kB ( 1%) ggc
 tree VRP              :   0.20 ( 2%) usr   0.00 ( 0%) sys   0.20 ( 2%) wall  
12984 kB ( 6%) ggc
 tree copy propagation :   0.10 ( 1%) usr   0.00 ( 0%) sys   0.05 ( 1%) wall   
   9 kB ( 0%) ggc
 tree find ref. vars   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
1190 kB ( 1%) ggc
 tree PTA              :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.06 ( 1%) wall   
  33 kB ( 0%) ggc
 tree alias analysis   :   0.04 ( 0%) usr   0.01 ( 4%) sys   0.03 ( 0%) wall   
 308 kB ( 0%) ggc
 tree flow sensitive alias:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
      0 kB ( 0%) ggc
 tree PHI insertion    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
1349 kB ( 1%) ggc
 tree SSA rewrite      :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.08 ( 1%) wall  
16607 kB ( 8%) ggc
 tree SSA other        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA incremental  :   0.11 ( 1%) usr   0.01 ( 4%) sys   0.06 ( 1%) wall   
5142 kB ( 3%) ggc
 tree operand scan     :   0.11 ( 1%) usr   0.03 (11%) sys   0.12 ( 1%) wall  
10620 kB ( 5%) ggc
 dominator optimization:   0.08 ( 1%) usr   0.00 ( 0%) sys   0.11 ( 1%) wall   
2130 kB ( 1%) ggc
 tree CCP              :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 1%) wall   
   6 kB ( 0%) ggc
 tree split crit edges :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
6801 kB ( 3%) ggc
 tree reassociation    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   2 kB ( 0%) ggc
 tree PRE              :   2.94 (31%) usr   0.02 ( 7%) sys   2.99 (30%) wall   
4955 kB ( 2%) ggc
 tree FRE              :   0.16 ( 2%) usr   0.01 ( 4%) sys   0.17 ( 2%) wall   
1933 kB ( 1%) ggc
 tree code sinking     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   1 kB ( 0%) ggc
 tree forward propagate:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 tree conservative DCE :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 tree aggressive DCE   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 tree DSE              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
 228 kB ( 0%) ggc
 PHI merge             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 complete unrolling    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   2 kB ( 0%) ggc
 tree SSA uncprop      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA to normal    :   0.07 ( 1%) usr   0.00 ( 0%) sys   0.09 ( 1%) wall  
15109 kB ( 7%) ggc
 tree rename SSA copies:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree switch initialization conversion:   0.01 ( 0%) usr   0.00 ( 0%) sys  
0.00 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 dominance computation :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
   0 kB ( 0%) ggc
 control dependences   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 expand                :   0.49 ( 5%) usr   0.02 ( 7%) sys   0.50 ( 5%) wall  
34138 kB (17%) ggc
 jump                  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 forward prop          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
1556 kB ( 1%) ggc
 CSE                   :   0.13 ( 1%) usr   0.01 ( 4%) sys   0.16 ( 2%) wall   
 606 kB ( 0%) ggc
 dead code elimination :   0.09 ( 1%) usr   0.00 ( 0%) sys   0.07 ( 1%) wall   
   0 kB ( 0%) ggc
 dead store elim1      :   0.10 ( 1%) usr   0.00 ( 0%) sys   0.09 ( 1%) wall   
1706 kB ( 1%) ggc
 dead store elim2      :   0.12 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 1%) wall   
4207 kB ( 2%) ggc
 CSE 2                 :   0.11 ( 1%) usr   0.00 ( 0%) sys   0.08 ( 1%) wall   
 303 kB ( 0%) ggc
 branch prediction     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   2 kB ( 0%) ggc
 combiner              :   0.15 ( 2%) usr   0.01 ( 4%) sys   0.14 ( 1%) wall   
4423 kB ( 2%) ggc
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
 154 kB ( 0%) ggc
 regmove               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 integrated RA         :   0.90 ( 9%) usr   0.00 ( 0%) sys   0.89 ( 9%) wall   
1788 kB ( 1%) ggc
 reload                :   0.39 ( 4%) usr   0.00 ( 0%) sys   0.38 ( 4%) wall   
6285 kB ( 3%) ggc
 reload CSE regs       :   0.20 ( 2%) usr   0.01 ( 4%) sys   0.20 ( 2%) wall   
8333 kB ( 4%) ggc
 load CSE after reload :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 thread pro- & epilogue:   0.06 ( 1%) usr   0.00 ( 0%) sys   0.06 ( 1%) wall   
  13 kB ( 0%) ggc
 peephole 2            :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
   0 kB ( 0%) ggc
 rename registers      :   0.06 ( 1%) usr   0.00 ( 0%) sys   0.08 ( 1%) wall   
  75 kB ( 0%) ggc
 scheduling 2          :   0.35 ( 4%) usr   0.01 ( 4%) sys   0.35 ( 4%) wall   
   0 kB ( 0%) ggc
 machine dep reorg     :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.06 ( 1%) wall   
 301 kB ( 0%) ggc
 reorder blocks        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 1%) wall   
3708 kB ( 2%) ggc
 final                 :   0.11 ( 1%) usr   0.00 ( 0%) sys   0.13 ( 1%) wall   
3977 kB ( 2%) ggc
 TOTAL                 :   9.48             0.27             9.83            
203289 kB

####################################################################################
# GCC 6.3.1
bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign
-mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   892 kB ( 0%) ggc
 phase parsing           :   0.21 ( 0%) usr   0.09 ( 3%) sys   0.32 ( 0%) wall 
 20284 kB ( 1%) ggc
 phase opt and generate  : 120.50 (100%) usr   3.43 (97%) sys 125.05 (100%)
wall 3091168 kB (99%) ggc
 garbage collection      :   2.50 ( 2%) usr   0.02 ( 1%) sys   2.90 ( 2%) wall 
     0 kB ( 0%) ggc
 callgraph construction  :   1.81 ( 1%) usr   0.08 ( 2%) sys   2.33 ( 2%) wall 
  8824 kB ( 0%) ggc
 callgraph optimization  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
     8 kB ( 0%) ggc
 ipa dead code removal   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 ipa cp                  :   0.99 ( 1%) usr   0.00 ( 0%) sys   0.99 ( 1%) wall 
   512 kB ( 0%) ggc
 ipa inlining heuristics :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall 
     9 kB ( 0%) ggc
 ipa profile             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 ipa pure const          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 ipa icf                 :   5.44 ( 5%) usr   0.00 ( 0%) sys   5.44 ( 4%) wall 
    26 kB ( 0%) ggc
 cfg cleanup             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
    18 kB ( 0%) ggc
 trivially dead code     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 df scan insns           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
     0 kB ( 0%) ggc
 df multiple defs        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 df reaching defs        :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.10 ( 0%) wall 
     0 kB ( 0%) ggc
 df live regs            :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall 
     0 kB ( 0%) ggc
 df live&initialized regs:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
     0 kB ( 0%) ggc
 df must-initialized regs:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
     0 kB ( 0%) ggc
 df use-def / def-use chains:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%)
wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall 
   665 kB ( 0%) ggc
 register information    :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
     0 kB ( 0%) ggc
 alias analysis          :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall 
  1090 kB ( 0%) ggc
 alias stmt walking      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall 
     1 kB ( 0%) ggc
 rebuild jump labels     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
     0 kB ( 0%) ggc
 preprocessing           :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall 
  3645 kB ( 0%) ggc
 lexical analysis        :   0.07 ( 0%) usr   0.03 ( 1%) sys   0.09 ( 0%) wall 
     0 kB ( 0%) ggc
 parser (global)         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
   643 kB ( 0%) ggc
 parser function body    :   0.11 ( 0%) usr   0.04 ( 1%) sys   0.19 ( 0%) wall 
 15822 kB ( 1%) ggc
 parser inl. func. body  :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall 
   110 kB ( 0%) ggc
 inline parameters       :   0.13 ( 0%) usr   0.02 ( 1%) sys   0.09 ( 0%) wall 
  1530 kB ( 0%) ggc
 tree gimplify           :   0.06 ( 0%) usr   0.02 ( 1%) sys   0.06 ( 0%) wall 
 12544 kB ( 0%) ggc
 tree eh                 :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 tree CFG construction   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall 
 10956 kB ( 0%) ggc
 tree CFG cleanup        :   0.45 ( 0%) usr   0.01 ( 0%) sys   0.46 ( 0%) wall 
     0 kB ( 0%) ggc
 tree VRP                :  25.14 (21%) usr   0.56 (16%) sys  25.73 (21%) wall 
115035 kB ( 4%) ggc
 tree copy propagation   :   0.57 ( 0%) usr   0.00 ( 0%) sys   0.57 ( 0%) wall 
     0 kB ( 0%) ggc
 tree PTA                :  18.90 (16%) usr   0.63 (18%) sys  19.58 (16%) wall 
   488 kB ( 0%) ggc
 tree PHI insertion      :   3.90 ( 3%) usr   1.67 (47%) sys   5.58 ( 4%) wall
2912441 kB (94%) ggc
 tree SSA rewrite        :  12.59 (10%) usr   0.01 ( 0%) sys  12.62 (10%) wall 
  4709 kB ( 0%) ggc
 tree SSA other          :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
     0 kB ( 0%) ggc
 tree SSA incremental    :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall 
    94 kB ( 0%) ggc
 tree operand scan       :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall 
  3523 kB ( 0%) ggc
 dominator optimization  :   1.49 ( 1%) usr   0.00 ( 0%) sys   1.50 ( 1%) wall 
   378 kB ( 0%) ggc
 isolate eroneous paths  :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall 
     0 kB ( 0%) ggc
 tree CCP                :   9.87 ( 8%) usr   0.03 ( 1%) sys   9.94 ( 8%) wall 
    12 kB ( 0%) ggc
 tree PHI const/copy prop:   0.26 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall 
     0 kB ( 0%) ggc
 tree split crit edges   :   0.47 ( 0%) usr   0.00 ( 0%) sys   0.48 ( 0%) wall 
   568 kB ( 0%) ggc
 tree reassociation      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 tree PRE                :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall 
     6 kB ( 0%) ggc
 tree FRE                :   1.47 ( 1%) usr   0.00 ( 0%) sys   1.48 ( 1%) wall 
     6 kB ( 0%) ggc
 tree backward propagate :  11.39 ( 9%) usr   0.00 ( 0%) sys  11.40 ( 9%) wall 
     0 kB ( 0%) ggc
 tree forward propagate  :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall 
     0 kB ( 0%) ggc
 tree phiprop            :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall 
     0 kB ( 0%) ggc
 tree conservative DCE   :   0.78 ( 1%) usr   0.01 ( 0%) sys   0.80 ( 1%) wall 
     0 kB ( 0%) ggc
 tree aggressive DCE     :   2.56 ( 2%) usr   0.00 ( 0%) sys   2.53 ( 2%) wall 
    12 kB ( 0%) ggc
 tree DSE                :   0.67 ( 1%) usr   0.00 ( 0%) sys   0.68 ( 1%) wall 
     0 kB ( 0%) ggc
 PHI merge               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 tree slp vectorization  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   283 kB ( 0%) ggc
 tree SSA uncprop        :   0.83 ( 1%) usr   0.02 ( 1%) sys   0.85 ( 1%) wall 
     0 kB ( 0%) ggc
 tree strlen optimization:   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall 
     0 kB ( 0%) ggc
 dominance computation   :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall 
     0 kB ( 0%) ggc
 out of ssa              :  10.30 ( 9%) usr   0.24 ( 7%) sys  10.60 ( 8%) wall 
    47 kB ( 0%) ggc
 expand vars             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
   428 kB ( 0%) ggc
 expand                  :   0.22 ( 0%) usr   0.02 ( 1%) sys   0.26 ( 0%) wall 
  5452 kB ( 0%) ggc
 lower subreg            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 forward prop            :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
   246 kB ( 0%) ggc
 CSE                     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall 
   145 kB ( 0%) ggc
 dead code elimination   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
     0 kB ( 0%) ggc
 dead store elim1        :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall 
   720 kB ( 0%) ggc
 dead store elim2        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall 
  1042 kB ( 0%) ggc
 loop init               :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
    17 kB ( 0%) ggc
 CSE 2                   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
    56 kB ( 0%) ggc
 branch prediction       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
     0 kB ( 0%) ggc
 combiner                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall 
   965 kB ( 0%) ggc
 integrated RA           :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall 
  2130 kB ( 0%) ggc
 LRA non-specific        :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
   746 kB ( 0%) ggc
 LRA virtuals elimination:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
   461 kB ( 0%) ggc
 LRA reload inheritance  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   132 kB ( 0%) ggc
 LRA create live ranges  :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall 
    87 kB ( 0%) ggc
 LRA hard reg assignment :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 LRA rematerialization   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 reload CSE regs         :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
  1865 kB ( 0%) ggc
 load CSE after reload   :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall 
   138 kB ( 0%) ggc
 ree                     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 thread pro- & epilogue  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     4 kB ( 0%) ggc
 if-conversion 2         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 combine stack adjustments:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
     37 kB ( 0%) ggc
 peephole 2              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
   348 kB ( 0%) ggc
 hard reg cprop          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 scheduling 2            :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall 
   320 kB ( 0%) ggc
 reorder blocks          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   528 kB ( 0%) ggc
 shorten branches        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 final                   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
  1302 kB ( 0%) ggc
 straight-line strength reduction:   0.14 ( 0%) usr   0.01 ( 0%) sys   0.15 (
0%) wall       0 kB ( 0%) ggc
 rest of compilation     :   0.12 ( 0%) usr   0.01 ( 0%) sys   0.16 ( 0%) wall 
    33 kB ( 0%) ggc
 remove unused locals    :   3.42 ( 3%) usr   0.01 ( 0%) sys   3.46 ( 3%) wall 
     0 kB ( 0%) ggc
 address taken           :   1.19 ( 1%) usr   0.01 ( 0%) sys   1.21 ( 1%) wall 
     0 kB ( 0%) ggc
 unaccounted todo        :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall 
    40 kB ( 0%) ggc
 TOTAL                 : 120.71             3.52           125.38           
3112354 kB

####################################################################################

# GCC 8.3.1
bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign
-mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Time variable                                   usr           sys          wall
              GGC
 phase setup                        :   0.00 (  0%)   0.01 (  0%)   0.01 (  0%)
   1047 kB (  0%)
 phase parsing                      :   0.21 (  0%)   0.08 (  0%)   0.32 (  0%)
  20551 kB (  0%)
 phase opt and generate             : 189.32 (100%)  24.44 (100%) 367.57 (100%)
5277730 kB (100%)
 garbage collection                 :   7.52 (  4%)   2.16 (  9%)  32.80 (  9%)
      0 kB (  0%)
 dump files                         :   0.00 (  0%)   0.01 (  0%)   0.02 (  0%)
      0 kB (  0%)
 callgraph construction             :   4.62 (  2%)   4.29 ( 17%)  59.37 ( 16%)
   9442 kB (  0%)
 callgraph optimization             :   0.28 (  0%)   0.04 (  0%)   0.60 (  0%)
      0 kB (  0%)
 ipa function summary               :   0.13 (  0%)   0.22 (  1%)   2.63 (  1%)
     11 kB (  0%)
 ipa dead code removal              :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 ipa cp                             :   2.11 (  1%)   2.48 ( 10%)  30.08 (  8%)
      2 kB (  0%)
 ipa inlining heuristics            :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 ipa pure const                     :   0.02 (  0%)   0.01 (  0%)   0.05 (  0%)
      0 kB (  0%)
 ipa icf                            :  12.01 (  6%)   4.13 ( 17%)  60.34 ( 16%)
     29 kB (  0%)
 ipa free inline summary            :   0.00 (  0%)   0.01 (  0%)   0.07 (  0%)
      0 kB (  0%)
 cfg cleanup                        :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)
     18 kB (  0%)
 trivially dead code                :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 df scan insns                      :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 df multiple defs                   :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 df reaching defs                   :   0.08 (  0%)   0.00 (  0%)   0.08 (  0%)
      0 kB (  0%)
 df live regs                       :   0.11 (  0%)   0.00 (  0%)   0.12 (  0%)
      0 kB (  0%)
 df live&initialized regs           :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
      0 kB (  0%)
 df must-initialized regs           :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 df use-def / def-use chains        :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)
      0 kB (  0%)
 df reg dead/unused notes           :   0.11 (  0%)   0.00 (  0%)   0.12 (  0%)
    684 kB (  0%)
 register information               :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
      0 kB (  0%)
 alias analysis                     :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
   1090 kB (  0%)
 alias stmt walking                 :   0.13 (  0%)   0.02 (  0%)   0.36 (  0%)
      1 kB (  0%)
 register scan                      :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)
    140 kB (  0%)
 rebuild jump labels                :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 preprocessing                      :   0.04 (  0%)   0.01 (  0%)   0.11 (  0%)
   3722 kB (  0%)
 lexical analysis                   :   0.04 (  0%)   0.02 (  0%)   0.09 (  0%)
      0 kB (  0%)
 parser (global)                    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    716 kB (  0%)
 parser struct body                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
     61 kB (  0%)
 parser function body               :   0.13 (  0%)   0.05 (  0%)   0.10 (  0%)
  15942 kB (  0%)
 early inlining heuristics          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 inline parameters                  :   0.16 (  0%)   0.19 (  1%)   0.47 (  0%)
   1538 kB (  0%)
 tree gimplify                      :   0.06 (  0%)   0.01 (  0%)   0.07 (  0%)
  14496 kB (  0%)
 tree eh                            :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 tree CFG construction              :   0.04 (  0%)   0.00 (  0%)   0.06 (  0%)
  11884 kB (  0%)
 tree CFG cleanup                   :   0.64 (  0%)   0.00 (  0%)   0.65 (  0%)
      0 kB (  0%)
 tree tail merge                    :   1.50 (  1%)   0.00 (  0%)   1.50 (  0%)
      0 kB (  0%)
 tree VRP                           :   9.73 (  5%)   2.02 (  8%)  12.15 (  3%)
 165762 kB (  3%)
 tree Early VRP                     :  13.07 (  7%)   0.23 (  1%)  13.32 (  4%)
 496545 kB (  9%)
 tree copy propagation              :   0.88 (  0%)   0.00 (  0%)   0.89 (  0%)
      0 kB (  0%)
 tree PTA                           :  28.88 ( 15%)   4.22 ( 17%)  38.06 ( 10%)
  46436 kB (  1%)
 tree PHI insertion                 :   5.78 (  3%)   1.96 (  8%)   7.83 (  2%)
4254415 kB ( 80%)
 tree SSA rewrite                   :  15.12 (  8%)   0.02 (  0%)  15.13 (  4%)
   3780 kB (  0%)
 tree SSA other                     :   0.21 (  0%)   0.16 (  1%)   0.34 (  0%)
      0 kB (  0%)
 tree SSA incremental               :   0.14 (  0%)   0.00 (  0%)   0.16 (  0%)
     94 kB (  0%)
 tree operand scan                  :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)
   3619 kB (  0%)
 dominator optimization             :   5.51 (  3%)   0.10 (  0%)   5.66 (  2%)
 165892 kB (  3%)
 backwards jump threading           :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)
      0 kB (  0%)
 isolate eroneous paths             :   0.10 (  0%)   0.00 (  0%)   0.09 (  0%)
      0 kB (  0%)
 tree CCP                           :  14.39 (  8%)   1.75 (  7%)  16.89 (  5%)
     75 kB (  0%)
 tree PHI const/copy prop           :   0.43 (  0%)   0.00 (  0%)   0.43 (  0%)
      0 kB (  0%)
 tree split crit edges              :   0.40 (  0%)   0.01 (  0%)   0.44 (  0%)
    274 kB (  0%)
 tree reassociation                 :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 tree PRE                           :   3.41 (  2%)   0.14 (  1%)   3.55 (  1%)
    281 kB (  0%)
 tree FRE                           :  20.53 ( 11%)   0.04 (  0%)  20.61 (  6%)
      5 kB (  0%)
 tree linearize phis                :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      6 kB (  0%)
 tree backward propagate            :   1.41 (  1%)   0.04 (  0%)   1.92 (  1%)
      0 kB (  0%)
 tree forward propagate             :   3.69 (  2%)   0.00 (  0%)   3.70 (  1%)
      0 kB (  0%)
 tree phiprop                       :   0.10 (  0%)   0.00 (  0%)   0.09 (  0%)
      0 kB (  0%)
 tree conservative DCE              :   1.63 (  1%)   0.00 (  0%)   1.66 (  0%)
      0 kB (  0%)
 tree aggressive DCE                :   3.75 (  2%)   0.01 (  0%)   3.77 (  1%)
     12 kB (  0%)
 tree DSE                           :   1.09 (  1%)   0.00 (  0%)   1.07 (  0%)
      0 kB (  0%)
 PHI merge                          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 tree slp vectorization             :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)
    747 kB (  0%)
 tree SSA uncprop                   :   1.44 (  1%)   0.00 (  0%)   1.44 (  0%)
      0 kB (  0%)
 gimple widening/fma detection      :   0.13 (  0%)   0.00 (  0%)   0.13 (  0%)
      0 kB (  0%)
 tree strlen optimization           :   0.19 (  0%)   0.00 (  0%)   0.19 (  0%)
      0 kB (  0%)
 dominance computation              :   0.03 (  0%)   0.01 (  0%)   0.13 (  0%)
      0 kB (  0%)
 control dependences                :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)
      0 kB (  0%)
 out of ssa                         :  17.90 (  9%)   0.06 (  0%)  17.97 (  5%)
     47 kB (  0%)
 expand vars                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    444 kB (  0%)
 expand                             :   0.32 (  0%)   0.01 (  0%)   0.33 (  0%)
   5520 kB (  0%)
 forward prop                       :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
    246 kB (  0%)
 CSE                                :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
    145 kB (  0%)
 dead code elimination              :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
      0 kB (  0%)
 dead store elim1                   :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
    720 kB (  0%)
 dead store elim2                   :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)
   1070 kB (  0%)
 loop init                          :   0.01 (  0%)   0.00 (  0%)   0.04 (  0%)
     22 kB (  0%)
 CSE 2                              :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
     56 kB (  0%)
 branch prediction                  :   0.05 (  0%)   0.01 (  0%)   0.05 (  0%)
      0 kB (  0%)
 combiner                           :   0.07 (  0%)   0.00 (  0%)   0.06 (  0%)
    965 kB (  0%)
 integrated RA                      :   0.12 (  0%)   0.00 (  0%)   0.24 (  0%)
   2193 kB (  0%)
 LRA non-specific                   :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)
    818 kB (  0%)
 LRA virtuals elimination           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    460 kB (  0%)
 LRA reload inheritance             :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    132 kB (  0%)
 LRA create live ranges             :   0.18 (  0%)   0.00 (  0%)   0.18 (  0%)
     87 kB (  0%)
 LRA hard reg assignment            :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 LRA rematerialization              :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 reload CSE regs                    :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)
   1865 kB (  0%)
 load CSE after reload              :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)
    138 kB (  0%)
 ree                                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 thread pro- & epilogue             :   0.02 (  0%)   0.01 (  0%)   0.03 (  0%)
      6 kB (  0%)
 combine stack adjustments          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
     37 kB (  0%)
 peephole 2                         :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
    348 kB (  0%)
 hard reg cprop                     :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 scheduling 2                       :   0.13 (  0%)   0.00 (  0%)   0.13 (  0%)
    377 kB (  0%)
 reorder blocks                     :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)
    518 kB (  0%)
 shorten branches                   :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 final                              :   0.03 (  0%)   0.00 (  0%)   0.09 (  0%)
   1302 kB (  0%)
 straight-line strength reduction   :   0.24 (  0%)   0.00 (  0%)   0.24 (  0%)
      0 kB (  0%)
 initialize rtl                     :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)
      9 kB (  0%)
 rest of compilation                :   1.18 (  1%)   0.06 (  0%)   1.28 (  0%)
  82789 kB (  2%)
 remove unused locals               :   4.82 (  3%)   0.00 (  0%)   5.02 (  1%)
      0 kB (  0%)
 address taken                      :   1.88 (  1%)   0.00 (  0%)   1.91 (  1%)
      0 kB (  0%)
 TOTAL                              : 189.53         24.53        367.91       
5299338 kB

####################################################################################

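As you can see, compile time and memory use get worse with each newer GCC
version: total wall time goes from 9.83 s (GCC 4.4.7) to 125.38 s (GCC 6.3.1)
to 367.91 s (GCC 8.3.1), and total GGC memory from 203289 kB to 3112354 kB to
5299338 kB.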
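
For anyone who wants the slowdown factors at a glance, here is a minimal
sketch that divides the TOTAL rows of the three reports above by the GCC 4.4.7
baseline. The numbers are copied from those rows; the names `totals` and
`ratios` are only illustrative and not part of any GCC tooling.

```python
# Totals copied from the TOTAL rows of the three -ftime-report runs above.
# usr/wall are in seconds, ggc_kb is the total GGC memory in kB.
totals = {
    "4.4.7": {"usr": 9.48, "wall": 9.83, "ggc_kb": 203289},
    "6.3.1": {"usr": 120.71, "wall": 125.38, "ggc_kb": 3112354},
    "8.3.1": {"usr": 189.53, "wall": 367.91, "ggc_kb": 5299338},
}


def ratios(baseline="4.4.7"):
    """Return each version's totals divided by the baseline version's totals."""
    base = totals[baseline]
    return {
        version: {key: values[key] / base[key] for key in values}
        for version, values in totals.items()
    }


if __name__ == "__main__":
    for version, r in ratios().items():
        print(
            f"GCC {version}: usr x{r['usr']:.1f}, "
            f"wall x{r['wall']:.1f}, ggc x{r['ggc_kb']:.1f}"
        )
```

This only compares the reported totals; it says nothing about which pass is
responsible for the difference.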

FYI, I also tried the same code with LLVM v8.0.0, which compiled it with
performance similar to GCC 4.4.7.

Best regards,
--
Rogerio
