https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155
--- Comment #54 from Rogério de Souza Moraes <rogerio.souza at gmail dot com> ---

Hi everyone,

We still see compile-time performance degrade with each newer compiler version when building the source file 'testcase-v1.c', which is available in the attachment above.

Could you please check whether the test case that I attached reproduces the same issue, or whether it is a different issue?

Below are the results that I got:

# GCC 4.4.7
-bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign -mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Execution times (seconds)
callgraph construction: 0.02 ( 0%) usr 0.01 ( 4%) sys 0.03 ( 0%) wall 2535 kB ( 1%) ggc
callgraph optimization: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 11 kB ( 0%) ggc
cfg cleanup : 0.05 ( 1%) usr 0.00 ( 0%) sys 0.07 ( 1%) wall 151 kB ( 0%) ggc
trivially dead code : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.40 ( 4%) usr 0.01 ( 4%) sys 0.40 ( 4%) wall 0 kB ( 0%) ggc
df live regs : 0.33 ( 3%) usr 0.00 ( 0%) sys 0.37 ( 4%) wall 0 kB ( 0%) ggc
df live&initialized regs: 0.08 ( 1%) usr 0.00 ( 0%) sys 0.12 ( 1%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.17 ( 2%) usr 0.00 ( 0%) sys 0.16 ( 2%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.25 ( 3%) usr 0.00 ( 0%) sys 0.22 ( 2%) wall 2519 kB ( 1%) ggc
register information : 0.13 ( 1%) usr 0.00 ( 0%) sys 0.11 ( 1%) wall 0 kB ( 0%) ggc
alias analysis : 0.06 ( 1%) usr 0.00 ( 0%) sys 0.08 ( 1%) wall 1952 kB ( 1%) ggc
register scan : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
rebuild jump labels : 0.06 ( 1%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.02 ( 0%) usr 0.02 ( 7%) sys 0.02 ( 0%) wall 466 kB ( 0%) ggc
lexical analysis : 0.03 ( 0%) usr 0.02 ( 7%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
parser : 0.06 ( 1%) usr 0.07 (26%) sys 0.19 ( 2%) wall 18164 kB ( 9%) ggc
inline heuristics : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.07 ( 1%) usr 0.00 ( 0%) sys 0.06 ( 1%) wall 19793 kB (10%) ggc
tree CFG construction : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 4445 kB ( 2%) ggc
tree CFG cleanup : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 1%) wall 1209 kB ( 1%) ggc
tree VRP : 0.20 ( 2%) usr 0.00 ( 0%) sys 0.20 ( 2%) wall 12984 kB ( 6%) ggc
tree copy propagation : 0.10 ( 1%) usr 0.00 ( 0%) sys 0.05 ( 1%) wall 9 kB ( 0%) ggc
tree find ref. vars : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 1190 kB ( 1%) ggc
tree PTA : 0.05 ( 1%) usr 0.00 ( 0%) sys 0.06 ( 1%) wall 33 kB ( 0%) ggc
tree alias analysis : 0.04 ( 0%) usr 0.01 ( 4%) sys 0.03 ( 0%) wall 308 kB ( 0%) ggc
tree flow sensitive alias: 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1349 kB ( 1%) ggc
tree SSA rewrite : 0.05 ( 1%) usr 0.00 ( 0%) sys 0.08 ( 1%) wall 16607 kB ( 8%) ggc
tree SSA other : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 0.11 ( 1%) usr 0.01 ( 4%) sys 0.06 ( 1%) wall 5142 kB ( 3%) ggc
tree operand scan : 0.11 ( 1%) usr 0.03 (11%) sys 0.12 ( 1%) wall 10620 kB ( 5%) ggc
dominator optimization: 0.08 ( 1%) usr 0.00 ( 0%) sys 0.11 ( 1%) wall 2130 kB ( 1%) ggc
tree CCP : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 1%) wall 6 kB ( 0%) ggc
tree split crit edges : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 6801 kB ( 3%) ggc
tree reassociation : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 2 kB ( 0%) ggc
tree PRE : 2.94 (31%) usr 0.02 ( 7%) sys 2.99 (30%) wall 4955 kB ( 2%) ggc
tree FRE : 0.16 ( 2%) usr 0.01 ( 4%) sys 0.17 ( 2%) wall 1933 kB ( 1%) ggc
tree code sinking : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1 kB ( 0%) ggc
tree forward propagate: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 228 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 2 kB ( 0%) ggc
tree SSA uncprop : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 0.07 ( 1%) usr 0.00 ( 0%) sys 0.09 ( 1%) wall 15109 kB ( 7%) ggc
tree rename SSA copies: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree switch initialization conversion: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 0.05 ( 1%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
expand : 0.49 ( 5%) usr 0.02 ( 7%) sys 0.50 ( 5%) wall 34138 kB (17%) ggc
jump : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1556 kB ( 1%) ggc
CSE : 0.13 ( 1%) usr 0.01 ( 4%) sys 0.16 ( 2%) wall 606 kB ( 0%) ggc
dead code elimination : 0.09 ( 1%) usr 0.00 ( 0%) sys 0.07 ( 1%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.10 ( 1%) usr 0.00 ( 0%) sys 0.09 ( 1%) wall 1706 kB ( 1%) ggc
dead store elim2 : 0.12 ( 1%) usr 0.00 ( 0%) sys 0.12 ( 1%) wall 4207 kB ( 2%) ggc
CSE 2 : 0.11 ( 1%) usr 0.00 ( 0%) sys 0.08 ( 1%) wall 303 kB ( 0%) ggc
branch prediction : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 2 kB ( 0%) ggc
combiner : 0.15 ( 2%) usr 0.01 ( 4%) sys 0.14 ( 1%) wall 4423 kB ( 2%) ggc
if-conversion : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 154 kB ( 0%) ggc
regmove : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
integrated RA : 0.90 ( 9%) usr 0.00 ( 0%) sys 0.89 ( 9%) wall 1788 kB ( 1%) ggc
reload : 0.39 ( 4%) usr 0.00 ( 0%) sys 0.38 ( 4%) wall 6285 kB ( 3%) ggc
reload CSE regs : 0.20 ( 2%) usr 0.01 ( 4%) sys 0.20 ( 2%) wall 8333 kB ( 4%) ggc
load CSE after reload : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
thread pro- & epilogue: 0.06 ( 1%) usr 0.00 ( 0%) sys 0.06 ( 1%) wall 13 kB ( 0%) ggc
peephole 2 : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
rename registers : 0.06 ( 1%) usr 0.00 ( 0%) sys 0.08 ( 1%) wall 75 kB ( 0%) ggc
scheduling 2 : 0.35 ( 4%) usr 0.01 ( 4%) sys 0.35 ( 4%) wall 0 kB ( 0%) ggc
machine dep reorg : 0.05 ( 1%) usr 0.00 ( 0%) sys 0.06 ( 1%) wall 301 kB ( 0%) ggc
reorder blocks : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 1%) wall 3708 kB ( 2%) ggc
final : 0.11 ( 1%) usr 0.00 ( 0%) sys 0.13 ( 1%) wall 3977 kB ( 2%) ggc
TOTAL : 9.48 0.27 9.83 203289 kB

####################################################################################

# GCC 6.3.1
bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign -mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Execution times (seconds)
phase setup : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 892 kB ( 0%) ggc
phase parsing : 0.21 ( 0%) usr 0.09 ( 3%) sys 0.32 ( 0%) wall 20284 kB ( 1%) ggc
phase opt and generate : 120.50 (100%) usr 3.43 (97%) sys 125.05 (100%) wall 3091168 kB (99%) ggc
garbage collection : 2.50 ( 2%) usr 0.02 ( 1%) sys 2.90 ( 2%) wall 0 kB ( 0%) ggc
callgraph construction : 1.81 ( 1%) usr 0.08 ( 2%) sys 2.33 ( 2%) wall 8824 kB ( 0%) ggc
callgraph optimization : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 8 kB ( 0%) ggc
ipa dead code removal : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa cp : 0.99 ( 1%) usr 0.00 ( 0%) sys 0.99 ( 1%) wall 512 kB ( 0%) ggc
ipa inlining heuristics : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 9 kB ( 0%) ggc
ipa profile : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa pure const : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa icf : 5.44 ( 5%) usr 0.00 ( 0%) sys 5.44 ( 4%) wall 26 kB ( 0%) ggc
cfg cleanup : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 18 kB ( 0%) ggc
trivially dead code : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
df scan insns : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
df multiple defs : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.09 ( 0%) usr 0.01 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
df live regs : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall 0 kB ( 0%) ggc
df live&initialized regs: 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
df must-initialized regs: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 665 kB ( 0%) ggc
register information : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 1090 kB ( 0%) ggc
alias stmt walking : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 1 kB ( 0%) ggc
rebuild jump labels : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.03 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall 3645 kB ( 0%) ggc
lexical analysis : 0.07 ( 0%) usr 0.03 ( 1%) sys 0.09 ( 0%) wall 0 kB ( 0%) ggc
parser (global) : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 643 kB ( 0%) ggc
parser function body : 0.11 ( 0%) usr 0.04 ( 1%) sys 0.19 ( 0%) wall 15822 kB ( 1%) ggc
parser inl. func. body : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall 110 kB ( 0%) ggc
inline parameters : 0.13 ( 0%) usr 0.02 ( 1%) sys 0.09 ( 0%) wall 1530 kB ( 0%) ggc
tree gimplify : 0.06 ( 0%) usr 0.02 ( 1%) sys 0.06 ( 0%) wall 12544 kB ( 0%) ggc
tree eh : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 10956 kB ( 0%) ggc
tree CFG cleanup : 0.45 ( 0%) usr 0.01 ( 0%) sys 0.46 ( 0%) wall 0 kB ( 0%) ggc
tree VRP : 25.14 (21%) usr 0.56 (16%) sys 25.73 (21%) wall 115035 kB ( 4%) ggc
tree copy propagation : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall 0 kB ( 0%) ggc
tree PTA : 18.90 (16%) usr 0.63 (18%) sys 19.58 (16%) wall 488 kB ( 0%) ggc
tree PHI insertion : 3.90 ( 3%) usr 1.67 (47%) sys 5.58 ( 4%) wall 2912441 kB (94%) ggc
tree SSA rewrite : 12.59 (10%) usr 0.01 ( 0%) sys 12.62 (10%) wall 4709 kB ( 0%) ggc
tree SSA other : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 94 kB ( 0%) ggc
tree operand scan : 0.04 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall 3523 kB ( 0%) ggc
dominator optimization : 1.49 ( 1%) usr 0.00 ( 0%) sys 1.50 ( 1%) wall 378 kB ( 0%) ggc
isolate eroneous paths : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree CCP : 9.87 ( 8%) usr 0.03 ( 1%) sys 9.94 ( 8%) wall 12 kB ( 0%) ggc
tree PHI const/copy prop: 0.26 ( 0%) usr 0.00 ( 0%) sys 0.27 ( 0%) wall 0 kB ( 0%) ggc
tree split crit edges : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 568 kB ( 0%) ggc
tree reassociation : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree PRE : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 6 kB ( 0%) ggc
tree FRE : 1.47 ( 1%) usr 0.00 ( 0%) sys 1.48 ( 1%) wall 6 kB ( 0%) ggc
tree backward propagate : 11.39 ( 9%) usr 0.00 ( 0%) sys 11.40 ( 9%) wall 0 kB ( 0%) ggc
tree forward propagate : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
tree phiprop : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.78 ( 1%) usr 0.01 ( 0%) sys 0.80 ( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 2.56 ( 2%) usr 0.00 ( 0%) sys 2.53 ( 2%) wall 12 kB ( 0%) ggc
tree DSE : 0.67 ( 1%) usr 0.00 ( 0%) sys 0.68 ( 1%) wall 0 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree slp vectorization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 283 kB ( 0%) ggc
tree SSA uncprop : 0.83 ( 1%) usr 0.02 ( 1%) sys 0.85 ( 1%) wall 0 kB ( 0%) ggc
tree strlen optimization: 0.13 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 0.05 ( 0%) usr 0.01 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
out of ssa : 10.30 ( 9%) usr 0.24 ( 7%) sys 10.60 ( 8%) wall 47 kB ( 0%) ggc
expand vars : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 428 kB ( 0%) ggc
expand : 0.22 ( 0%) usr 0.02 ( 1%) sys 0.26 ( 0%) wall 5452 kB ( 0%) ggc
lower subreg : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 246 kB ( 0%) ggc
CSE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 145 kB ( 0%) ggc
dead code elimination : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.03 ( 0%) usr 0.01 ( 0%) sys 0.03 ( 0%) wall 720 kB ( 0%) ggc
dead store elim2 : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 1042 kB ( 0%) ggc
loop init : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 17 kB ( 0%) ggc
CSE 2 : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 56 kB ( 0%) ggc
branch prediction : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
combiner : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 965 kB ( 0%) ggc
integrated RA : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 2130 kB ( 0%) ggc
LRA non-specific : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 746 kB ( 0%) ggc
LRA virtuals elimination: 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 461 kB ( 0%) ggc
LRA reload inheritance : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 132 kB ( 0%) ggc
LRA create live ranges : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 87 kB ( 0%) ggc
LRA hard reg assignment : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
LRA rematerialization : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
reload CSE regs : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1865 kB ( 0%) ggc
load CSE after reload : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 138 kB ( 0%) ggc
ree : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
thread pro- & epilogue : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 4 kB ( 0%) ggc
if-conversion 2 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
combine stack adjustments: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 37 kB ( 0%) ggc
peephole 2 : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 348 kB ( 0%) ggc
hard reg cprop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
scheduling 2 : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 320 kB ( 0%) ggc
reorder blocks : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 528 kB ( 0%) ggc
shorten branches : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
final : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 1302 kB ( 0%) ggc
straight-line strength reduction: 0.14 ( 0%) usr 0.01 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
rest of compilation : 0.12 ( 0%) usr 0.01 ( 0%) sys 0.16 ( 0%) wall 33 kB ( 0%) ggc
remove unused locals : 3.42 ( 3%) usr 0.01 ( 0%) sys 3.46 ( 3%) wall 0 kB ( 0%) ggc
address taken : 1.19 ( 1%) usr 0.01 ( 0%) sys 1.21 ( 1%) wall 0 kB ( 0%) ggc
unaccounted todo : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 40 kB ( 0%) ggc
TOTAL : 120.71 3.52 125.38 3112354 kB

####################################################################################

# GCC 8.3.1
bash-4.1$ gcc -ftime-report -m32 -w -c -O3 -pedantic -fwrapv -mstackrealign -mpreferred-stack-boundary=4 testcase-v1.c -o testcase-v1.o

Time variable usr sys wall GGC
phase setup : 0.00 ( 0%) 0.01 ( 0%) 0.01 ( 0%) 1047 kB ( 0%)
phase parsing : 0.21 ( 0%) 0.08 ( 0%) 0.32 ( 0%) 20551 kB ( 0%)
phase opt and generate : 189.32 (100%) 24.44 (100%) 367.57 (100%) 5277730 kB (100%)
garbage collection : 7.52 ( 4%) 2.16 ( 9%) 32.80 ( 9%) 0 kB ( 0%)
dump files : 0.00 ( 0%) 0.01 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
callgraph construction : 4.62 ( 2%) 4.29 ( 17%) 59.37 ( 16%) 9442 kB ( 0%)
callgraph optimization : 0.28 ( 0%) 0.04 ( 0%) 0.60 ( 0%) 0 kB ( 0%)
ipa function summary : 0.13 ( 0%) 0.22 ( 1%) 2.63 ( 1%) 11 kB ( 0%)
ipa dead code removal : 0.00 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
ipa cp : 2.11 ( 1%) 2.48 ( 10%) 30.08 ( 8%) 2 kB ( 0%)
ipa inlining heuristics : 0.00 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
ipa pure const : 0.02 ( 0%) 0.01 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
ipa icf : 12.01 ( 6%) 4.13 ( 17%) 60.34 ( 16%) 29 kB ( 0%)
ipa free inline summary : 0.00 ( 0%) 0.01 ( 0%) 0.07 ( 0%) 0 kB ( 0%)
cfg cleanup : 0.00 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 18 kB ( 0%)
trivially dead code : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
df scan insns : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
df multiple defs : 0.01 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 0 kB ( 0%)
df reaching defs : 0.08 ( 0%) 0.00 ( 0%) 0.08 ( 0%) 0 kB ( 0%)
df live regs : 0.11 ( 0%) 0.00 ( 0%) 0.12 ( 0%) 0 kB ( 0%)
df live&initialized regs : 0.02 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
df must-initialized regs : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
df use-def / def-use chains : 0.05 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
df reg dead/unused notes : 0.11 ( 0%) 0.00 ( 0%) 0.12 ( 0%) 684 kB ( 0%)
register information : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
alias analysis : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 1090 kB ( 0%)
alias stmt walking : 0.13 ( 0%) 0.02 ( 0%) 0.36 ( 0%) 1 kB ( 0%)
register scan : 0.02 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 140 kB ( 0%)
rebuild jump labels : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
preprocessing : 0.04 ( 0%) 0.01 ( 0%) 0.11 ( 0%) 3722 kB ( 0%)
lexical analysis : 0.04 ( 0%) 0.02 ( 0%) 0.09 ( 0%) 0 kB ( 0%)
parser (global) : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 716 kB ( 0%)
parser struct body : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 61 kB ( 0%)
parser function body : 0.13 ( 0%) 0.05 ( 0%) 0.10 ( 0%) 15942 kB ( 0%)
early inlining heuristics : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
inline parameters : 0.16 ( 0%) 0.19 ( 1%) 0.47 ( 0%) 1538 kB ( 0%)
tree gimplify : 0.06 ( 0%) 0.01 ( 0%) 0.07 ( 0%) 14496 kB ( 0%)
tree eh : 0.01 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 0 kB ( 0%)
tree CFG construction : 0.04 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 11884 kB ( 0%)
tree CFG cleanup : 0.64 ( 0%) 0.00 ( 0%) 0.65 ( 0%) 0 kB ( 0%)
tree tail merge : 1.50 ( 1%) 0.00 ( 0%) 1.50 ( 0%) 0 kB ( 0%)
tree VRP : 9.73 ( 5%) 2.02 ( 8%) 12.15 ( 3%) 165762 kB ( 3%)
tree Early VRP : 13.07 ( 7%) 0.23 ( 1%) 13.32 ( 4%) 496545 kB ( 9%)
tree copy propagation : 0.88 ( 0%) 0.00 ( 0%) 0.89 ( 0%) 0 kB ( 0%)
tree PTA : 28.88 ( 15%) 4.22 ( 17%) 38.06 ( 10%) 46436 kB ( 1%)
tree PHI insertion : 5.78 ( 3%) 1.96 ( 8%) 7.83 ( 2%) 4254415 kB ( 80%)
tree SSA rewrite : 15.12 ( 8%) 0.02 ( 0%) 15.13 ( 4%) 3780 kB ( 0%)
tree SSA other : 0.21 ( 0%) 0.16 ( 1%) 0.34 ( 0%) 0 kB ( 0%)
tree SSA incremental : 0.14 ( 0%) 0.00 ( 0%) 0.16 ( 0%) 94 kB ( 0%)
tree operand scan : 0.03 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 3619 kB ( 0%)
dominator optimization : 5.51 ( 3%) 0.10 ( 0%) 5.66 ( 2%) 165892 kB ( 3%)
backwards jump threading : 0.01 ( 0%) 0.01 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
isolate eroneous paths : 0.10 ( 0%) 0.00 ( 0%) 0.09 ( 0%) 0 kB ( 0%)
tree CCP : 14.39 ( 8%) 1.75 ( 7%) 16.89 ( 5%) 75 kB ( 0%)
tree PHI const/copy prop : 0.43 ( 0%) 0.00 ( 0%) 0.43 ( 0%) 0 kB ( 0%)
tree split crit edges : 0.40 ( 0%) 0.01 ( 0%) 0.44 ( 0%) 274 kB ( 0%)
tree reassociation : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
tree PRE : 3.41 ( 2%) 0.14 ( 1%) 3.55 ( 1%) 281 kB ( 0%)
tree FRE : 20.53 ( 11%) 0.04 ( 0%) 20.61 ( 6%) 5 kB ( 0%)
tree linearize phis : 0.01 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 6 kB ( 0%)
tree backward propagate : 1.41 ( 1%) 0.04 ( 0%) 1.92 ( 1%) 0 kB ( 0%)
tree forward propagate : 3.69 ( 2%) 0.00 ( 0%) 3.70 ( 1%) 0 kB ( 0%)
tree phiprop : 0.10 ( 0%) 0.00 ( 0%) 0.09 ( 0%) 0 kB ( 0%)
tree conservative DCE : 1.63 ( 1%) 0.00 ( 0%) 1.66 ( 0%) 0 kB ( 0%)
tree aggressive DCE : 3.75 ( 2%) 0.01 ( 0%) 3.77 ( 1%) 12 kB ( 0%)
tree DSE : 1.09 ( 1%) 0.00 ( 0%) 1.07 ( 0%) 0 kB ( 0%)
PHI merge : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
tree slp vectorization : 0.02 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 747 kB ( 0%)
tree SSA uncprop : 1.44 ( 1%) 0.00 ( 0%) 1.44 ( 0%) 0 kB ( 0%)
gimple widening/fma detection : 0.13 ( 0%) 0.00 ( 0%) 0.13 ( 0%) 0 kB ( 0%)
tree strlen optimization : 0.19 ( 0%) 0.00 ( 0%) 0.19 ( 0%) 0 kB ( 0%)
dominance computation : 0.03 ( 0%) 0.01 ( 0%) 0.13 ( 0%) 0 kB ( 0%)
control dependences : 0.02 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
out of ssa : 17.90 ( 9%) 0.06 ( 0%) 17.97 ( 5%) 47 kB ( 0%)
expand vars : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 444 kB ( 0%)
expand : 0.32 ( 0%) 0.01 ( 0%) 0.33 ( 0%) 5520 kB ( 0%)
forward prop : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 246 kB ( 0%)
CSE : 0.02 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 145 kB ( 0%)
dead code elimination : 0.02 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
dead store elim1 : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 720 kB ( 0%)
dead store elim2 : 0.03 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 1070 kB ( 0%)
loop init : 0.01 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 22 kB ( 0%)
CSE 2 : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 56 kB ( 0%)
branch prediction : 0.05 ( 0%) 0.01 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
combiner : 0.07 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 965 kB ( 0%)
integrated RA : 0.12 ( 0%) 0.00 ( 0%) 0.24 ( 0%) 2193 kB ( 0%)
LRA non-specific : 0.07 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 818 kB ( 0%)
LRA virtuals elimination : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 460 kB ( 0%)
LRA reload inheritance : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 132 kB ( 0%)
LRA create live ranges : 0.18 ( 0%) 0.00 ( 0%) 0.18 ( 0%) 87 kB ( 0%)
LRA hard reg assignment : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
LRA rematerialization : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
reload CSE regs : 0.06 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 1865 kB ( 0%)
load CSE after reload : 0.11 ( 0%) 0.00 ( 0%) 0.11 ( 0%) 138 kB ( 0%)
ree : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
thread pro- & epilogue : 0.02 ( 0%) 0.01 ( 0%) 0.03 ( 0%) 6 kB ( 0%)
combine stack adjustments : 0.01 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 37 kB ( 0%)
peephole 2 : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 348 kB ( 0%)
hard reg cprop : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
scheduling 2 : 0.13 ( 0%) 0.00 ( 0%) 0.13 ( 0%) 377 kB ( 0%)
reorder blocks : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 518 kB ( 0%)
shorten branches : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
final : 0.03 ( 0%) 0.00 ( 0%) 0.09 ( 0%) 1302 kB ( 0%)
straight-line strength reduction : 0.24 ( 0%) 0.00 ( 0%) 0.24 ( 0%) 0 kB ( 0%)
initialize rtl : 0.00 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 9 kB ( 0%)
rest of compilation : 1.18 ( 1%) 0.06 ( 0%) 1.28 ( 0%) 82789 kB ( 2%)
remove unused locals : 4.82 ( 3%) 0.00 ( 0%) 5.02 ( 1%) 0 kB ( 0%)
address taken : 1.88 ( 1%) 0.00 ( 0%) 1.91 ( 1%) 0 kB ( 0%)
TOTAL : 189.53 24.53 367.91 5299338 kB

####################################################################################

As you can see, compile-time performance degrades further with each newer GCC version: total wall time rises from 9.83 s with GCC 4.4.7 to 125.38 s with GCC 6.3.1 and 367.91 s with GCC 8.3.1, and total GGC memory grows from 203289 kB to 3112354 kB and 5299338 kB.

FYI, I also tried the same code with LLVM v8.0.0, which compiled similarly to GCC 4.4.7.

Best regards,
--
Rogerio