https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60243

--- Comment #27 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The profile_estimate issue is still here; the inliner and early inliner issues
seem solved. It seems that ipa_profile just orders the nodes for propagation
the wrong way: we propagate from callers to callees, while the toposorter
produces an order meant for propagation in the opposite direction.
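
To illustrate why the visiting order matters, here is a minimal toy sketch in
Python. It is not GCC's code: the function name, the "reached" flag and the
call chain are all hypothetical, and it only assumes that propagation is done
by sweeping the nodes in a fixed order until nothing changes.

```python
def sweeps_until_stable(callees, order, roots):
    """Propagate a 'reached' flag from callers to callees by repeatedly
    sweeping the nodes in the given order. Returns the number of sweeps,
    including the final sweep that confirms nothing changed."""
    reached = set(roots)
    sweeps = 0
    changed = True
    while changed:
        changed = False
        sweeps += 1
        for node in order:
            if node in reached:
                for callee in callees[node]:
                    if callee not in reached:
                        reached.add(callee)
                        changed = True
    return sweeps


# Hypothetical call chain f0 -> f1 -> ... -> f(n-1).
n = 1000
callees = {i: [i + 1] for i in range(n - 1)}
callees[n - 1] = []

callers_first = list(range(n))      # order that suits caller-to-callee flow
callees_first = callers_first[::-1] # order that suits the opposite direction

good = sweeps_until_stable(callees, callers_first, roots=[0])
bad = sweeps_until_stable(callees, callees_first, roots=[0])
print(good, bad)
```

With the callers-first order the flag reaches the whole chain in one sweep
plus one confirming sweep; with the callees-first order each sweep advances
the flag by only one node, so the number of sweeps grows with the length of
the chain.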

operand_scan seems slow too (tree operand scan: 7.64s usr, 11%).

Time variable                                   usr           sys          wall
              GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)
   1237 kB (  0%)
 phase parsing                      :   6.63 (  9%)   6.77 ( 77%)  13.41 ( 17%)
 655497 kB ( 20%)
 phase opt and generate             :  64.47 ( 91%)   2.07 ( 23%)  66.57 ( 83%)
2603397 kB ( 80%)
 garbage collection                 :   0.64 (  1%)   0.00 (  0%)   0.65 (  1%)
      0 kB (  0%)
 dump files                         :   0.05 (  0%)   0.01 (  0%)   0.04 (  0%)
      0 kB (  0%)
 callgraph construction             :   0.91 (  1%)   0.01 (  0%)   0.83 (  1%)
 399235 kB ( 12%)
 callgraph optimization             :   0.37 (  1%)   0.00 (  0%)   0.43 (  1%)
      0 kB (  0%)
 callgraph functions expansion      :  15.98 ( 22%)   1.20 ( 14%)  17.18 ( 21%)
 297309 kB (  9%)
 callgraph ipa passes               :  40.57 ( 57%)   0.40 (  5%)  40.99 ( 51%)
 617751 kB ( 19%)
 ipa function summary               :   0.14 (  0%)   0.00 (  0%)   0.14 (  0%)
   1807 kB (  0%)
 ipa dead code removal              :   0.22 (  0%)   0.00 (  0%)   0.24 (  0%)
      0 kB (  0%)
 ipa cp                             :   0.97 (  1%)   0.03 (  0%)   1.03 (  1%)
 327514 kB ( 10%)
 ipa inlining heuristics            :   0.72 (  1%)   0.00 (  0%)   0.63 (  1%)
  84183 kB (  3%)
 ipa function splitting             :   0.02 (  0%)   0.00 (  0%)   0.05 (  0%)
      0 kB (  0%)
 ipa various optimizations          :   0.69 (  1%)   0.20 (  2%)   0.89 (  1%)
 128398 kB (  4%)
 ipa reference                      :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)
      0 kB (  0%)
 ipa profile                        :  18.24 ( 26%)   0.00 (  0%)  18.25 ( 23%)
      0 kB (  0%)
 ipa pure const                     :   0.45 (  1%)   0.00 (  0%)   0.46 (  1%)
      0 kB (  0%)
 ipa icf                            :   0.17 (  0%)   0.02 (  0%)   0.17 (  0%)
      0 kB (  0%)
 ipa SRA                            :   0.21 (  0%)   0.00 (  0%)   0.21 (  0%)
    102 kB (  0%)
 ipa free inline summary            :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)
      0 kB (  0%)
 cfg cleanup                        :   0.00 (  0%)   0.01 (  0%)   0.02 (  0%)
      0 kB (  0%)
 trivially dead code                :   0.12 (  0%)   0.03 (  0%)   0.12 (  0%)
      0 kB (  0%)
 df scan insns                      :   0.85 (  1%)   0.14 (  2%)   1.28 (  2%)
     46 kB (  0%)
 df multiple defs                   :   0.30 (  0%)   0.06 (  1%)   0.31 (  0%)
      0 kB (  0%)
 df reaching defs                   :   0.69 (  1%)   0.05 (  1%)   0.63 (  1%)
      0 kB (  0%)
 df live regs                       :   0.49 (  1%)   0.02 (  0%)   0.57 (  1%)
      0 kB (  0%)
 df live&initialized regs           :   0.19 (  0%)   0.01 (  0%)   0.12 (  0%)
      0 kB (  0%)
 df must-initialized regs           :   0.10 (  0%)   0.00 (  0%)   0.10 (  0%)
      0 kB (  0%)
 df use-def / def-use chains        :   0.44 (  1%)   0.05 (  1%)   0.40 (  1%)
      0 kB (  0%)
 df reg dead/unused notes           :   1.35 (  2%)   0.09 (  1%)   1.15 (  1%)
    747 kB (  0%)
 register information               :   0.16 (  0%)   0.00 (  0%)   0.18 (  0%)
      0 kB (  0%)
 alias analysis                     :   0.16 (  0%)   0.00 (  0%)   0.11 (  0%)
    436 kB (  0%)
 alias stmt walking                 :   0.49 (  1%)   0.07 (  1%)   0.67 (  1%)
      0 kB (  0%)
 register scan                      :   0.04 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 rebuild jump labels                :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 preprocessing                      :   2.37 (  3%)   2.37 ( 27%)   4.49 (  6%)
 383477 kB ( 12%)
 lexical analysis                   :   1.88 (  3%)   2.13 ( 24%)   4.20 (  5%)
      0 kB (  0%)
 parser (global)                    :   0.01 (  0%)   0.01 (  0%)   0.03 (  0%)
   1442 kB (  0%)
 parser function body               :   2.19 (  3%)   2.26 ( 26%)   4.50 (  6%)
 270577 kB (  8%)
 early inlining heuristics          :   2.80 (  4%)   0.03 (  0%)   2.81 (  4%)
   3076 kB (  0%)
 inline parameters                  :   6.43 (  9%)   0.14 (  2%)   6.74 (  8%)
  31127 kB (  1%)
 integration                        :   0.17 (  0%)   0.00 (  0%)   0.08 (  0%)
   6789 kB (  0%)
 tree gimplify                      :   1.01 (  1%)   0.03 (  0%)   1.15 (  1%)
 610970 kB ( 19%)
 tree eh                            :   0.50 (  1%)   0.03 (  0%)   0.44 (  1%)
      0 kB (  0%)
 tree CFG construction              :   3.50 (  5%)   0.02 (  0%)   3.74 (  5%)
 628087 kB ( 19%)
 tree CFG cleanup                   :   0.69 (  1%)   0.03 (  0%)   0.67 (  1%)
      0 kB (  0%)
 tree tail merge                    :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 tree VRP                           :   0.09 (  0%)   0.03 (  0%)   0.15 (  0%)
   2241 kB (  0%)
 tree Early VRP                     :   0.06 (  0%)   0.00 (  0%)   0.08 (  0%)
   1047 kB (  0%)
 tree copy propagation              :   0.07 (  0%)   0.02 (  0%)   0.10 (  0%)
      0 kB (  0%)
 tree PTA                           :   0.40 (  1%)   0.02 (  0%)   0.37 (  0%)
     93 kB (  0%)
 tree SSA rewrite                   :   1.41 (  2%)   0.03 (  0%)   1.48 (  2%)
  90326 kB (  3%)
 tree SSA other                     :   0.15 (  0%)   0.00 (  0%)   0.13 (  0%)
    140 kB (  0%)
 tree SSA incremental               :   0.05 (  0%)   0.00 (  0%)   0.10 (  0%)
      0 kB (  0%)
 tree operand scan                  :   7.64 ( 11%)   0.26 (  3%)   7.95 ( 10%)
  95305 kB (  3%)
 dominator optimization             :   0.03 (  0%)   0.00 (  0%)   0.07 (  0%)
    155 kB (  0%)
 backwards jump threading           :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 isolate eroneous paths             :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 tree CCP                           :   0.10 (  0%)   0.00 (  0%)   0.15 (  0%)
      0 kB (  0%)
 tree PRE                           :   0.19 (  0%)   0.00 (  0%)   0.19 (  0%)
   1276 kB (  0%)
 tree FRE                           :   0.15 (  0%)   0.05 (  1%)   0.25 (  0%)
    701 kB (  0%)
 tree code sinking                  :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 tree linearize phis                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
   1042 kB (  0%)
 tree forward propagate             :   0.07 (  0%)   0.02 (  0%)   0.13 (  0%)
      0 kB (  0%)
 tree conservative DCE              :   0.47 (  1%)   0.13 (  1%)   0.52 (  1%)
      0 kB (  0%)
 tree aggressive DCE                :   0.23 (  0%)   0.03 (  0%)   0.23 (  0%)
   2090 kB (  0%)
 tree DSE                           :   0.03 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 PHI merge                          :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 gimple widening/fma detection      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 tree strlen optimization           :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
   1042 kB (  0%)
 dominance computation              :   0.24 (  0%)   0.03 (  0%)   0.18 (  0%)
      0 kB (  0%)
 out of ssa                         :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
      0 kB (  0%)
 expand                             :   0.22 (  0%)   0.07 (  1%)   0.38 (  0%)
 128974 kB (  4%)
 post expand cleanups               :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
    303 kB (  0%)
 forward prop                       :   0.26 (  0%)   0.03 (  0%)   0.16 (  0%)
      0 kB (  0%)
 CSE                                :   0.12 (  0%)   0.07 (  1%)   0.17 (  0%)
      0 kB (  0%)
 dead code elimination              :   0.09 (  0%)   0.00 (  0%)   0.11 (  0%)
      0 kB (  0%)
 dead store elim1                   :   0.25 (  0%)   0.00 (  0%)   0.30 (  0%)
  11613 kB (  0%)
 dead store elim2                   :   0.30 (  0%)   0.00 (  0%)   0.33 (  0%)
  11613 kB (  0%)
 loop init                          :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)
   4103 kB (  0%)
 loop fini                          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 CPROP                              :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 CSE 2                              :   0.14 (  0%)   0.02 (  0%)   0.18 (  0%)
     23 kB (  0%)
 branch prediction                  :   0.11 (  0%)   0.00 (  0%)   0.04 (  0%)
    101 kB (  0%)
 combiner                           :   0.21 (  0%)   0.01 (  0%)   0.24 (  0%)
      0 kB (  0%)
 integrated RA                      :   1.22 (  2%)   0.02 (  0%)   1.38 (  2%)
  23989 kB (  1%)
 LRA non-specific                   :   0.41 (  1%)   0.00 (  0%)   0.44 (  1%)
     54 kB (  0%)
 LRA virtuals elimination           :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 LRA reload inheritance             :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)
      0 kB (  0%)
 LRA hard reg assignment            :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 reload                             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
      0 kB (  0%)
 reload CSE regs                    :   0.33 (  0%)   0.00 (  0%)   0.30 (  0%)
     46 kB (  0%)
 ree                                :   0.07 (  0%)   0.00 (  0%)   0.16 (  0%)
      0 kB (  0%)
 thread pro- & epilogue             :   0.61 (  1%)   0.00 (  0%)   0.33 (  0%)
    855 kB (  0%)
 peephole 2                         :   0.05 (  0%)   0.01 (  0%)   0.06 (  0%)
      0 kB (  0%)
 hard reg cprop                     :   0.11 (  0%)   0.00 (  0%)   0.15 (  0%)
      0 kB (  0%)
 scheduling 2                       :   2.64 (  4%)   0.00 (  0%)   2.58 (  3%)
    244 kB (  0%)
 machine dep reorg                  :   0.08 (  0%)   0.01 (  0%)   0.09 (  0%)
      0 kB (  0%)
 reorder blocks                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 shorten branches                   :   0.09 (  0%)   0.00 (  0%)   0.12 (  0%)
      0 kB (  0%)
 final                              :   0.51 (  1%)   0.00 (  0%)   0.45 (  1%)
   1105 kB (  0%)
 straight-line strength reduction   :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)
      0 kB (  0%)
 initialize rtl                     :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)
     12 kB (  0%)
 address lowering                   :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
      0 kB (  0%)
 rest of compilation                :   0.78 (  1%)   0.10 (  1%)   0.90 (  1%)
   2365 kB (  0%)
 remove unused locals               :   0.04 (  0%)   0.00 (  0%)   0.05 (  0%)
      0 kB (  0%)
 address taken                      :   0.04 (  0%)   0.01 (  0%)   0.08 (  0%)
      0 kB (  0%)
TOTAL                              :  71.10          8.84         79.98       
3260140 kB
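
As a quick cross-check of the percentages, the following sketch recomputes the
usr share of the five largest individual passes from the report above. The
numbers are copied from the report; the variable names are only illustrative.

```python
# usr seconds copied from the report above
total_usr = 71.10
usr = {
    "ipa profile": 18.24,
    "tree operand scan": 7.64,
    "inline parameters": 6.43,
    "tree CFG construction": 3.50,
    "early inlining heuristics": 2.80,
}

# Share of total usr time, rounded to whole percent as in the report.
shares = {name: round(100 * t / total_usr) for name, t in usr.items()}

# Passes ordered from most to least expensive.
ranked = sorted(usr, key=usr.get, reverse=True)
for name in ranked:
    print(f"{name:28s} {usr[name]:6.2f}s {shares[name]:3d}%")
```

ipa profile and tree operand scan together account for roughly 36% of the usr
time.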
