https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116286
Bug ID: 116286
Summary: Compilation of nodejs/v8 v8_turboshaft.csa-optimize-phase.cc is slow
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: compile-time-hog
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: sjames at gcc dot gnu.org
Target Milestone: ---

Clang takes about the same time to build the non-preprocessed version of this, so maybe it's unavoidable. It's fine if there's nothing to be done here.

On releases/gcc-13 with --enable-checking=release:
```
$ time /tmp/bisect-gcc-pfx/bin/g++ -c v8_turboshaft.csa-optimize-phase.ii -O2 -march=znver2 -std=gnu++20 -ftime-report -c -w

Time variable                         usr           sys          wall            GGC
phase setup                      :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)   2151k (  0%)
phase parsing                    :  11.90 ( 25%)   2.33 ( 45%)  14.28 ( 27%)   1359M ( 38%)
phase lang. deferred             :   5.23 ( 11%)   0.66 ( 13%)   5.92 ( 11%)    556M ( 16%)
phase opt and generate           :  30.34 ( 64%)   2.23 ( 43%)  32.71 ( 62%)   1633M ( 46%)
|name lookup                     :   4.31 (  9%)   0.55 ( 11%)   4.76 (  9%)     53M (  2%)
|overload resolution             :   6.17 ( 13%)   0.74 ( 14%)   7.07 ( 13%)    525M ( 15%)
garbage collection               :   2.39 (  5%)   0.00 (  0%)   2.39 (  5%)       0 (  0%)
dump files                       :   0.43 (  1%)   0.04 (  1%)   0.58 (  1%)       0 (  0%)
callgraph construction           :   0.79 (  2%)   0.06 (  1%)   0.69 (  1%)     66M (  2%)
callgraph optimization           :   0.82 (  2%)   0.12 (  2%)   0.97 (  2%)   1360k (  0%)
callgraph functions expansion    :  19.26 ( 41%)   0.80 ( 15%)  20.15 ( 38%)    703M ( 20%)
callgraph ipa passes             :   9.58 ( 20%)   1.25 ( 24%)  10.87 ( 21%)    630M ( 18%)
ipa function summary             :   0.13 (  0%)   0.01 (  0%)   0.17 (  0%)  10201k (  0%)
ipa dead code removal            :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)       0 (  0%)
ipa inheritance graph            :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)     27k (  0%)
ipa cp                           :   0.17 (  0%)   0.01 (  0%)   0.15 (  0%)   9702k (  0%)
ipa inlining heuristics          :   0.34 (  1%)   0.00 (  0%)   0.38 (  1%)     29M (  1%)
ipa function splitting           :   0.04 (  0%)   0.00 (  0%)   0.09 (  0%)   1082k (  0%)
ipa comdats                      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 (  0%)
ipa reference                    :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 (  0%)
ipa pure const                   :   0.15 (  0%)   0.03 (  1%)   0.22 (  0%)    394k (  0%)
ipa icf                          :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)     17k (  0%)
ipa SRA                          :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)   5722k (  0%)
ipa free lang data               :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)       0 (  0%)
ipa free inline summary          :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)       0 (  0%)
ipa modref                       :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)   6211k (  0%)
cfg construction                 :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)   1728k (  0%)
cfg cleanup                      :   0.24 (  1%)   0.00 (  0%)   0.23 (  0%)   4876k (  0%)
trivially dead code              :   0.03 (  0%)   0.00 (  0%)   0.08 (  0%)       0 (  0%)
df scan insns                    :   0.12 (  0%)   0.00 (  0%)   0.07 (  0%)     58k (  0%)
df reaching defs                 :   0.11 (  0%)   0.00 (  0%)   0.14 (  0%)       0 (  0%)
df live regs                     :   0.53 (  1%)   0.03 (  1%)   0.54 (  1%)       0 (  0%)
df live&initialized regs         :   0.27 (  1%)   0.00 (  0%)   0.18 (  0%)       0 (  0%)
df use-def / def-use chains      :   0.07 (  0%)   0.00 (  0%)   0.05 (  0%)       0 (  0%)
df live reg subwords             :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 (  0%)
df reg dead/unused notes         :   0.28 (  1%)   0.01 (  0%)   0.30 (  1%)   7528k (  0%)
register information             :   0.03 (  0%)   0.00 (  0%)   0.01 (  0%)       0 (  0%)
alias analysis                   :   0.23 (  0%)   0.01 (  0%)   0.30 (  1%)     20M (  1%)
alias stmt walking               :   0.68 (  1%)   0.09 (  2%)   0.85 (  2%)   5830k (  0%)
register scan                    :   0.04 (  0%)   0.00 (  0%)   0.01 (  0%)   1225k (  0%)
rebuild jump labels              :   0.05 (  0%)   0.00 (  0%)   0.10 (  0%)       0 (  0%)
preprocessing                    :   0.41 (  1%)   0.43 (  8%)   0.77 (  1%)     39M (  1%)
parser (global)                  :   1.23 (  3%)   0.47 (  9%)   1.75 (  3%)    235M (  7%)
parser struct body               :   1.45 (  3%)   0.14 (  3%)   1.54 (  3%)    190M (  5%)
parser enumerator list           :   0.03 (  0%)   0.00 (  0%)   0.06 (  0%)   4811k (  0%)
parser function body             :   0.11 (  0%)   0.02 (  0%)   0.13 (  0%)   6863k (  0%)
parser inl. func. body           :   1.33 (  3%)   0.18 (  3%)   1.50 (  3%)    129M (  4%)
parser inl. meth. body           :   1.30 (  3%)   0.25 (  5%)   1.62 (  3%)    138M (  4%)
template instantiation           :   8.85 ( 19%)   1.46 ( 28%)  10.50 ( 20%)   1147M ( 32%)
constant expression evaluation   :   0.74 (  2%)   0.03 (  1%)   0.67 (  1%)     17M (  0%)
constraint normalization         :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)   1058k (  0%)
constraint satisfaction          :   0.06 (  0%)   0.00 (  0%)   0.03 (  0%)   5451k (  0%)
early inlining heuristics        :   0.20 (  0%)   0.05 (  1%)   0.35 (  1%)     38M (  1%)
inline parameters                :   0.48 (  1%)   0.08 (  2%)   0.43 (  1%)     38M (  1%)
integration                      :   2.16 (  5%)   0.19 (  4%)   2.26 (  4%)    331M (  9%)
tree gimplify                    :   0.42 (  1%)   0.06 (  1%)   0.63 (  1%)    107M (  3%)
tree eh                          :   0.11 (  0%)   0.01 (  0%)   0.16 (  0%)     22M (  1%)
tree CFG construction            :   0.13 (  0%)   0.03 (  1%)   0.10 (  0%)    114M (  3%)
tree CFG cleanup                 :   0.76 (  2%)   0.08 (  2%)   0.92 (  2%)   3217k (  0%)
tree tail merge                  :   0.00 (  0%)   0.00 (  0%)   0.06 (  0%)   3434k (  0%)
tree VRP                         :   0.95 (  2%)   0.02 (  0%)   0.80 (  2%)   8691k (  0%)
tree Early VRP                   :   0.58 (  1%)   0.05 (  1%)   0.55 (  1%)     27M (  1%)
tree copy propagation            :   0.07 (  0%)   0.01 (  0%)   0.12 (  0%)   1108k (  0%)
tree PTA                         :   1.14 (  2%)   0.12 (  2%)   1.19 (  2%)     15M (  0%)
tree SSA other                   :   0.04 (  0%)   0.00 (  0%)   0.01 (  0%)   1035k (  0%)
tree SSA rewrite                 :   0.17 (  0%)   0.09 (  2%)   0.25 (  0%)     32M (  1%)
tree SSA incremental             :   0.43 (  1%)   0.02 (  0%)   0.56 (  1%)     35M (  1%)
tree operand scan                :   0.52 (  1%)   0.10 (  2%)   0.56 (  1%)    116M (  3%)
dominator optimization           :   0.94 (  2%)   0.03 (  1%)   1.15 (  2%)     19M (  1%)
backwards jump threading         :   0.53 (  1%)   0.05 (  1%)   0.55 (  1%)   7423k (  0%)
tree SRA                         :   0.47 (  1%)   0.04 (  1%)   0.52 (  1%)     35M (  1%)
isolate eroneous paths           :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)       0 (  0%)
tree CCP                         :   0.91 (  2%)   0.05 (  1%)   0.91 (  2%)   8212k (  0%)
tree split crit edges            :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   2945k (  0%)
tree reassociation               :   0.06 (  0%)   0.00 (  0%)   0.04 (  0%)    390k (  0%)
tree PRE                         :   0.75 (  2%)   0.02 (  0%)   0.55 (  1%)     15M (  0%)
tree FRE                         :   0.95 (  2%)   0.07 (  1%)   0.98 (  2%)     19M (  1%)
tree RPO VN                      :   0.05 (  0%)   0.00 (  0%)   0.07 (  0%)    634k (  0%)
tree code sinking                :   0.06 (  0%)   0.01 (  0%)   0.10 (  0%)   6147k (  0%)
tree linearize phis              :   0.05 (  0%)   0.01 (  0%)   0.06 (  0%)   4577k (  0%)
tree backward propagate          :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)       0 (  0%)
tree forward propagate           :   0.31 (  1%)   0.07 (  1%)   0.45 (  1%)     10M (  0%)
tree phiprop                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)     86k (  0%)
tree conservative DCE            :   0.14 (  0%)   0.01 (  0%)   0.19 (  0%)   1309k (  0%)
tree aggressive DCE              :   0.25 (  1%)   0.03 (  1%)   0.34 (  1%)     29M (  1%)
tree DSE                         :   0.34 (  1%)   0.02 (  0%)   0.29 (  1%)   6567k (  0%)
PHI merge                        :   0.00 (  0%)   0.00 (  0%)   0.03 (  0%)    606k (  0%)
tree loop optimization           :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)       0 (  0%)
tree loop invariant motion       :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)     36k (  0%)
tree canonical iv                :   0.03 (  0%)   0.00 (  0%)   0.00 (  0%)   1456k (  0%)
complete unrolling               :   0.17 (  0%)   0.01 (  0%)   0.18 (  0%)     13M (  0%)
tree vectorization               :   0.04 (  0%)   0.00 (  0%)   0.02 (  0%)   1300k (  0%)
tree slp vectorization           :   0.41 (  1%)   0.04 (  1%)   0.43 (  1%)     66M (  2%)
tree loop distribution           :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)   1200k (  0%)
tree iv optimization             :   0.09 (  0%)   0.00 (  0%)   0.05 (  0%)   4989k (  0%)
predictive commoning             :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)    601k (  0%)
tree copy headers                :   0.05 (  0%)   0.01 (  0%)   0.04 (  0%)   2159k (  0%)
tree SSA uncprop                 :   0.03 (  0%)   0.00 (  0%)   0.00 (  0%)     25k (  0%)
tree NRV optimization            :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     23k (  0%)
tree switch conversion           :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)    8360 (  0%)
tree switch lowering             :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)    943k (  0%)
gimple widening/fma detection    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    1120 (  0%)
tree strlen optimization         :   0.03 (  0%)   0.00 (  0%)   0.02 (  0%)   1475k (  0%)
tree modref                      :   0.14 (  0%)   0.03 (  1%)   0.18 (  0%)   9907k (  0%)
dominance frontiers              :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)       0 (  0%)
dominance computation            :   0.52 (  1%)   0.02 (  0%)   0.71 (  1%)       0 (  0%)
control dependences              :   0.03 (  0%)   0.00 (  0%)   0.02 (  0%)       0 (  0%)
out of ssa                       :   0.04 (  0%)   0.00 (  0%)   0.07 (  0%)     41k (  0%)
expand vars                      :   0.09 (  0%)   0.00 (  0%)   0.10 (  0%)   8179k (  0%)
expand                           :   0.39 (  1%)   0.03 (  1%)   0.40 (  1%)     53M (  2%)
post expand cleanups             :   0.01 (  0%)   0.00 (  0%)   0.05 (  0%)   1904k (  0%)
varconst                         :   0.02 (  0%)   0.01 (  0%)   0.05 (  0%)     76k (  0%)
lower subreg                     :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)    161k (  0%)
jump                             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)       0 (  0%)
forward prop                     :   0.15 (  0%)   0.00 (  0%)   0.24 (  0%)   1134k (  0%)
CSE                              :   0.38 (  1%)   0.02 (  0%)   0.31 (  1%)   1770k (  0%)
dead code elimination            :   0.07 (  0%)   0.00 (  0%)   0.03 (  0%)       0 (  0%)
dead store elim1                 :   0.18 (  0%)   0.01 (  0%)   0.09 (  0%)   6320k (  0%)
dead store elim2                 :   0.13 (  0%)   0.00 (  0%)   0.15 (  0%)   7399k (  0%)
loop analysis                    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)       0 (  0%)
loop init                        :   0.36 (  1%)   0.05 (  1%)   0.36 (  1%)     40M (  1%)
loop invariant motion            :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     17k (  0%)
loop unrolling                   :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)   1038k (  0%)
loop fini                        :   0.09 (  0%)   0.01 (  0%)   0.09 (  0%)     15k (  0%)
CPROP                            :   0.25 (  1%)   0.01 (  0%)   0.19 (  0%)   9876k (  0%)
PRE                              :   0.20 (  0%)   0.00 (  0%)   0.13 (  0%)    322k (  0%)
CSE 2                            :   0.35 (  1%)   0.00 (  0%)   0.23 (  0%)    966k (  0%)
branch prediction                :   0.12 (  0%)   0.04 (  1%)   0.17 (  0%)   5989k (  0%)
combiner                         :   0.90 (  2%)   0.03 (  1%)   0.92 (  2%)     32M (  1%)
if-conversion                    :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)    921k (  0%)
integrated RA                    :   0.85 (  2%)   0.05 (  1%)   0.94 (  2%)     71M (  2%)
LRA non-specific                 :   0.24 (  1%)   0.00 (  0%)   0.33 (  1%)   3642k (  0%)
LRA virtuals elimination         :   0.05 (  0%)   0.00 (  0%)   0.03 (  0%)   1655k (  0%)
LRA reload inheritance           :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)    596k (  0%)
LRA create live ranges           :   0.08 (  0%)   0.00 (  0%)   0.12 (  0%)    419k (  0%)
LRA hard reg assignment          :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)       0 (  0%)
LRA rematerialization            :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)    1520 (  0%)
reload                           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     33k (  0%)
reload CSE regs                  :   0.38 (  1%)   0.01 (  0%)   0.33 (  1%)   8161k (  0%)
ree                              :   0.01 (  0%)   0.00 (  0%)   0.04 (  0%)    274k (  0%)
thread pro- & epilogue           :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)   4809k (  0%)
if-conversion 2                  :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)     30k (  0%)
combine stack adjustments        :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)    1392 (  0%)
peephole 2                       :   0.02 (  0%)   0.00 (  0%)   0.07 (  0%)    413k (  0%)
hard reg cprop                   :   0.12 (  0%)   0.00 (  0%)   0.11 (  0%)     72k (  0%)
scheduling 2                     :   0.49 (  1%)   0.00 (  0%)   0.64 (  1%)   3694k (  0%)
machine dep reorg                :   0.06 (  0%)   0.00 (  0%)   0.08 (  0%)    918k (  0%)
reorder blocks                   :   0.03 (  0%)   0.01 (  0%)   0.05 (  0%)   3306k (  0%)
shorten branches                 :   0.04 (  0%)   0.00 (  0%)   0.08 (  0%)       0 (  0%)
reg stack                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     12k (  0%)
final                            :   0.27 (  1%)   0.00 (  0%)   0.18 (  0%)     11M (  0%)
symout                           :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)       0 (  0%)
tree if-combine                  :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    225k (  0%)
if to switch conversion          :   0.01 (  0%)   0.00 (  0%)   0.04 (  0%)     28k (  0%)
straight-line strength reduction :   0.01 (  0%)   0.00 (  0%)   0.06 (  0%)    213k (  0%)
store merging                    :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)    897k (  0%)
address lowering                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    1656 (  0%)
access analysis                  :   0.12 (  0%)   0.00 (  0%)   0.16 (  0%)    438k (  0%)
early local passes               :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)       0 (  0%)
rest of compilation              :   0.43 (  1%)   0.03 (  1%)   0.63 (  1%)   6967k (  0%)
remove unused locals             :   0.26 (  1%)   0.05 (  1%)   0.30 (  1%)     67k (  0%)
address taken                    :   0.21 (  0%)   0.03 (  1%)   0.11 (  0%)     768 (  0%)
rebuild frequencies              :   0.03 (  0%)   0.00 (  0%)   0.01 (  0%)    150k (  0%)
repair loop structures           :   0.04 (  0%)   0.01 (  0%)   0.01 (  0%)    6040 (  0%)
TOTAL                            :  47.47          5.22         52.91          3551M

real 0m53.443s
user 0m47.685s
sys 0m5.515s
```
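For anyone triaging this, a report of this size is easier to read when its rows are ranked by wall time. The following is a minimal sketch, not part of GCC or of the original report: `ROW` and `top_by_wall` are hypothetical names, and the regular expression assumes the row shape seen above (`name : usr (p%) sys (p%) wall (p%) GGC (p%)`). The `TOTAL` row carries no percentages, so it is skipped.

```python
import re

# Assumed row shape, taken from the report above:
#   name : usr ( p%) sys ( p%) wall ( p%) GGC ( p%)
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)"
)


def top_by_wall(report, n=5):
    """Return the n (name, wall_seconds) pairs with the largest wall time."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m:  # lines without percentages (e.g. TOTAL) do not match
            rows.append((m.group("name"), float(m.group("wall"))))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


# A few rows copied from the report above, used as sample input.
SAMPLE = """\
 phase parsing : 11.90 ( 25%) 2.33 ( 45%) 14.28 ( 27%) 1359M ( 38%)
 phase opt and generate : 30.34 ( 64%) 2.23 ( 43%) 32.71 ( 62%) 1633M ( 46%)
 callgraph functions expansion : 19.26 ( 41%) 0.80 ( 15%) 20.15 ( 38%) 703M ( 20%)
 template instantiation : 8.85 ( 19%) 1.46 ( 28%) 10.50 ( 20%) 1147M ( 32%)
 TOTAL : 47.47 5.22 52.91 3551M
"""

for name, wall in top_by_wall(SAMPLE, n=3):
    print(f"{wall:6.2f}s  {name}")
```

Note that the rows overlap (for example, the `phase` rows are totals that include the finer-grained timers), so the ranked values should not be summed.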