Hi, as the bug report tells us one speed problem is loop header copying, in particular the update_ssa call that is done for each and every copied loop header but touches all blocks in a function.
Now, one idea was to use an optimized update_ssa that works only on the relevant subset of blocks (it's dominance frontiers that are the problem). I've experimented with the original formulation of frontiers as per Cytron which allows to calculate the domfrontier of one basic block lazily. The end result was none, no speedup, no slowdown. I haven't investigated, but I guess the problem is that too often most of the blocks are relevant for most of the header copies, either because of virtual ops or because calculating the domfrontier of a block needs all domfrontiers of all dominated childs (i.e. the domfrontier of the entry needs domfrontiers of all blocks). So, no cake there. But actually there's no reason that we need to keep SSA form uptodate during the multiple header copyings. We use gimple_duplicate_sese_region to do the copying which updates the SSA web before returning (actually loop header copying is the only caller of it). The next thing done is just another call to gimple_duplicate_sese_region to copy some other BBs, then some split edges, repeat from start. We can just as well defer the whole SSA web updating to after we've duplicated everything we want. That's what this patch does. Time for various things on the testcase (with -O1): without with patch tree SSA rewrite 26.2s 4.8s tree SSA incremental 21.7s 4.6s dominance computation 15.0s 4.2s dominance frontiers 25.6s 6.7s TOTAL 135.6s 67.8s Regstrapped on x86_64-linux, no regressions. Okay for trunk? Ciao, Michael. ------------------ PR tree-optimization/46590 * tree-cfg.c (gimple_duplicate_sese_region): Don't update SSA web here ... * tree-ssa-loop-ch.c (copy_loop_headers): ... but here. Index: tree-cfg.c =================================================================== --- tree-cfg.c (revision 190803) +++ tree-cfg.c (working copy) @@ -5530,9 +5530,10 @@ add_phi_args_after_copy (basic_block *re important exit edge EXIT. By important we mean that no SSA name defined inside region is live over the other exit edges of the region. All entry edges to the region must go to ENTRY->dest. The edge ENTRY is redirected - to the duplicate of the region. SSA form, dominance and loop information - is updated. The new basic blocks are stored to REGION_COPY in the same - order as they had in REGION, provided that REGION_COPY is not NULL. + to the duplicate of the region. Dominance and loop information is + updated, but not the SSA web. The new basic blocks are stored to + REGION_COPY in the same order as they had in REGION, provided that + REGION_COPY is not NULL. The function returns false if it is unable to copy the region, true otherwise. */ @@ -5593,8 +5594,6 @@ gimple_duplicate_sese_region (edge entry free_region_copy = true; } - gcc_assert (!need_ssa_update_p (cfun)); - /* Record blocks outside the region that are dominated by something inside. */ doms = NULL; @@ -5663,9 +5662,6 @@ gimple_duplicate_sese_region (edge entry /* Add the other PHI node arguments. */ add_phi_args_after_copy (region_copy, n_region, NULL); - /* Update the SSA web. */ - update_ssa (TODO_update_ssa); - if (free_region_copy) free (region_copy); Index: tree-ssa-loop-ch.c =================================================================== --- tree-ssa-loop-ch.c (revision 190803) +++ tree-ssa-loop-ch.c (working copy) @@ -241,6 +241,7 @@ copy_loop_headers (void) split_edge (loop_latch_edge (loop)); } + update_ssa (TODO_update_ssa); free (bbs); free (copied_bbs);