On Sun, Sep 2, 2012 at 9:35 PM, Michael Matz <m...@suse.de> wrote: > Hi, > > as the bug report tells us one speed problem is loop header copying, in > particular the update_ssa call that is done for each and every copied loop > header but touches all blocks in a function. > > Now, one idea was to use an optimized update_ssa that works only on the > relevant subset of blocks (it's dominance frontiers that are the problem). > I've experimented with the original formulation of frontiers as per Cytron > which allows to calculate the domfrontier of one basic block lazily. The > end result was none, no speedup, no slowdown. I haven't investigated, > but I guess the problem is that too often most of the blocks are relevant > for most of the header copies, either because of virtual ops or because > calculating the domfrontier of a block needs all domfrontiers of all > dominated childs (i.e. the domfrontier of the entry needs domfrontiers of > all blocks). So, no cake there. > > But actually there's no reason that we need to keep SSA form uptodate > during the multiple header copyings. We use gimple_duplicate_sese_region > to do the copying which updates the SSA web before returning (actually > loop header copying is the only caller of it). The next thing done is > just another call to gimple_duplicate_sese_region to copy some other BBs, > then some split edges, repeat from start. We can just as well defer the > whole SSA web updating to after we've duplicated everything we want. > > That's what this patch does. Time for various things on the testcase > (with -O1): > > without with patch > tree SSA rewrite 26.2s 4.8s > tree SSA incremental 21.7s 4.6s > dominance computation 15.0s 4.2s > dominance frontiers 25.6s 6.7s > TOTAL 135.6s 67.8s > > Regstrapped on x86_64-linux, no regressions. Okay for trunk?
Ok. Thanks, Richard. > > Ciao, > Michael. > ------------------ > PR tree-optimization/46590 > > * tree-cfg.c (gimple_duplicate_sese_region): Don't update > SSA web here ... > * tree-ssa-loop-ch.c (copy_loop_headers): ... but here. > > Index: tree-cfg.c > =================================================================== > --- tree-cfg.c (revision 190803) > +++ tree-cfg.c (working copy) > @@ -5530,9 +5530,10 @@ add_phi_args_after_copy (basic_block *re > important exit edge EXIT. By important we mean that no SSA name defined > inside region is live over the other exit edges of the region. All entry > edges to the region must go to ENTRY->dest. The edge ENTRY is redirected > - to the duplicate of the region. SSA form, dominance and loop information > - is updated. The new basic blocks are stored to REGION_COPY in the same > - order as they had in REGION, provided that REGION_COPY is not NULL. > + to the duplicate of the region. Dominance and loop information is > + updated, but not the SSA web. The new basic blocks are stored to > + REGION_COPY in the same order as they had in REGION, provided that > + REGION_COPY is not NULL. > The function returns false if it is unable to copy the region, > true otherwise. */ > > @@ -5593,8 +5594,6 @@ gimple_duplicate_sese_region (edge entry > free_region_copy = true; > } > > - gcc_assert (!need_ssa_update_p (cfun)); > - > /* Record blocks outside the region that are dominated by something > inside. */ > doms = NULL; > @@ -5663,9 +5662,6 @@ gimple_duplicate_sese_region (edge entry > /* Add the other PHI node arguments. */ > add_phi_args_after_copy (region_copy, n_region, NULL); > > - /* Update the SSA web. */ > - update_ssa (TODO_update_ssa); > - > if (free_region_copy) > free (region_copy); > > Index: tree-ssa-loop-ch.c > =================================================================== > --- tree-ssa-loop-ch.c (revision 190803) > +++ tree-ssa-loop-ch.c (working copy) > @@ -241,6 +241,7 @@ copy_loop_headers (void) > split_edge (loop_latch_edge (loop)); > } > > + update_ssa (TODO_update_ssa); > free (bbs); > free (copied_bbs); >