On Mon, 9 Sep 2013, Steven Bosscher wrote:

> On Mon, Sep 9, 2013 at 10:01 AM, Richard Biener wrote:
> >> >> First, the loop passes that at the moment precede IVOPTs leave
> >> >> around IL that is in desperate need of basic re-optimization
> >> >> like CSE, constant propagation and DCE.  That puts extra load
> >> >> on IVOPTs and its cost model, increasing compile-time and
> >> >> possibly confusing it.
> 
> So why not just run DCE and DOM just before IVOPTs?

Another DCE and DOM?  The patch moving IVOPTs just moves it
after the existing DOM (right before the last DCE).

The question is whether we want to pay the compile-time penalty
of not making passes deal with dead code, existing full redundancies,
not constant/copy-propagated IL, etc. by inserting these
passes over and over in random places.  Of course CSE (be it
DOM or FRE) is not exactly cheap.

> >> >> Second, IVOPTs introduces lowered memory accesses that it
> >> >> expects to stay as is, likewise it produces auto-inc/dec
> >> >> sequences that it expects to stay as is until RTL expansion.
> >> >> Passes such as DOM can break this expectation and make the
> >> >> work done by IVOPTs a waste.
> 
> But IVOPTs leaves its own messy code behind (see below).
> 
> 
> >> >> I remember doing this exercise in the GCC 4.3 timeframe where
> >> >> benchmarking on x86_64 showed no gains or losses (but x86_64
> >> >> isn't very sensitive to IV choices).
> >> >>
> >> >> Any help with benchmarking this on targets other than x86_64
> >> >> is appreciated (I'll re-do x86_64).
> 
> Targets like POWER and ARM would be interesting to test on.
> 
> 
> > We already run LIM twice; moving the one that is currently after
> > IVOPTs as well should be easy.  But of course, as you note, IVOPTs
> > may introduce loop-invariant code; it may also introduce full
> > redundancies in the way it rewrites IVs.  And for both, people may
> > claim that we have both CSE and LIM on the RTL level, too.
> 
> I would claim that relying on RTL (G)CSE and RTL LIM is a step in the
> wrong direction. You end up creating a lot of garbage RTL, and many
> transformations that are easy on GIMPLE cannot be performed anymore on
> RTL.

;)

> Is it possible to make IVOPTs clean up after itself? It should be easy
> for IVOPTs to notice that it creates loop-invariant code, and position
> it on the loop pre-header edge. I suppose full redundancies are
> harder, but I would expect that to happen less frequently (the only
> situation I can think of right now is where a loop is rewritten with
> two IVs where the two IVs share a common sub-expression).

The issue is that IVOPTs generates a tree for each IV use
replacement and simply feeds it through force_gimple_operand.
In theory it should be possible to not go affine-comb -> tree -> gimple
but instead directly affine-comb -> gimple, doing along the way the
re-association that Bin Cheng notes would be nice to have.
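
To make the idea concrete, here is a toy Python model (not GCC code;
Emitter and lower_affine are made-up names, and nothing below
corresponds to the real IVOPTs data structures).  It lowers an affine
combination "offset + sum(coef * var)" straight to three-address
statements, emitting the terms in a fixed order and reusing any
statement that was already emitted, so two IV uses that share a
sub-expression do not create a full redundancy:

```python
from dataclasses import dataclass, field


@dataclass
class Emitter:
    """Collects three-address statements and reuses identical ones."""
    stmts: list = field(default_factory=list)
    cache: dict = field(default_factory=dict)

    def emit(self, op, a, b):
        # Look up (op, a, b) first; only create a new temporary on a miss.
        key = (op, a, b)
        if key not in self.cache:
            tmp = f"t{len(self.stmts)}"
            self.stmts.append(f"{tmp} = {a} {op} {b}")
            self.cache[key] = tmp
        return self.cache[key]


def lower_affine(emitter, offset, terms):
    """Lower offset + sum(coef * var) to statements; return the result name.

    terms maps a variable name to its integer coefficient.
    """
    parts = []
    # Visiting variables in sorted order gives every combination the same
    # association, so shared prefixes hit the cache.
    for var in sorted(terms):
        coef = terms[var]
        if coef == 0:
            continue
        parts.append(var if coef == 1 else emitter.emit("*", var, coef))

    result = None
    for part in parts:
        result = part if result is None else emitter.emit("+", result, part)

    if result is None:
        return str(offset)      # pure constant, nothing to emit
    if offset != 0:
        result = emitter.emit("+", result, offset)
    return result


if __name__ == "__main__":
    e = Emitter()
    lower_affine(e, 0, {"base": 1, "i": 4})    # address of a[i]
    lower_affine(e, 16, {"i": 4, "base": 1})   # address of a[i + 4]
    print("\n".join(e.stmts))
```

In this model the second use only adds one statement on top of what
the first use produced.  Whether the same can be done cheaply inside
IVOPTs is of course the open question.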

IVOPTs tries to clean the IL from dead code (unused IVs) but ISTR
it fails for some cases.

Richard.
