https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #62 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #61)
> (In reply to Richard Biener from comment #60)
> > (In reply to Tamar Christina from comment #59)
> > > after ifcvt we end up with:
> > > 
> > >   _162 = chrg_init_70 * iftmp.8_76;
> > >   _164 = ABS_EXPR <_162>;
> > >   _167 = -_164;
> > >   _ifc__166 = distbb_74 < iftmp.0_97 ? _167 : 0.0;
> > >   prephitmp_169 = distbb_74 >= 0.0 ? _ifc__166 : _168;
> > >   
> > > instead of
> > > 
> > >   _160 = chrg_init_75 * iftmp.8_80;
> > >   prephitmp_161 = distbb_79 < 0.0 ? chrg_init_75 : _160;
> > >   _164 = ABS_EXPR <prephitmp_161>;
> > >   _166 = -_164;
> > >   prephitmp_167 = distbb_79 < iftmp.0_96 ? _166 : 0.0;
> > > 
> > > previously we'd make COND_MUL and COND_NEG and so don't need a VCOND in
> > > the end, now we select after the multiplication, so we only have a
> > > COND_NEG followed by a VCOND.
> > > 
> > > This is obviously worse, but I have no idea how to recover it.  Any ideas?
> > 
> > None.  This is with -O3, right?  Can you try selectively disabling parts
> > of PRE with -fno-tree-partial-pre -fno-code-hoisting?  But I suspect it's
> > the improvement for general PRE that we hit here.
> > 
> 
> Those don't seem to make a difference sadly.
> 
> > One idea that was always floating around was to move PRE after loop opts
> > like we did with predcom.  But having no PRE before the loop opts will
> > likely hurt as well, so we might instead want to limit PRE when it
> > involves generating constants in PHIs and schedule another PRE after
> > loop opts (at some cost then).  It's something to experiment with ...
> 
> It looks like `-fno-tree-pre` does the trick, but then of course it messes
> things up elsewhere.  The conditional statements seem to stay in the most
> complicated form possible in scalar code.
> 
> I'll try to track down what to turn off and experiment with a pre2 after
> vect.  Is before predcom a good place?

I would avoid putting it into the loop pipeline.  Instead I'd turn the
FRE pass that runs after tracer into PRE.  Maybe conditional on whether
there are any loops.

Note it's not so easy to "tame" PRE; the existing taming happens at
elimination time in eliminate_dom_walker::eliminate_stmt.  I would
experiment with restricting the use of inserted PHIs in innermost(!)
loops containing invariants, maybe only if the number of PHI args is
more than two ... (but that's somewhat artificial).
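
To make the shape of that restriction concrete, here is a rough standalone
sketch of the predicate.  It is not GCC code: struct phi_summary and
allow_inserted_phi_use are hypothetical names made up for illustration, and
the real check would have to be written against the actual PRE/elimination
data structures.

```c
#include <stdbool.h>

/* Hypothetical summary of a PHI inserted by PRE; NOT a GCC data structure. */
struct phi_summary {
    bool in_innermost_loop;  /* PHI lives in a loop that has no inner loops */
    bool has_invariant_arg;  /* at least one argument is an invariant/constant */
    unsigned num_args;       /* number of incoming PHI arguments */
};

/* Return true if elimination may use the inserted PHI.
   Assumed rule: refuse only for PHIs in innermost loops that carry an
   invariant argument and have more than two arguments. */
bool allow_inserted_phi_use(const struct phi_summary *phi)
{
    if (!phi->in_innermost_loop)
        return true;
    if (!phi->has_invariant_arg)
        return true;
    return phi->num_args <= 2;
}
```

With this rule a two-argument PHI is still used even inside an innermost
loop; only the wider ones are rejected, which is the somewhat artificial
part.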

That said, I'm not really convinced this is a good idea.
