On Wed, 7 Dec 2016, Jeff Law wrote:

> On 12/06/2016 03:16 AM, Richard Biener wrote:
> > On Tue, 6 Dec 2016, Richard Biener wrote:
> > 
> > > On Mon, 5 Dec 2016, Jeff Law wrote:
> > > 
> > > > On 12/02/2016 01:33 AM, Richard Biener wrote:
> > > > > > The LHS on the assignment makes it easier to identify when a tail
> > > > > > call is possible.  It's not needed for correctness.  Not having
> > > > > > the LHS on the assignment just means we won't get an optimized
> > > > > > tail call.
> > > > > > 
> > > > > > Under what circumstances would the LHS possibly be removed?  We
> > > > > > know the return statement references the LHS, so it's not going
> > > > > > to be something that DCE will do.
> > > > > 
> > > > > Well, I thought Prathamesh added the optimization to copy-propagate
> > > > > the lhs from the returned argument.  So we'd have both transforms
> > > > > here.
> > > > That seems like a mistake -- the fact that we can copy propagate the
> > > > LHS from the returned argument is interesting, but in practice I've
> > > > found it to not be useful to do so.
> > > > 
> > > > The problem is it makes the value look live across the call and we're
> > > > then dependent upon the register allocator to know the trick about
> > > > the returned argument value and apply it consistently -- which it
> > > > does not last I checked.
> > > > 
> > > > I think we're better off leaving the call in the form of LHS = call ()
> > > > if the return value is used.  That's going to be more palatable to
> > > > tail calling.
> > > 
> > > Yes, that's something I also raised earlier in the thread.  Note that
> > > any kind of value-numbering probably wants to know the equivalence
> > > for simplifications but in the end wants to disable propagating the
> > > copy (in fact it should propagate the return value from the point of
> > > the call).  I suppose I know how to implement that in FRE/PRE given it has
> > > separate value-numbering and elimination phases.  Something for GCC 8.
> > 
> > The following does that (it shows we don't handle separating LHS
> > and overall stmt effect very well).  It optimizes a testcase like
> > 
> > void *foo (void *p, int c, __SIZE_TYPE__ n)
> > {
> >   void *q = __builtin_memset (p, c, n);
> >   if (q == p)
> >     return p;
> >   return q;
> > }
> > 
> > to
> > 
> > foo (void * p, int c, long unsigned int n)
> > {
> >   void * q;
> > 
> >   <bb 2> [0.0%]:
> >   q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D));
> >   return q_7;
> > 
> > in early FRE.
> Yea.  Not sure how often something like that would happen in practice, but
> using the equivalence to simplify rather than for propagation seems like the
> way to go.
> 
> I keep thinking about doing something similar in DOM, but haven't gotten
> around to seeing what the fallout would be.

Shouldn't be too bad (it would require keeping an additional
what-to-substitute-for-value-X lattice during the DOM walk).  But it
will still require some "magic" to decide about those conditional
equivalences... (I think).

Separating "values" from what we substitute during elimination is a good
thing in general, so we can be more aggressive with the value parts.
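
To make the split concrete, here is a minimal sketch in C with two tables:
one for what each name's value is, and one for what gets substituted for that
value.  Every identifier in it is hypothetical and chosen for illustration;
none of this is GCC's actual FRE/PRE or DOM data structure.

```c
#include <assert.h>

/* Toy SSA names; every identifier here is hypothetical, not GCC's.  */
enum { NAME_P, NAME_Q, NUM_NAMES };

/* Lattice 1: the value number of each name (used to simplify).  */
static int value_of[NUM_NAMES];
/* Lattice 2: the name to substitute for each value (used to eliminate).  */
static int leader_of[NUM_NAMES];

/* Start with every name being its own value and its own leader.  */
static void
init_lattices (void)
{
  for (int i = 0; i < NUM_NAMES; i++)
    value_of[i] = leader_of[i] = i;
}

/* Record LHS = call (ARG, ...) where the call is known to return ARG.
   The two names get the same value, but the call's LHS becomes the
   name we substitute, so ARG is not kept live across the call.  */
static void
record_returns_arg (int lhs, int arg)
{
  assert (lhs >= 0 && lhs < NUM_NAMES && arg >= 0 && arg < NUM_NAMES);
  value_of[lhs] = value_of[arg];
  leader_of[value_of[arg]] = lhs;
}

/* Simplification query: do A and B have the same value?  */
static int
same_value (int a, int b)
{
  return value_of[a] == value_of[b];
}

/* Elimination query: which name should replace NAME?  */
static int
name_to_substitute (int name)
{
  return leader_of[value_of[name]];
}
```

After init_lattices () and record_returns_arg (NAME_Q, NAME_P), the q == p
test folds because same_value () is true, while name_to_substitute ()
answers q for both names, which matches the "return q_7" result quoted above.
The sketch ignores dominance: in a real pass the substitution of q for p
would only be valid at points dominated by the call.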
