On Wed, 7 Dec 2016, Jeff Law wrote:

> On 12/06/2016 03:16 AM, Richard Biener wrote:
> > On Tue, 6 Dec 2016, Richard Biener wrote:
> >
> > > On Mon, 5 Dec 2016, Jeff Law wrote:
> > >
> > > > On 12/02/2016 01:33 AM, Richard Biener wrote:
> > > > > > The LHS on the assignment makes it easier to identify when a tail
> > > > > > call is possible.  It's not needed for correctness.  Not having
> > > > > > the LHS on the assignment just means we won't get an optimized
> > > > > > tail call.
> > > > > >
> > > > > > Under what circumstances would the LHS possibly be removed?  We
> > > > > > know the return statement references the LHS, so it's not going
> > > > > > to be something that DCE will do.
> > > > >
> > > > > Well, I thought Prathamesh added the optimization to copy-propagate
> > > > > the lhs from the returned argument.  So we'd have both transforms
> > > > > here.
> > > > That seems like a mistake -- the fact that we can copy propagate the
> > > > LHS from the returned argument is interesting, but in practice I've
> > > > found it to not be useful to do so.
> > > >
> > > > The problem is it makes the value look live across the call and
> > > > we're then dependent upon the register allocator to know the trick
> > > > about the returned argument value and apply it consistently -- which
> > > > it does not last I checked.
> > > >
> > > > I think we're better off leaving the call in the form of LHS = call ()
> > > > if the return value is used.  That's going to be more palatable to
> > > > tail calling.
> > >
> > > Yes, that's something I also raised earlier in the thread.  Note that
> > > any kind of value-numbering probably wants to know the equivalence
> > > for simplifications but in the end wants to disable propagating the
> > > copy (in fact it should propagate the return value from the point of
> > > the call).  I suppose I know how to implement that in FRE/PRE given it
> > > has separate value-numbering and elimination phases.  Something for
> > > GCC 8.
> >
> > The following does that (it shows we don't handle separating LHS
> > and overall stmt effect very well).  It optimizes a testcase like
> >
> > void *foo (void *p, int c, __SIZE_TYPE__ n)
> > {
> >   void *q = __builtin_memset (p, c, n);
> >   if (q == p)
> >     return p;
> >   return q;
> > }
> >
> > to
> >
> > foo (void * p, int c, long unsigned int n)
> > {
> >   void * q;
> >
> >   <bb 2> [0.0%]:
> >   q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D));
> >   return q_7;
> >
> > in early FRE.
> Yea.  Not sure how often something like that would happen in practice, but
> using the equivalence to simplify rather than for propagation seems like
> the way to go.
>
> I keep thinking about doing something similar in DOM, but haven't gotten
> around to seeing what the fallout would be.

Shouldn't be too bad (it would require keeping an additional what-to-substitute-for-value-X lattice during the DOM walk). But it will still require some "magic" to decide about those conditional equivalences... (I think). Separating "values" from what we substitute during elimination is a good thing in general, so we can be more aggressive with the value parts.
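
To make the two-lattice idea concrete, here is a rough toy sketch in C. It is not GCC code: the names `value_of`, `leader_of`, `record_returned_arg`, `same_value` and `substitute_for` are made up for illustration, and SSA names are modelled as small integers. One table records which value each name has (used for simplification, e.g. folding `q == p`), and a separate table records which name to substitute for that value during elimination.

```c
#include <stdbool.h>

/* Toy model, not GCC internals: SSA names are small integers.  */
enum { P, C, N, Q, NUM_NAMES };

/* Lattice 1: the value class each name belongs to.  */
static int value_of[NUM_NAMES];

/* Lattice 2: the name to substitute for each value class.  */
static int leader_of[NUM_NAMES];

static void init_lattices (void)
{
  for (int i = 0; i < NUM_NAMES; i++)
    {
      value_of[i] = i;   /* every name starts in its own value class */
      leader_of[i] = i;  /* and is substituted by itself */
    }
}

/* Record "lhs = call (..., returned_arg, ...)" for a callee known to
   return RETURNED_ARG.  The two names get the same value, but from the
   call onwards the preferred substitution is the call's LHS.  */
static void record_returned_arg (int lhs, int returned_arg)
{
  value_of[lhs] = value_of[returned_arg];
  leader_of[value_of[lhs]] = lhs;
}

/* Simplification query: are A and B known to be equal?  */
static bool same_value (int a, int b)
{
  return value_of[a] == value_of[b];
}

/* Elimination query: which name should replace a use of NAME?  */
static int substitute_for (int name)
{
  return leader_of[value_of[name]];
}
```

With `record_returned_arg (Q, P)` recorded for the memset call, `same_value (Q, P)` holds, so the `q == p` test can fold, while `substitute_for (P)` yields `Q`, so uses after the call refer to the call result rather than keeping `p` live across it. A real pass would also have to scope the substitution to statements dominated by the call, which this sketch leaves out.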