On Tue, 6 Dec 2016, Richard Biener wrote: > On Mon, 5 Dec 2016, Jeff Law wrote: > > > On 12/02/2016 01:33 AM, Richard Biener wrote: > > > > The LHS on the assignment makes it easier to identify when a tail call > > > > is > > > > possible. It's not needed for correctness. Not having the LHS on the > > > > assignment just means we won't get an optimized tail call. > > > > > > > > Under what circumstances would the LHS possibly be removed? We know the > > > > return statement references the LHS, so it's not going to be something > > > > that > > > > DCE will do. > > > > > > Well, I thought Prathamesh added the optimization to copy-propagate > > > the lhs from the returned argument. So we'd have both transforms here. > > That seems like a mistake -- the fact that we can copy propagate the LHS > > from > > the returned argument is interesting, but in practice I've found it to not > > be > > useful to do so. > > > > The problem is it makes the value look live across a the call and we're then > > dependent upon the register allocator to know the trick about the returned > > argument value and apply it consistently -- which it does not last I > > checked. > > > > I think we're better off leaving the call in the form of LHS = call () if > > the > > return value is used. That's going to be more palatable to tail calling. > > Yes, that's something I also raised earlier in the thread. Note that > any kind of value-numbering probably wants to know the equivalence > for simplifications but in the end wants to disable propagating the > copy (in fact it should propagate the return value from the point of > the call). I suppose I know how to implement that in FRE/PRE given it has > separate value-numbering and elimination phases. Something for GCC 8.
The following does that (it shows we don't handle separating LHS and overall stmt effect very well). It optimizes a testcase like void *foo (void *p, int c, __SIZE_TYPE__ n) { void *q = __builtin_memset (p, c, n); if (q == p) return p; return q; } to foo (void * p, int c, long unsigned int n) { void * q; <bb 2> [0.0%]: q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D)); return q_7; in early FRE. Index: gcc/tree-ssa-pre.c =================================================================== --- gcc/tree-ssa-pre.c (revision 243282) +++ gcc/tree-ssa-pre.c (working copy) @@ -4065,10 +4068,11 @@ eliminate_avail (tree op) tree valnum = VN_INFO (op)->valnum; if (TREE_CODE (valnum) == SSA_NAME) { + if (el_avail.length () > SSA_NAME_VERSION (valnum) + && el_avail[SSA_NAME_VERSION (valnum)]) + return el_avail[SSA_NAME_VERSION (valnum)]; if (SSA_NAME_IS_DEFAULT_DEF (valnum)) return valnum; - if (el_avail.length () > SSA_NAME_VERSION (valnum)) - return el_avail[SSA_NAME_VERSION (valnum)]; } else if (is_gimple_min_invariant (valnum)) return valnum; @@ -4263,6 +4275,14 @@ eliminate_dom_walker::before_dom_childre eliminate_push_avail (sprime); } + /* While we might have value numbered a call return value to + one of its arguments the call itself might not be solely + represented by its return value. Thus do not ignore + side-effects indicated by a VARYING vdef. */ + if (gimple_vdef (stmt) + && VN_INFO (gimple_vdef (stmt))->valnum == gimple_vdef (stmt)) + sprime = NULL_TREE; + /* If this now constitutes a copy duplicate points-to and range info appropriately. This is especially important for inserted code. See tree-ssa-copy.c Index: gcc/tree-ssa-sccvn.c =================================================================== --- gcc/tree-ssa-sccvn.c (revision 243282) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -3951,6 +4061,25 @@ visit_use (tree use) changed = defs_to_varying (call_stmt); goto done; } + + int rflags = gimple_call_return_flags (call_stmt); + if (rflags & ERF_RETURNS_ARG) + { + unsigned argnum = rflags & ERF_RETURN_ARG_MASK; + if (argnum < gimple_call_num_args (call_stmt)) + { + tree arg = gimple_call_arg (call_stmt, argnum); + if (TREE_CODE (arg) == SSA_NAME + || is_gimple_min_invariant (arg)) + { + changed = visit_copy (lhs, arg); + if (gimple_vdef (call_stmt)) + changed |= set_ssa_val_to (gimple_vdef (call_stmt), + gimple_vdef (call_stmt)); + goto done; + } + } + } } /* Pick up flags from a devirtualization target. */ > > > Of course as always the user could have written the code in this way. > > > > > > If the LHS is not required for correctness then I don't think we need > > > to put it there - Pratamesh verified the call is tail-called already > > > if marked by the tailcall pass, even if the LHS is not present. > > But if the function returns the value from the tail call, then going through > > an LHS is the right thing to do. Using the magic "argX will be the return > > value" seems clever, but actually hurts in practice. > > So we do want the reverse transform (for the case of returning by > reference that's going to be tricky if not impossible due to the > IL hygiene we enforce). > > Richard. > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)