> I don't think we handle
> 
>  mem = foo ();

Hmm, indeed we can't handle that.  Consider:
struct val {int a,b;};

[[gnu::noinline]]
struct val dup(int *a)
{
        a[0] = a[1];
        struct val ret;
        ret.a = a[2];
        ret.b = a[3];
        return ret;
}
int
test (int *b)
{
        struct val ret = dup (b);
        struct val ret2 = dup (b);
        return ret.a + ret.b + ret2.a + ret2.b;
}

Also, for

struct val {int a,b;} v,v1,v2;

void
test ()
{
  v1 = v;
  v2 = v;
  if (v1.a != v2.a)
          __builtin_abort ();
}

we still have in the optimized dump:

void test ()
{
  int _1;
  int _2;

  <bb 2> [local count: 1073741824]:
  v1 = v;
  v2 = v;
  _1 = v1.a;
  _2 = v2.a;
  if (_1 != _2)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_abort ();

  <bb 4> [local count: 1073741824]:
  return;

}

We eventually get rid of the abort in combine, but that is truly late.
I think it only gets optimized because val is a small structure; for a
bigger structure we would resort to memcpy and the RTL optimizers would
give up, too.

I thought VN was able to handle this by assigning value numbers to store
destinations, but I see that only happens because most stores have a VN
for their RHS.

> correctly in VN, the && lhs condition also guards this.  Maybe
> instead refactor this and the check a few lines above to check
> (!lhs || TREE_CODE (lhs) == SSA_NAME)

OK, I thought this would be way too easy :)
So we need an SSA lhs, and we need to check that the memory defined by
the first call is not modified in between.
> 
> ?  The VUSE->VDEF chain walking also doesn't consider the call
> having memory side-effects since it effectively skips intermittent
> uses.  So I believe you have to adjust (or guard) that as well,
> alternatively visit all uses for each def walked.
> 
> >        && gimple_vuse (stmt)
> >        && (((summary = get_modref_function_summary (stmt, NULL))
> >        && !summary->global_memory_read
> > @@ -6354,19 +6352,18 @@ visit_stmt (gimple *stmt, bool backedges_varying_p 
> > = false)
> >  
> >        /* Pick up flags from a devirtualization target.  */
> >        tree fn = gimple_call_fn (stmt);
> > -      int extra_fnflags = 0;
> >        if (fn && TREE_CODE (fn) == SSA_NAME)
> >     {
> >       fn = SSA_VAL (fn);
> >       if (TREE_CODE (fn) == ADDR_EXPR
> >           && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL)
> > -       extra_fnflags = flags_from_decl_or_type (TREE_OPERAND (fn, 0));
> > +       fn = TREE_OPERAND (fn, 0);
> > +     else
> > +       fn = NULL;
> >     }
> > -      if ((/* Calls to the same function with the same vuse
> > -         and the same operands do not necessarily return the same
> > -         value, unless they're pure or const.  */
> > -      ((gimple_call_flags (call_stmt) | extra_fnflags)
> > -       & (ECF_PURE | ECF_CONST))
> > +      else
> > +   fn = NULL;
> > +      if ((ipa_modref_can_remove_redundat_calls (call_stmt, fn)
> >        /* If calls have a vdef, subsequent calls won't have
> >           the same incoming vuse.  So, if 2 calls with vdef have the
> >           same vuse, we know they're not subsequent.
> 
> With disregarding VDEF this comment is now wrong (it's directed at
> tail-merging btw).
> 
> I'll note that elimination will only be able to "DCE" calls with a
> LHS since "DCE" happens by replacing the RHS.  That's also what the
> && lhs check is about - we don't do anything useful during elimination
> when there's no LHS but the call itself is present in the expression
> hash.

It would be nice to handle this, since a lot of C code "returns"
aggregates via an output parameter.

Note that Jens proposed filing a defect report against C23 to allow
removal of reproducible/unsequenced calls, which would make it possible
to DCE/DSE them as well.  This looks like a good idea to me.

Honza
> 
> Richard.
