> I don't think we handle
>
>   mem = foo ();

Hmm, we can't.  Consider:

struct val {int a,b;};
[[gnu::noinline]]
struct val
dup (int *a)
{
  a[0] = a[1];
  struct val ret;
  ret.a = a[2];
  ret.b = a[3];
  return ret;
}
int
test (int *b)
{
  struct val ret = dup (b);
  struct val ret2 = dup (b);
  return ret.a + ret.b + ret2.a + ret2.b;
}

Also for

struct val {int a,b;} v, v1, v2;
void
test ()
{
  v1 = v;
  v2 = v;
  if (v1.a != v2.a)
    __builtin_abort ();
}

we still have in the optimized dump:

void test ()
{
  int _1;
  int _2;

  <bb 2> [local count: 1073741824]:
  v1 = v;
  v2 = v;
  _1 = v1.a;
  _2 = v2.a;
  if (_1 != _2)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_abort ();

  <bb 4> [local count: 1073741824]:
  return;
}

We eventually get rid of the abort in combine, but that is truly late.
I think it only gets optimized because val is a small structure; for a
bigger structure we would resort to memcpy and the RTL optimizers would
give up too.

I thought VN was able to handle this by assigning value numbers to
store destinations, but I see that only happens because most stores
have a VN for their RHS.

> correctly in VN, the && lhs condition also guards this.  Maybe
> instead refactor this and the check a few lines above to check
> (!lhs || TREE_CODE (lhs) == SSA_NAME)

OK, I thought this would be way too easy :)  So we need an SSA lhs, and
we also need to check that the memory defined by the first call is not
modified in between.
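To make that condition concrete, here is a minimal (hypothetical)
variant of the first testcase above where the second call must survive,
because memory the calls read is clobbered in between:

struct val {int a,b;};
[[gnu::noinline]] struct val dup (int *a);  /* as defined above */

int
test2 (int *b)
{
  struct val ret = dup (b);
  b[2] = 123;                   /* clobbers memory dup () reads */
  struct val ret2 = dup (b);    /* not redundant: ret2.a may differ */
  return ret.a + ret2.a;
}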
>
> ? The VUSE->VDEF chain walking also doesn't consider the call
> having memory side-effects since it effectively skips intermittent
> uses.  So I believe you have to adjust (or guard) that as well,
> alternatively visit all uses for each def walked.
>
> >           && gimple_vuse (stmt)
> >           && (((summary = get_modref_function_summary (stmt, NULL))
> >                && !summary->global_memory_read
> > @@ -6354,19 +6352,18 @@ visit_stmt (gimple *stmt, bool backedges_varying_p = false)
> >
> >        /* Pick up flags from a devirtualization target.  */
> >        tree fn = gimple_call_fn (stmt);
> > -      int extra_fnflags = 0;
> >        if (fn && TREE_CODE (fn) == SSA_NAME)
> >          {
> >            fn = SSA_VAL (fn);
> >            if (TREE_CODE (fn) == ADDR_EXPR
> >                && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL)
> > -            extra_fnflags = flags_from_decl_or_type (TREE_OPERAND (fn, 0));
> > +            fn = TREE_OPERAND (fn, 0);
> > +          else
> > +            fn = NULL;
> >          }
> > -      if ((/* Calls to the same function with the same vuse
> > -             and the same operands do not necessarily return the same
> > -             value, unless they're pure or const.  */
> > -           ((gimple_call_flags (call_stmt) | extra_fnflags)
> > -            & (ECF_PURE | ECF_CONST))
> > +      else
> > +        fn = NULL;
> > +      if ((ipa_modref_can_remove_redundat_calls (call_stmt, fn)
> >            /* If calls have a vdef, subsequent calls won't have
> >               the same incoming vuse.  So, if 2 calls with vdef have the
> >               same vuse, we know they're not subsequent.
>
> With disregarding VDEF this comment is now wrong (it's directed at
> tail-merging btw).
>
> I'll note that elimination will only be able to "DCE" calls with a
> LHS since "DCE" happens by replacing the RHS.  That's also what the
> && lhs check is about - we don't do anything useful during elimination
> when there's no LHS but the call itself is present in the expression
> hash.

It would be nice to handle this, since a lot of code in C "returns"
aggregates via an output parameter.

Note that Jens proposed to file a defect report against C23 to allow
removal of reproducible/unsequenced calls, which would make it possible
to DCE/DSE them as well.  This looks like a good idea to me.

Honza
>
> Richard.
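PS: for reference, a hypothetical sketch of what Jens' C23 attributes
would enable, assuming the proposed clarification lets such calls be
removed (get_val and its signature are made up for illustration):

struct val {int a,b;};

/* C23 [[unsequenced]]: roughly, the result and the stores depend only
   on the arguments and objects reachable through them.  */
[[unsequenced]] void get_val (const int *src, struct val *out);

int
test3 (const int *src)
{
  struct val v1, v2;
  get_val (src, &v1);
  get_val (src, &v2);  /* with the DR resolved, this call could be
                          removed and v2 replaced by reusing v1 */
  return v1.a + v2.b;
}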