On 25/04/12 11:57, Richard Guenther wrote: <SNIP> >>> >> Hmm. I'm not sure we can conclude that they have the same value! >>> >> >>> >> +int bar (int); >>> >> +void baz (int); >>> >> +void bla (int); >>> >> + >>> >> +void >>> >> +foo (int y) >>> >> +{ >>> >> + int a; >>> >> + if (y == 6) >>> >> + { >>> >> + bla (5); >>> >> + a = bar (7); >>> >> + } >>> >> + else >>> >> + { >>> >> + bla (5); >>> >> + a = bar (7); >>> >> + } >>> >> + baz (a); >>> >> +} >>> >> >>> >> I can implement bla as >>> >> >>> >> void bla(int) { a = random (); } >>> >> >>> >> thus a would not have the same value on both paths >> > >> > I think it's hard to decide whether they have the same value or not, since >> > we >> > cannot write a test-case that compares the results of calls from 2 >> > branches of >> > an if statement. > void *foo () > { > return __builtin_return_address (0); > } > > void *bar (_Bool b) > { > if (b) > return foo (); > else > return foo (); > } > > int main() > { > if (bar(true) == bar(false)) > abort (); > } > > ok ... outside of the scope of standard "C", but we certainly _can_ do this. > Which would question tail-merging the above at all, of course. >
I added noinline to make sure there's no variation from inlining. ... extern void abort (void) __attribute__ ((__nothrow__)) __attribute__ ((__noreturn__)); __attribute__((noinline)) void *foo () { return __builtin_return_address (0); } __attribute__((noinline)) void *bar (int b) { if (b) return foo (); else return foo (); } int main() { void *a, *b; a = bar (1); b = bar (0); if (a == b) abort (); return 0; } ... This test-case passes with: ... gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping ... and fails with: ... gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping ... with the proposed patch, it also fails with: ... gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping ... Tree-tail-merge performs the same transformation as cross-jumping for this test-case. I think that the transformation is valid, and that the test-case behaves as expected. Subsequent calls to foo may or may not return the same value, depending on the optimizations, we can't assert a != b, that's just the nature of __builtin_return_address. Btw, this example is not what I meant when I said 'a test-case that compares the results of calls from 2 branches of an if statement'. I tried to say that the semantics of C is defined in terms of taking either 1 branch or the other, rather than executing both in parallel, after which both results would be available. From that point of view, this example compares subsequent calls to foo, not parallel. >> > I started looking at the problem in terms of subsequent calls. Consider the >> > imaginary attribute nosideeffect, similar to const and pure. >> > >> > const: >> > - no effects except the return value >> > - the return value depends only on the parameters >> > >> > pure: >> > - no effects except the return value >> > - the return value depends only on the parameters and/or global variables >> > >> > nosideeffect: >> > - no effects except the return value >> > >> > The code fragment: >> > ... >> > extern int f (int) __attribute ((nosideeffect)); >> > int g() >> > { >> > int a; >> > int b; >> > a = f (5); >> > b = f (5); >> > return a + b; >> > } >> > ... >> > >> > would result in a gimple sequence more or less like this: >> > ... >> > # VUSE <.MEMD.1712_4(D)> >> > aD.1707_1 = fD.1704 (5); >> > # VUSE <.MEMD.1712_4(D)> >> > bD.1708_2 = fD.1704 (5); >> > D.1710_3 = aD.1707_1 + bD.1708_2; >> > # VUSE <.MEMD.1712_6> >> > return D.1710_3; >> > ... >> > >> > The value numbering modification I'm proposing would say that a and b have >> > the >> > same value, which is not true. I updated the patch to ensure in this case >> > that a >> > and b would be seen as different. >> > Note that things won't break if the attribute is missing on a function, >> > since >> > that just means that the function would have a vdef, and there would be no >> > 2 >> > subsequent function calls with the same vuse. >> > >>> >> - but that is not what >>> >> matters - because obviously we can still merge the blocks. >> > >> > Right. >> > >>> >> That is, >>> >> we cannot be sure that the side-effects on memory of bla (5) and bla (5) >>> >> are the same. But is that not what your value-numbering changes will >>> >> assert? >> > >> > In the case of calls in different branches of an control flow statement, >> > we can >> > aggressively assume they're the same, since we cannot write a check that >> > would >> > start failing. > Apart from the above ... which shows that even the returned values can matter. > >> > In the case of subsequent calls with side effects, the vuses will never be >> > the same. >> > >> > In the case of subsequent calls without side effects, but not pure or >> > const, we >> > can't assume they have the same result. AFAIK, there's currently no way to >> > specify such a function, but the updated patch should be able handle this. >> > >>> >> (not sure how you achieve this, but it looks like you insert/lookup >>> >> general >>> >> calls, thus not pure/const ones >> > >> > Correct. >> > >>> >> - that could easily lead to miscompiles(?)). >>> >> > And to increased compile-time (not sure if it matters). > >> > I have bootstrapped and reg-tested the updated (and the previous) patch on >> > x86_64 (ada inclusive), and found no issues. >> > >> > Can you think of a test-case that fails with the previous or the updated >> > patch? > I see you do not handle > I assume you mean 'not handle' in the sense of 'handle conservatively'. > struct S { int i; }; > struct S foo (void); > struct S bar (void) > { > struct S s1, s2; > if (...) > s = foo (); > else > s = foo (); > > because the calls have a LHS that is not an SSA name. Indeed, the gvn patch handles this example conservatively, and tree-tail-merge fails to optimize this test-case: ... struct S { int i; }; extern struct S foo (void); extern int foo2 (void); struct S s; int bar (int c) { int r; if (c) { s = foo (); r = foo2 (); } else { s = foo (); r = foo2 (); } return r; } ... A todo. > In general > value-numbering virtual operands will change what further consumers lookup. > So changing > > # MEM_2 = VUSE <MEM_3> > foo (); > # VUSE <MEM_2> > ... = x; > > by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use > MEM_3, not considering a clobber by foo. AFAIU, the patch doesn't do this type of value numbering. > This might not actually be an > issue (after all we do this for stores, too). > > I see you do not handle > > int bar (int); > void baz (int); > void bla (int); > int s; > void > foo (int y) > { > int a; > if (y == 6) > { > s = 1; > bla (5); > a = bar (7); > } > else > { > s = 1; > bla (5); > a = bar (7); > } > baz (a); > } > I think a more basic example is this one from PR52009, which is basically about stores: ... int z; void foo (int y) { int a; if (y == 6) z = 5; else z = 5; } ... I proposed a patch for that separately: http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the patch now, I suppose I could use part of that patch as infrastructure for the todo above. > and I don't see how you need the new ref->vdef member I'm regarding the result of the ref as a compound <vdef, result>. Most of the time I could (as I did in an earlier version of the patch) deduce vdef as gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when result is NULL_TREE. > - that is also > not handled consistently (not in hashing or comparing, etc.). I handle the vdef as a result. > Can't you > always use SSA_VAL (vuse)? I'm interested in storing the vdef, not the vuse. > > Btw, what's > > + if (vdef) > + VN_INFO (vdef)->use_processed = true; > > for? We arrive here by visiting the VDEF after all and visit_use does already > set the use processed. Is it that we end up visiting the call twice because > of the lhs SSA name definition and the virtual operand definition? Yes. > If so then > visit_use should instead do > > FOR_EACH_SSA_DEF_OPERAND (...) > VN_INFO (def)->use_processed = true; > > and defs_to_varying then no longer needs to do that. > factored mark_use_processed out of defs_to_varying, calling mark_use_processed at start of visit_use. >> > Otherwise, ok for trunk? > Let's give this one more iteration, but I guess the basic idea should work. > bootstrapped and reg-tested on x86_64 (ada inclusive). Is this patch ok, or is the todo required? Thanks, - Tom 2012-04-25 Tom de Vries <t...@codesourcery.com> PR tree-optimization/51879 * tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field. * tree-ssa-sccvn.c (mark_use_processed): New function, factored out of ... (defs_to_varying): ... here. (visit_reference_op_call): Handle gimple_vdef. Handle case that lhs is NULL_TREE. (visit_use): Use mark_use_processed. Handle calls with side-effect using visit_reference_op_call. gcc.dg/pr51879.c: New test. gcc.dg/pr51879-2.c: Same. gcc.dg/pr51879-3.c: Same. gcc.dg/pr51879-4.c: Same. gcc.dg/pr51879-6.c: Same.
Index: gcc/tree-ssa-sccvn.c =================================================================== --- gcc/tree-ssa-sccvn.c (revision 185028) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -2525,6 +2525,30 @@ set_ssa_val_to (tree from, tree to) return false; } +/* Mark as processed all the definitions in the defining stmt of USE, or + the USE itself. */ + +static void +mark_use_processed (tree use) +{ + ssa_op_iter iter; + def_operand_p defp; + gimple stmt = SSA_NAME_DEF_STMT (use); + + if (!stmt || gimple_code (stmt) == GIMPLE_PHI) + { + VN_INFO (use)->use_processed = true; + return; + } + + FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS) + { + tree def = DEF_FROM_PTR (defp); + + VN_INFO (def)->use_processed = true; + } +} + /* Set all definitions in STMT to value number to themselves. Return true if a value number changed. */ @@ -2538,8 +2562,6 @@ defs_to_varying (gimple stmt) FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS) { tree def = DEF_FROM_PTR (defp); - - VN_INFO (def)->use_processed = true; changed |= set_ssa_val_to (def, def); } return changed; @@ -2598,27 +2620,41 @@ visit_reference_op_call (tree lhs, gimpl { bool changed = false; struct vn_reference_s vr1; - tree result; + vn_reference_t vnresult = NULL; tree vuse = gimple_vuse (stmt); + tree vdef = gimple_vdef (stmt); vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE; vr1.operands = valueize_shared_reference_ops_from_call (stmt); vr1.type = gimple_expr_type (stmt); vr1.set = 0; vr1.hashcode = vn_reference_compute_hash (&vr1); - result = vn_reference_lookup_1 (&vr1, NULL); - if (result) + vn_reference_lookup_1 (&vr1, &vnresult); + + if (vnresult) { - changed = set_ssa_val_to (lhs, result); - if (TREE_CODE (result) == SSA_NAME - && VN_INFO (result)->has_constants) - VN_INFO (lhs)->has_constants = true; + if (vnresult->vdef) + changed |= set_ssa_val_to (vdef, vnresult->vdef); + + if (!vnresult->result && lhs) + vnresult->result = lhs; + + if (vnresult->result && lhs) + { + changed |= set_ssa_val_to (lhs, vnresult->result); + + if (VN_INFO (vnresult->result)->has_constants) + VN_INFO (lhs)->has_constants = true; + } } else { void **slot; vn_reference_t vr2; - changed = set_ssa_val_to (lhs, lhs); + if (vdef) + changed |= set_ssa_val_to (vdef, vdef); + if (lhs) + changed |= set_ssa_val_to (lhs, lhs); vr2 = (vn_reference_t) pool_alloc (current_info->references_pool); vr2->vuse = vr1.vuse; vr2->operands = valueize_refs (create_reference_ops_from_call (stmt)); @@ -2626,6 +2662,7 @@ visit_reference_op_call (tree lhs, gimpl vr2->set = vr1.set; vr2->hashcode = vr1.hashcode; vr2->result = lhs; + vr2->vdef = vdef; slot = htab_find_slot_with_hash (current_info->references, vr2, vr2->hashcode, INSERT); if (*slot) @@ -2795,7 +2832,6 @@ visit_reference_op_store (tree lhs, tree going to valueize the references in-place. */ if ((vdef = gimple_vdef (stmt))) { - VN_INFO (vdef)->use_processed = true; changed |= set_ssa_val_to (vdef, vdef); } @@ -2817,7 +2853,6 @@ visit_reference_op_store (tree lhs, tree def = gimple_vdef (stmt); use = gimple_vuse (stmt); - VN_INFO (def)->use_processed = true; changed |= set_ssa_val_to (def, SSA_VAL (use)); } @@ -3167,7 +3202,7 @@ visit_use (tree use) bool changed = false; gimple stmt = SSA_NAME_DEF_STMT (use); - VN_INFO (use)->use_processed = true; + mark_use_processed (use); gcc_assert (!SSA_NAME_IN_FREE_LIST (use)); if (dump_file && (dump_flags & TDF_DETAILS) @@ -3186,8 +3221,7 @@ visit_use (tree use) { if (gimple_code (stmt) == GIMPLE_PHI) changed = visit_phi (stmt); - else if (!gimple_has_lhs (stmt) - || gimple_has_volatile_ops (stmt)) + else if (gimple_has_volatile_ops (stmt)) changed = defs_to_varying (stmt); else if (is_gimple_assign (stmt)) { @@ -3349,34 +3383,44 @@ visit_use (tree use) /* ??? We could try to simplify calls. */ - if (stmt_has_constants (stmt) - && TREE_CODE (lhs) == SSA_NAME) - VN_INFO (lhs)->has_constants = true; - else if (TREE_CODE (lhs) == SSA_NAME) + if (lhs && TREE_CODE (lhs) == SSA_NAME) { - /* We reset expr and constantness here because we may - have been value numbering optimistically, and - iterating. They may become non-constant in this case, - even if they were optimistically constant. */ - VN_INFO (lhs)->has_constants = false; - VN_INFO (lhs)->expr = NULL_TREE; + if (stmt_has_constants (stmt)) + VN_INFO (lhs)->has_constants = true; + else + { + /* We reset expr and constantness here because we may + have been value numbering optimistically, and + iterating. They may become non-constant in this case, + even if they were optimistically constant. */ + VN_INFO (lhs)->has_constants = false; + VN_INFO (lhs)->expr = NULL_TREE; + } + + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + changed = defs_to_varying (stmt); + goto done; + } } - if (TREE_CODE (lhs) == SSA_NAME - && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) - changed = defs_to_varying (stmt); /* ??? We should handle stores from calls. */ - else if (TREE_CODE (lhs) == SSA_NAME) + if (!gimple_call_internal_p (stmt) + && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST) + /* If the call has side effects, subsequent calls won't have + the same incoming vuse, so it's save to assume + equality. */ + || gimple_has_side_effects (stmt)) + && ((lhs && TREE_CODE (lhs) == SSA_NAME) + || (!lhs && gimple_vdef (stmt)))) { - if (!gimple_call_internal_p (stmt) - && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)) - changed = visit_reference_op_call (lhs, stmt); - else - changed = defs_to_varying (stmt); + changed = visit_reference_op_call (lhs, stmt); } else changed = defs_to_varying (stmt); } + else + changed = defs_to_varying (stmt); } done: return changed; Index: gcc/tree-ssa-sccvn.h =================================================================== --- gcc/tree-ssa-sccvn.h (revision 185028) +++ gcc/tree-ssa-sccvn.h (working copy) @@ -110,6 +110,7 @@ typedef struct vn_reference_s tree type; VEC (vn_reference_op_s, heap) *operands; tree result; + tree vdef; } *vn_reference_t; typedef const struct vn_reference_s *const_vn_reference_t; Index: gcc/testsuite/gcc.dg/pr51879-2.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-2.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y) + baz (bar (7) + 6); + else + baz (bar (7) + 6); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { scan-tree-dump-times "baz \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-6.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-6.c (revision 0) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + + +int bar (int); +void baz (int); +void bla (int); + +void +foo (int y) +{ + int a; + if (y == 6) + { + bla (5); + a = bar (7); + } + else + { + bla (5); + a = bar (7); + } + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y == 6) + a = bar (7); + else + a = bar (7); + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-3.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-3.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y == 6) + a = bar (7) + 6; + else + a = bar (7) + 6; + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-4.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-4.c (revision 0) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +int foo (int y) +{ + int a, b; + a = bar (7) + 6; + b = bar (7) + 6; + return a + b; +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 2 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */