On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <tom_devr...@mentor.com> wrote: > On 25/04/12 11:57, Richard Guenther wrote: > <SNIP> >>>> >> Hmm. I'm not sure we can conclude that they have the same value! >>>> >> >>>> >> +int bar (int); >>>> >> +void baz (int); >>>> >> +void bla (int); >>>> >> + >>>> >> +void >>>> >> +foo (int y) >>>> >> +{ >>>> >> + int a; >>>> >> + if (y == 6) >>>> >> + { >>>> >> + bla (5); >>>> >> + a = bar (7); >>>> >> + } >>>> >> + else >>>> >> + { >>>> >> + bla (5); >>>> >> + a = bar (7); >>>> >> + } >>>> >> + baz (a); >>>> >> +} >>>> >> >>>> >> I can implement bla as >>>> >> >>>> >> void bla(int) { a = random (); } >>>> >> >>>> >> thus a would not have the same value on both paths >>> > >>> > I think it's hard to decide whether they have the same value or not, >>> > since we >>> > cannot write a test-case that compares the results of calls from 2 >>> > branches of >>> > an if statement. >> void *foo () >> { >> return __builtin_return_address (0); >> } >> >> void *bar (_Bool b) >> { >> if (b) >> return foo (); >> else >> return foo (); >> } >> >> int main() >> { >> if (bar(true) == bar(false)) >> abort (); >> } >> >> ok ... outside of the scope of standard "C", but we certainly _can_ do this. >> Which would question tail-merging the above at all, of course. >> > > I added noinline to make sure there's no variation from inlining. > ... > extern void abort (void) __attribute__ ((__nothrow__)) __attribute__ > ((__noreturn__)); > > __attribute__((noinline)) > void *foo () > { > return __builtin_return_address (0); > } > > __attribute__((noinline)) > void *bar (int b) > { > if (b) > return foo (); > else > return foo (); > } > > int main() > { > void *a, *b; > a = bar (1); > b = bar (0); > if (a == b) > abort (); > return 0; > } > ... > > This test-case passes with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge > -fno-crossjumping > ... > > and fails with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge > -fcrossjumping > ... > > with the proposed patch, it also fails with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge > -fno-crossjumping > ... > > Tree-tail-merge performs the same transformation as cross-jumping for this > test-case. I think that the transformation is valid, and that the test-case > behaves as expected. Subsequent calls to foo may or may not return the same > value, depending on the optimizations, we can't assert a != b, that's just the > nature of __builtin_return_address. > > Btw, this example is not what I meant when I said 'a test-case that compares > the > results of calls from 2 branches of an if statement'. I tried to say that the > semantics of C is defined in terms of taking either 1 branch or the other, > rather than executing both in parallel, after which both results would be > available. From that point of view, this example compares subsequent calls to > foo, not parallel.
Ah, I see. Btw, I was not suggesting that we should not optimize the above. >>> > I started looking at the problem in terms of subsequent calls. Consider >>> > the >>> > imaginary attribute nosideeffect, similar to const and pure. >>> > >>> > const: >>> > - no effects except the return value >>> > - the return value depends only on the parameters >>> > >>> > pure: >>> > - no effects except the return value >>> > - the return value depends only on the parameters and/or global variables >>> > >>> > nosideeffect: >>> > - no effects except the return value >>> > >>> > The code fragment: >>> > ... >>> > extern int f (int) __attribute ((nosideeffect)); >>> > int g() >>> > { >>> > int a; >>> > int b; >>> > a = f (5); >>> > b = f (5); >>> > return a + b; >>> > } >>> > ... >>> > >>> > would result in a gimple sequence more or less like this: >>> > ... >>> > # VUSE <.MEMD.1712_4(D)> >>> > aD.1707_1 = fD.1704 (5); >>> > # VUSE <.MEMD.1712_4(D)> >>> > bD.1708_2 = fD.1704 (5); >>> > D.1710_3 = aD.1707_1 + bD.1708_2; >>> > # VUSE <.MEMD.1712_6> >>> > return D.1710_3; >>> > ... >>> > >>> > The value numbering modification I'm proposing would say that a and b >>> > have the >>> > same value, which is not true. I updated the patch to ensure in this case >>> > that a >>> > and b would be seen as different. >>> > Note that things won't break if the attribute is missing on a function, >>> > since >>> > that just means that the function would have a vdef, and there would be >>> > no 2 >>> > subsequent function calls with the same vuse. >>> > >>>> >> - but that is not what >>>> >> matters - because obviously we can still merge the blocks. >>> > >>> > Right. >>> > >>>> >> That is, >>>> >> we cannot be sure that the side-effects on memory of bla (5) and bla (5) >>>> >> are the same. But is that not what your value-numbering changes will >>>> >> assert? >>> > >>> > In the case of calls in different branches of an control flow statement, >>> > we can >>> > aggressively assume they're the same, since we cannot write a check that >>> > would >>> > start failing. >> Apart from the above ... which shows that even the returned values can >> matter. >> >>> > In the case of subsequent calls with side effects, the vuses will never >>> > be the same. >>> > >>> > In the case of subsequent calls without side effects, but not pure or >>> > const, we >>> > can't assume they have the same result. AFAIK, there's currently no way to >>> > specify such a function, but the updated patch should be able handle this. >>> > >>>> >> (not sure how you achieve this, but it looks like you insert/lookup >>>> >> general >>>> >> calls, thus not pure/const ones >>> > >>> > Correct. >>> > >>>> >> - that could easily lead to miscompiles(?)). >>>> >> >> And to increased compile-time (not sure if it matters). >> >>> > I have bootstrapped and reg-tested the updated (and the previous) patch on >>> > x86_64 (ada inclusive), and found no issues. >>> > >>> > Can you think of a test-case that fails with the previous or the updated >>> > patch? >> I see you do not handle >> > > I assume you mean 'not handle' in the sense of 'handle conservatively'. Yes. >> struct S { int i; }; >> struct S foo (void); >> struct S bar (void) >> { >> struct S s1, s2; >> if (...) >> s = foo (); >> else >> s = foo (); >> >> because the calls have a LHS that is not an SSA name. > > Indeed, the gvn patch handles this example conservatively, and tree-tail-merge > fails to optimize this test-case: > ... > struct S { int i; }; > extern struct S foo (void); > extern int foo2 (void); > struct S s; > int bar (int c) { > int r; > if (c) > { > s = foo (); > r = foo2 (); > } > else > { > s = foo (); > r = foo2 (); > } > return r; > } > ... > > A todo. > >> In general >> value-numbering virtual operands will change what further consumers lookup. >> So changing >> >> # MEM_2 = VUSE <MEM_3> >> foo (); >> # VUSE <MEM_2> >> ... = x; >> >> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use >> MEM_3, not considering a clobber by foo. > > AFAIU, the patch doesn't do this type of value numbering. > >> This might not actually be an >> issue (after all we do this for stores, too). >> >> I see you do not handle >> >> int bar (int); >> void baz (int); >> void bla (int); >> int s; >> void >> foo (int y) >> { >> int a; >> if (y == 6) >> { >> s = 1; >> bla (5); >> a = bar (7); >> } >> else >> { >> s = 1; >> bla (5); >> a = bar (7); >> } >> baz (a); >> } >> > > I think a more basic example is this one from PR52009, which is basically > about > stores: > ... > int z; > > void > foo (int y) > { > int a; > if (y == 6) > z = 5; > else > z = 5; > } > ... > > I proposed a patch for that separately: > http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the > patch now, I suppose I could use part of that patch as infrastructure for the > todo above. Yes, I think so. >> and I don't see how you need the new ref->vdef member > > I'm regarding the result of the ref as a compound <vdef, result>. Most of the > time I could (as I did in an earlier version of the patch) deduce vdef as > gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when > result is NULL_TREE. > >> - that is also >> not handled consistently (not in hashing or comparing, etc.). > > I handle the vdef as a result. Ok. Can you please rename ->vdef to ->result_vdef then? >> Can't you >> always use SSA_VAL (vuse)? > > I'm interested in storing the vdef, not the vuse. I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of the stmt defining vuse). No? >> >> Btw, what's >> >> + if (vdef) >> + VN_INFO (vdef)->use_processed = true; >> >> for? We arrive here by visiting the VDEF after all and visit_use does >> already >> set the use processed. Is it that we end up visiting the call twice because >> of the lhs SSA name definition and the virtual operand definition? > > Yes. > >> If so then >> visit_use should instead do >> >> FOR_EACH_SSA_DEF_OPERAND (...) >> VN_INFO (def)->use_processed = true; >> >> and defs_to_varying then no longer needs to do that. >> > > factored mark_use_processed out of defs_to_varying, calling mark_use_processed > at start of visit_use. Thanks. Which case hits SSA_NAME_DEF_STMT == NULL btw? The default-def for the virtual operand? >>> > Otherwise, ok for trunk? >> Let's give this one more iteration, but I guess the basic idea should work. >> > > bootstrapped and reg-tested on x86_64 (ada inclusive). > > Is this patch ok, or is the todo required? No, you can followup with that. Ok with s/vdef/result_vdef/ Thanks, Richard. > Thanks, > - Tom > > 2012-04-25 Tom de Vries <t...@codesourcery.com> > > PR tree-optimization/51879 > * tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field. > * tree-ssa-sccvn.c (mark_use_processed): New function, factored out > of ... > (defs_to_varying): ... here. > (visit_reference_op_call): Handle gimple_vdef. > Handle case that lhs is NULL_TREE. > (visit_use): Use mark_use_processed. Handle calls with side-effect > using visit_reference_op_call. > > gcc.dg/pr51879.c: New test. > gcc.dg/pr51879-2.c: Same. > gcc.dg/pr51879-3.c: Same. > gcc.dg/pr51879-4.c: Same. > gcc.dg/pr51879-6.c: Same.