On Fri, May 21, 2010 at 10:29 PM, Xinliang David Li <davi...@google.com> wrote:
> On Fri, May 21, 2010 at 10:35 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Fri, May 21, 2010 at 7:30 PM, Xinliang David Li <davi...@google.com>
>> wrote:
>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>> <richard.guent...@gmail.com> wrote:
>>>> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davi...@google.com>
>>>> wrote:
>>>>> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb....@gmail.com>
>>>>> wrote:
>>>>>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davi...@google.com>
>>>>>> wrote:
>>>>>>> stack variable overlay and stack slot assignments are here too.
>>>>>>
>>>>>> Yes, and for these I would like to add a separate timevar. Agree?
>>>>>
>>>>> Yes. (By the way, we are rewriting this pass to eliminate the code
>>>>> motion/aliasing problem -- but that is a different topic).
>>>>
>>>> Btw, we want to address the same problem by representing the
>>>> points where (big) variables go out of scope in the IL, also to
>>>> help DSE. The original idea was to simply drop in an aggregate
>>>> assignment from an undefined value at the end of the scope
>>>> during lowering, like
>>>>
>>>>   var = {undefined};
>>>>
>>>
>>> This looks like a very interesting approach. Do you see any downside
>>> of this approach? What is the problem of handling (nullifying) the
>>> dummy statement in the expansion pass?
>>
>> That is what I'd have done initially. I could in theory see RTL
>> code motion optimizations move stuff in an invalid way after that
>> (but we try to avoid this by properly sharing TBAA compatible
>> slots only and fixing up points-to information as well).
>>
>> So in the end it'll probably just work dropping the assignments
>> on the floor during expansion to RTL.
>>
>>> The approach we took is different --- we move this overlay/packing
>>> earlier (after ipa-inlining).
>>> One of the other motivations for doing
>>> this is the limitation in the current implementation that leaves
>>> out many overlaying opportunities (e.g. structs with union members can
>>> not share slots etc), but this is probably an independent issue.
>>
>> Yes, one earlier idea would have unified stack slots at gimple lowering
>> time. I'm not sure that after ipa-inlining is early enough (probably
>> it is due to the lack of code motion optimizations).
>
> Yes -- doing it after inlining is important as inlined calls are the
> major contributors of the stack sharing opportunities.
>
>>
>> With the extra assignments I was also hoping to help analysis phases
>> to note that for example in
>>
>>   {
>>     int a[10];
>>     foo (a);
>>   }
>>   bar ();
>>
>> a is not live over the call to bar as it can't validly escape out of
>> its scope.
>
> Yes, this will help expose opportunities for dead store elimination
> due to the anticipation of the dummy store, which would otherwise be
> missed by DCE due to the false use from bar. However, to make better
> use of scope information for aliasing, flow sensitive analysis is
> needed -- consider the following case:
>
>   int *gp;
>   int foo(..)
>   {
>     int local;
>
>     local = 1;     // (1)
>     *gp = ...      // (2)
>
>     bar (&local);  // (3)
>
>     ..
>   }
>
> (1) and (2) are not aliased. More generally, for loops: a local
> variable in loop body scope does not live across iterations, so
> address escaping does not 'propagate' via the backedge:
>
>   for ( i = ...; i <..; i++)
>   {
>     int local[100];
>
>     for (j = ...)
>     {
>       local[j] = ...     // (1)
>       ... = *global_p;   // (2)
>     }
>     bar (local);
>   }
>
> (1) and (2) are not aliased.

Sure - but that's an orthogonal issue.  I merely hope for some DSE
opportunities.

Richard.
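
For reference, the block-scope and loop examples quoted above can be
written out as complete C along the following lines. This is only a
sketch: the bodies of foo() and bar(), the helper use(), the counters
sink and bar_calls, and the names scoped_example() and loop_example()
are hypothetical stand-ins that do not appear in the thread. The extra
store a[9] = 7 is likewise added purely to show the kind of store that
could be treated as dead once the end of a's scope is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the thread's foo(), bar() and global_p. */
int sink;                       /* records what the callees observed */
int bar_calls;                  /* number of calls to bar() */
static int global_val = 5;
int *global_p = &global_val;

static void foo(int *a) { a[0] = 1; sink += a[0]; }
static void bar(void) { bar_calls++; }
static void use(const int *a) { sink += a[0]; }

/* Block-scope case: a's lifetime ends at the inner closing brace. */
int scoped_example(void)
{
    {
        int a[10];
        foo(a);         /* a's address escapes into foo() */
        a[9] = 7;       /* never read again: a candidate dead store */
    }                   /* conceptually: a = {undefined}; */
    bar();              /* cannot validly access a */
    return sink;
}

/* Loop case: each iteration gets a fresh local[]. */
int loop_example(int n)
{
    int sum = 0;
    assert(global_p != NULL);
    for (int i = 0; i < n; i++) {
        int local[100];
        for (int j = 0; j < 100; j++) {
            local[j] = i + j;   /* (1) */
            sum += *global_p;   /* (2) does not refer to local[] here */
        }
        use(local);     /* address escapes only after the inner loop */
    }
    return sum;
}
```

Without scope information, the call to bar() in scoped_example() looks
like a possible use of a, because a's address was passed to foo()
earlier; that is the false use mentioned above.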