On Mon, 30 Jan 2012, Richard Guenther wrote:

> On Fri, 27 Jan 2012, Richard Guenther wrote:
> 
> > 
> > This fixes PR51528 by hiding the issue that SRA generates
> > copy in/out with a type not suitable for preserving the data
> > (any non-mode-precision thing).  It fixes it by instead of
> > emitting
> > 
> >   x$i_8 = x.i;
> >   D.1720 = x;
> >   D.1720.i = x$i_8;
> > 
> > for an assign to an unscalarized D.1720 from a partly scalarized x
> > (so the aggregate assignment prevails) emit
> > 
> >   x.i = x$i_8;
> >   D.1720 = x;
> > 
> > instead.  This allows the unfortunate copy in/out to be optimized
> > away (and thus its undefined behavior).
> > 
> > Basically whenever the aggregate copy will prevail prefer to
> > restore the RHS and reload the LHS components - just like we
> > do for aggregate uses/destinations of call statements.
> > 
> > I think we can still clean up the aggregate copy path a lot,
> > but that's for 4.8 (unless new bugs pop up).
> > 
> > Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> 
> So it turns out this wasn't that easy and I botched up some
> of the intricate execution flow in sra_modify_assign.  As a
> preparation for a smaller fix the following patch refactors
> this function to be easier to understand and hopefully make
> the real fix obvious in what it does.

And this is the "real" fix - well, avoid the copy in/out we
currently generate which we are not able to optimize and instead
emit sth we are able to remove the redundant (and harmful) stores.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-01-30  Richard Guenther  <rguent...@suse.de>

        PR tree-optimization/51528
        * tree-sra.c (sra_modify_assign): Avoid copy-in/out for aggregate
        assigns.

        * gcc.dg/torture/pr51528.c: New testcase.

Index: gcc/tree-sra.c
===================================================================
*** gcc/tree-sra.c      (revision 183720)
--- gcc/tree-sra.c      (working copy)
*************** sra_modify_assign (gimple *stmt, gimple_
*** 3135,3143 ****
              sra_stats.deleted++;
              return SRA_AM_REMOVED;
            }
          if (access_has_children_p (racc))
!           generate_subtree_copies (racc->first_child, lhs, racc->offset,
!                                    0, 0, gsi, false, true, loc);
          if (access_has_children_p (lacc))
            generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
                                     0, 0, gsi, true, true, loc);
--- 3135,3148 ----
              sra_stats.deleted++;
              return SRA_AM_REMOVED;
            }
+         /* Restore the aggregate RHS from its components so the
+            prevailing aggregate copy does the right thing.  */
          if (access_has_children_p (racc))
!           generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
!                                    gsi, false, false, loc);
!         /* Re-load the components of the aggregate copy destination.
!            But use the RHS aggregate to load from to expose more
!            optimization opportunities.  */
          if (access_has_children_p (lacc))
            generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
                                     0, 0, gsi, true, true, loc);
Index: gcc/testsuite/gcc.dg/torture/pr51528.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr51528.c      (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr51528.c      (revision 0)
***************
*** 0 ****
--- 1,46 ----
+ /* { dg-do run } */
+ /* { dg-options "-fno-early-inlining" } */
+ 
+ extern void abort (void);
+ 
+ union U
+ {
+   int i;
+   _Bool b;
+ };
+ 
+ _Bool gb;
+ 
+ void  __attribute__ ((noinline))
+ use_bool (union U u)
+ {
+   gb = u.b;
+ }
+ 
+ union U
+ bar (void)
+ {
+   union U u;
+   u.i = 0xFFFE;
+   return u;
+ }
+ 
+ union U  __attribute__ ((noinline))
+ foo (void)
+ {
+   union U u,v;
+ 
+   u.b = 1;
+   use_bool (u);
+   u = bar ();
+ 
+   return u;
+ }
+ 
+ int main (int argc, char **argv)
+ {
+   union U u = foo ();
+   if (u.i != 0xFFFE)
+     abort ();
+   return 0;
+ }

Reply via email to