On Mon, Apr 18, 2011 at 5:58 AM, Michael Matz <m...@suse.de> wrote:
>
> Hi,
>
> [FWIW I can't approve patches, but some feedback nevertheless]
>
> On Sun, 17 Apr 2011, Easwaran Raman wrote:
>
> >  This patch impoves the heuristic used in assigning stack location to
> > stack variables.  Currently, if there are 3 variables A, B and C with
> > their sizes in increasing order and A and C have a conflict, it will
> > put A and B in a partition and C in a separate partition with a total
> > frame size of sizeof(B) + sizeof(C).  This patch puts B and C in the
> > same partition and A in a separate partition, with a total size of
> > sizeof(A) + sizeof(C).
>
> That's the change in stack_var_cmp, to match the comment, right?  Makes
> sense.
>
> > The other change in this patch removes the field offset in struct
> > stack_var. Right now, the field is always 0 due to the way it is set in
> > partition_stack_vars.
>
> Huh, indeed, starting with the initial version of that code already.  The
> idea clearly was to pack multiple conflicting small objects into the space
> of only one larger non-conflicting one by placing them at different
> offsets, but that never seems to have been implemented and would require
> different tracking of conflicts.  I think removing the offset tracking
> right now is okay, can be reintroduced once somebody gets motivated
> enough.

Yes, the intent seems to be what you describe above. I haven't thought
about it much, but I am not sure how much a non-backtracking version
will buy over the current heuristic.

>
> > -     and merge them into partition A.  */
> > -  for (last = i = b; i != EOC; last = i, i = stack_vars[i].next)
> > -    {
> > -      stack_vars[i].offset += offset;
> > -      stack_vars[i].representative = a;
> > -    }
> > -  stack_vars[last].next = stack_vars[a].next;
> > +   /* Add B to A's partition.  */
> > +  stack_vars[b].next = stack_vars[a].next;
> > +  stack_vars[b].representative = a;
>
> Hmm.  This seems fishy.  You don't update the representatives of the
> members of the partition that B is the leader of.  Additionally you break
> the chain of B's members.  That is only a problem if B actually has more
> than one member.  That might be always the case with your changed sorting
> order, but it's an important invariant, so please assert this in this
> routine.
>
B always has one member is an invariant in my scheme. The object
corresponding to the outer loop index is always the representative. If
B has to have more than one members, it must have been visited in the
outer loop in some earlier iteration. But then, its index in the
stack_vars_sorted must be greater than the current i. I have added an
assertion than stack_vars[b].next == EOC in union_stack_vars which is
true only when B has one member.

>
> > @@ -596,7 +581,7 @@
> >    if (vb->conflicts)
> >      {
> >        EXECUTE_IF_SET_IN_BITMAP (vb->conflicts, 0, u, bi)
> > -     add_stack_var_conflict (a, stack_vars[u].representative);
> > +     add_stack_var_conflict (a, u);
>
> Please don't.  This uselessly bloats the conflict bitmaps.

It is sufficient to add the conflicts of  a variable only when it is
not merged into some group.  I am adding a check to that effect.
I am attaching a new patch which excludes the strict aliasing check
(which I will send separately) and the changes I have mentioned above.
Bootstraps and no regressions in tests.

Thanks,
Easwaran


>
> Ciao,
> Michael.
Index: gcc/testsuite/gcc.dg/stack-layout-2.c
===================================================================
--- gcc/testsuite/gcc.dg/stack-layout-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/stack-layout-2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+void bar( char *);
+int foo()
+{
+  int i=0;
+  {
+    char a[8000];
+    bar(a);
+    i += a[0];
+  }
+  {
+    char a[8192];
+    char b[32];
+    bar(a);
+    i += a[0];
+    bar(b);
+    i += a[0];
+  }
+  return i;
+}
+/* { dg-final { scan-rtl-dump "size 8192" "expand" } } */
+/* { dg-final { scan-rtl-dump "size 32" "expand" } } */
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	(revision 171954)
+++ gcc/cfgexpand.c	(working copy)
@@ -158,11 +158,6 @@
   /* The Variable.  */
   tree decl;
 
-  /* The offset of the variable.  During partitioning, this is the
-     offset relative to the partition.  After partitioning, this
-     is relative to the stack frame.  */
-  HOST_WIDE_INT offset;
-
   /* Initially, the size of the variable.  Later, the size of the partition,
      if this variable becomes it's partition's representative.  */
   HOST_WIDE_INT size;
@@ -267,7 +262,6 @@
   v = &stack_vars[stack_vars_num];
 
   v->decl = decl;
-  v->offset = 0;
   v->size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
   /* Ensure that all variables have size, so that &a != &b for any two
      variables that are simultaneously live.  */
@@ -403,9 +397,9 @@
     return (int)largeb - (int)largea;
 
   /* Secondary compare on size, decreasing  */
-  if (sizea < sizeb)
-    return -1;
   if (sizea > sizeb)
+    return -1;
+  if (sizea < sizeb)
     return 1;
 
   /* Tertiary compare on true alignment, decreasing.  */
@@ -564,28 +558,19 @@
 
 /* A subroutine of partition_stack_vars.  The UNION portion of a UNION/FIND
    partitioning algorithm.  Partitions A and B are known to be non-conflicting.
-   Merge them into a single partition A.
+   Merge them into a single partition A.  */
 
-   At the same time, add OFFSET to all variables in partition B.  At the end
-   of the partitioning process we've have a nice block easy to lay out within
-   the stack frame.  */
-
 static void
-union_stack_vars (size_t a, size_t b, HOST_WIDE_INT offset)
+union_stack_vars (size_t a, size_t b)
 {
-  size_t i, last;
   struct stack_var *vb = &stack_vars[b];
   bitmap_iterator bi;
   unsigned u;
 
-  /* Update each element of partition B with the given offset,
-     and merge them into partition A.  */
-  for (last = i = b; i != EOC; last = i, i = stack_vars[i].next)
-    {
-      stack_vars[i].offset += offset;
-      stack_vars[i].representative = a;
-    }
-  stack_vars[last].next = stack_vars[a].next;
+  gcc_assert (stack_vars[b].next == EOC);
+   /* Add B to A's partition.  */
+  stack_vars[b].next = stack_vars[a].next;
+  stack_vars[b].representative = a;
   stack_vars[a].next = b;
 
   /* Update the required alignment of partition A to account for B.  */
@@ -596,7 +581,8 @@
   if (vb->conflicts)
     {
       EXECUTE_IF_SET_IN_BITMAP (vb->conflicts, 0, u, bi)
-	add_stack_var_conflict (a, stack_vars[u].representative);
+        if (stack_vars[u].next == EOC && stack_vars[u].representative == u)
+          add_stack_var_conflict (a, u);
       BITMAP_FREE (vb->conflicts);
     }
 }
@@ -605,16 +591,13 @@
    partitions constrained by the interference graph.  The overall
    algorithm used is as follows:
 
-	Sort the objects by size.
+	Sort the objects by size in descending order.
 	For each object A {
 	  S = size(A)
 	  O = 0
 	  loop {
 	    Look for the largest non-conflicting object B with size <= S.
 	    UNION (A, B)
-	    offset(B) = O
-	    O += size(B)
-	    S -= size(B)
 	  }
 	}
 */
@@ -636,24 +619,23 @@
   for (si = 0; si < n; ++si)
     {
       size_t i = stack_vars_sorted[si];
-      HOST_WIDE_INT isize = stack_vars[i].size;
       unsigned int ialign = stack_vars[i].alignb;
-      HOST_WIDE_INT offset = 0;
 
-      for (sj = si; sj-- > 0; )
+      /* Ignore objects that aren't partition representatives. If we
+         see a var that is not a partition representative, it must
+         have been merged earlier.  */
+      if (stack_vars[i].representative != i)
+        continue;
+
+      for (sj = si + 1; sj < n; ++sj)
 	{
 	  size_t j = stack_vars_sorted[sj];
-	  HOST_WIDE_INT jsize = stack_vars[j].size;
 	  unsigned int jalign = stack_vars[j].alignb;
 
 	  /* Ignore objects that aren't partition representatives.  */
 	  if (stack_vars[j].representative != j)
 	    continue;
 
-	  /* Ignore objects too large for the remaining space.  */
-	  if (isize < jsize)
-	    continue;
-
 	  /* Ignore conflicting objects.  */
 	  if (stack_var_conflict_p (i, j))
 	    continue;
@@ -664,25 +646,8 @@
 	      != (jalign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT))
 	    continue;
 
-	  /* Refine the remaining space check to include alignment.  */
-	  if (offset & (jalign - 1))
-	    {
-	      HOST_WIDE_INT toff = offset;
-	      toff += jalign - 1;
-	      toff &= -(HOST_WIDE_INT)jalign;
-	      if (isize - (toff - offset) < jsize)
-		continue;
-
-	      isize -= toff - offset;
-	      offset = toff;
-	    }
-
 	  /* UNION the objects, placing J at OFFSET.  */
-	  union_stack_vars (i, j, offset);
-
-	  isize -= jsize;
-	  if (isize == 0)
-	    break;
+	  union_stack_vars (i, j);
 	}
     }
 
@@ -712,9 +677,8 @@
 	{
 	  fputc ('\t', dump_file);
 	  print_generic_expr (dump_file, stack_vars[j].decl, dump_flags);
-	  fprintf (dump_file, ", offset " HOST_WIDE_INT_PRINT_DEC "\n",
-		   stack_vars[j].offset);
 	}
+      fputc ('\n', dump_file);
     }
 }
 
@@ -863,10 +827,9 @@
 	 partition.  */
       for (j = i; j != EOC; j = stack_vars[j].next)
 	{
-	  gcc_assert (stack_vars[j].offset <= stack_vars[i].size);
 	  expand_one_stack_var_at (stack_vars[j].decl,
 				   base, base_align,
-				   stack_vars[j].offset + offset);
+				   offset);
 	}
     }
 

Reply via email to