http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082

             Bug #: 52082
           Summary: Memory loads not rematerialized
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, ra
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
                CC: vmaka...@gcc.gnu.org
            Target: x86_64-linux


On the following testcase at -O2 (distilled from genautomata.c):

struct S { unsigned long *s1; struct S *s2; };
int v1 __attribute__((visibility ("hidden")));
struct T
{
  int a, b, c;
} *v2 __attribute__((visibility ("hidden")));
struct S **v3 __attribute__((visibility ("hidden")));
struct S **v4 __attribute__((visibility ("hidden")));

int __attribute__((noinline, noclone))
foo (unsigned long *x, unsigned long *y, int z)
{
  int j, k, l;
  unsigned int i;
  struct S *m;

  for (j = 0; j < v1; j++)
    if (y[j])
      for (i = 0; i < 8 * sizeof (unsigned long); i++)
        if ((y[j] >> i) & 1)
          {
            k = j * 8 * sizeof (unsigned long) + i;
            if (k >= v2->c)
              break;
            for (m = (z ? v4 [k] : v3 [k]); m != ((void *)0); m = m->s2)
              {
                for (l = 0; l < v1; l++)
                  if ((x [l] & m->s1 [l]) != m->s1 [l] && m->s1 [l])
                    break;
                if (l >= v1)
                  return 0;
              }
          }
  return 1;
}

tree LIM moves the loads from v2/v3/v4 before the loop, but unfortunately the
register pressure is high and the pseudos holding the v3/v4 pointers don't get
a hard register and are immediately spilled to the stack.  I wonder whether
we couldn't instead just rematerialize them and put the original MEM loads back
into the loop (assuming they don't alias with anything on the way, but that must
be the case here, since LIM hoisted them in the first place; after all, this
loop doesn't have any MEM stores at all).
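
To make the request concrete, here is a rough source-level sketch of the two
shapes.  It is only an illustration, not what the RTL actually looks like:
tab_a/tab_b are hypothetical stand-ins for v3/v4, and the loop body is reduced
to counting list nodes.  count_hoisted mirrors what LIM produces (both pointers
loaded once and kept live across the loop), count_remat mirrors the suggested
rematerialization (the load from the global is repeated at the point of use, so
nothing has to stay live or be spilled).  An optimizing compiler is of course
free to turn either form into the other.

```c
#include <stddef.h>

struct S { unsigned long *s1; struct S *s2; };

/* Hypothetical stand-ins for v3/v4 from the testcase.  */
static struct S **tab_a;
static struct S **tab_b;

/* Shape after LIM: both table pointers are loaded once before the loop
   and are live across it, costing two registers or two stack slots.  */
size_t
count_hoisted (int z, int n)
{
  struct S **a = tab_a;		/* hoisted load */
  struct S **b = tab_b;		/* hoisted load */
  size_t total = 0;
  int k;
  struct S *m;

  for (k = 0; k < n; k++)
    for (m = z ? b[k] : a[k]; m != NULL; m = m->s2)
      total++;
  return total;
}

/* Shape after rematerialization: the global is re-read where it is
   used.  This is only safe because nothing in the loop stores to
   memory that could alias tab_a/tab_b.  */
size_t
count_remat (int z, int n)
{
  size_t total = 0;
  int k;
  struct S *m;

  for (k = 0; k < n; k++)
    for (m = z ? tab_b[k] : tab_a[k]; m != NULL; m = m->s2)
      total++;
  return total;
}
```

Both functions return the same value for the same inputs; the difference is
only in how long the table pointers have to stay live.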
