https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65930

            Bug ID: 65930
           Summary: Reduction with sign-change not handled
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Blocks: 53947
  Target Milestone: ---

Neither

int foo (unsigned int *x)
{
  int sum = 0;
  for (int i = 0; i < 4; ++i)
    sum += x[i*4+0]+ x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
  return sum;
}

nor

int bar (unsigned int *x)
{
  int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return sum;
}

are currently vectorized because

t.c:4:3: note: reduction: not commutative/associative: sum_27 = (int) _26;

though the sign of 'sum' vs the sign of 'x' doesn't really matter here.
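To illustrate why the sign should not matter, here is a minimal sketch comparing bar() with a variant that accumulates in unsigned int and converts once at the end. bar_unsigned and bar_variants_agree are hypothetical names, not anything GCC generates. The sketch assumes the unsigned-to-int conversion wraps modulo 2^N, which is implementation-defined in C and is what GCC documents.

```c
#include <assert.h>
#include <limits.h>

/* bar() as in the report: each iteration effectively computes
   sum = (int) ((unsigned int) sum + x[i]).  */
int bar (unsigned int *x)
{
  int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return sum;
}

/* Hypothetical rewrite: keep the accumulator unsigned (wrapping is
   well defined) and convert to int only once, after the loop.  */
int bar_unsigned (unsigned int *x)
{
  unsigned int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return (int) sum;
}

/* Hypothetical self-check: feed both variants large values so the
   unsigned sum wraps, and report whether the results agree.  */
int bar_variants_agree (void)
{
  unsigned int x[16];
  for (int i = 0; i < 16; ++i)
    x[i] = UINT_MAX - 3u * (unsigned int) i;
  int same = bar (x) == bar_unsigned (x);
  assert (same);
  return same;
}
```

With modulo conversion both loops compute the same value, so the per-iteration (int) conversion should not need to block treating the add as a reduction.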

It works for

unsigned baz (int *x)
{
  unsigned int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return sum;
}

because C converts x[i] to unsigned int instead of converting sum in this case.
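As a sketch of the difference, the loops of bar() and baz() can be written with the implicit conversions spelled out. bar_explicit, baz_explicit and explicit_forms_selfcheck are hypothetical names used only for this illustration, and the link to the dump line quoted earlier is my reading of it, not something the dump states.

```c
#include <assert.h>

/* bar(): the add happens in unsigned int and the result is converted
   back to int on every iteration.  That trailing (int) presumably
   corresponds to "sum_27 = (int) _26" in the vectorizer note.  */
int bar_explicit (unsigned int *x)
{
  int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum = (int) ((unsigned int) sum + x[i]);
  return sum;
}

/* baz(): only the loaded element is converted; the accumulator stays
   unsigned int, so no conversion sits inside the reduction chain.  */
unsigned baz_explicit (int *x)
{
  unsigned int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum = sum + (unsigned int) x[i];
  return sum;
}

/* Hypothetical self-check on small inputs: 1 + 2 + ... + 16 = 136.  */
int explicit_forms_selfcheck (void)
{
  unsigned int u[16];
  int s[16];
  for (int i = 0; i < 16; ++i)
    {
      u[i] = (unsigned int) i + 1u;
      s[i] = -(i + 1);
    }
  int ok = bar_explicit (u) == 136 && baz_explicit (s) == 0u - 136u;
  assert (ok);
  return ok;
}
```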

For vectorization we might want to change the add(s) to int. Of course that
is not strictly valid, because signed overflow is undefined.

This kind of loop appears in a hot area of x264.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
