https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110445

            Bug ID: 110445
           Summary: [14 Regression] FAIL: gcc.dg/vect/slp-46.c with AVX2
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

With AVX2 we fail to SLP

double x[1024], y[1024];

void __attribute__((noipa)) foo()
{
  for (int i = 0; i < 512; ++i)
    {
      x[2*i] = y[i];
      x[2*i+1] = y[i];
    }
}

because we hit the following:

/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:
  ==> examining statement: _2 = y[i_12];
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21:
missed:   peeling for gaps insufficient for access
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:12:17:
missed:   not vectorized: relevant stmt not supported: _2 = y[i_12];
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:
  removing SLP instance operations starting from: x[_1] = _2;
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21:
missed:  unsupported SLP instances
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:
 re-trying with SLP disabled

The issue is that in the last vector iteration with VF=2 we are accessing
{ i, i+1, i+2, i+3 }, so even if we peel a single scalar iteration we may
still access one element too many.
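
The index arithmetic can be checked with a small model. This is only a
sketch and not GCC code: overread() and its parameters are hypothetical
names, and it assumes the vector loop runs floor((niters - peeled) / vf)
iterations, each loading `lanes' contiguous elements starting at y[i].

```c
/* Hypothetical model, not vectorizer code: number of elements the last
   vector iteration reads beyond y[niters - 1], the last element the
   scalar loop reads.  */
int overread(int niters, int vf, int lanes, int peeled)
{
  /* Vector iterations left once `peeled' scalar iterations are set aside.  */
  int vec_iters = (niters - peeled) / vf;
  if (vec_iters <= 0)
    return 0;

  /* The last vector iteration starts at last_i and loads
     y[last_i] .. y[last_i + lanes - 1].  */
  int last_i = (vec_iters - 1) * vf;
  int last_read = last_i + lanes - 1;

  int excess = last_read - (niters - 1);
  return excess > 0 ? excess : 0;
}
```

Under these assumptions, with a 4-lane load and VF=2, a trip count of 511
(chosen only to show the case) still over-reads by one element after
peeling one scalar iteration: overread(511, 2, 4, 1) is 1. Peeling two
iterations, or loading only two lanes, gives 0.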

The simplest solution would be to access only { i, i+1 }, which I think
we can already do.  The other solutions are to peel N scalar iterations
or, if the ISA supports it, to apply masking so that elements in the gap
are not accessed.
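
As a rough illustration of the first option, the loop below is a scalar
model of what the vectorized body could do. It is not what the vectorizer
emits: foo_half_load() and its parameters are hypothetical. Each "vector"
iteration reads only y[i] and y[i+1], duplicates them into four lanes and
stores those to x[2*i .. 2*i+3].

```c
#include <string.h>

/* Scalar model of a VF=2 body that loads only { i, i+1 } from y.
   Hypothetical helper, written to show the access pattern only.  */
void foo_half_load(double *x, const double *y, int n)
{
  int i = 0;

  /* "Vector" part: two scalar iterations per step.  */
  for (; i + 2 <= n; i += 2)
    {
      double lo = y[i];      /* the only elements of y read here ...  */
      double hi = y[i + 1];  /* ... are y[i] and y[i+1]  */
      double v[4] = { lo, lo, hi, hi };  /* duplicate each lane  */
      memcpy(&x[2 * i], v, sizeof v);    /* one 4-element store  */
    }

  /* Scalar epilogue for an odd remainder.  */
  for (; i < n; ++i)
    {
      x[2 * i] = y[i];
      x[2 * i + 1] = y[i];
    }
}
```

Because the load is only as wide as the two elements that are used, no
read goes past y[n-1] and no peeling for gaps is needed in this model.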
