https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110445
            Bug ID: 110445
           Summary: [14 Regression] FAIL: gcc.dg/vect/slp-46.c with AVX2
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

With AVX2 we fail to SLP

double x[1024], y[1024];

void __attribute__((noipa)) foo()
{
  for (int i = 0; i < 512; ++i)
    {
      x[2*i] = y[i];
      x[2*i+1] = y[i];
    }
}

because we hit the following:

/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:   ==> examining statement: _2 = y[i_12];
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: missed:   peeling for gaps insufficient for access
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:12:17: missed:   not vectorized: relevant stmt not supported: _2 = y[i_12];
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:   removing SLP instance operations starting from: x[_1] = _2;
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: missed:  unsupported SLP instances
/space/rguenther/src/gcc11queue/gcc/testsuite/gcc.dg/vect/slp-46.c:10:21: note:  re-trying with SLP disabled

The issue is that in the last vector iteration with VF=2 we are accessing
{ i, i+1, i+2, i+3 }; even if we peel a single scalar iteration we may
still access one element too many.

The simplest solution would be to access { i, i+1 } only, which I think we
can already do.  The other solution is to peel N scalar iterations, or to
apply masking so that the elements in the gap are not accessed, if the ISA
supports that.