https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116974

            Bug ID: 116974
           Summary: omp inscan reduction not supported with SLP
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Forcing SLP vectorization with --param vect-force-slp=1 shows that
vectorizable_scan_store does not yet support SLP.  This results in

FAIL: g++.dg/vect/simd-10.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-2.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-3.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-4.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-5.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-6.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-7.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-8.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2
FAIL: g++.dg/vect/simd-9.cc  -std=c++17  scan-tree-dump-times vect "vectorized
[1-3] loops" 2

Note that gcc.dg/vect/vect-simd-8.c, for example, also behaves differently:

/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:17:5:
optimized: loop vectorized using 16 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:15:11:
optimized: loop vectorized using 32 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:15:11:
optimized: loop vectorized using 16 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:30:5:
optimized: loop vectorized using 16 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:28:11:
optimized: loop vectorized using 32 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:28:11:
optimized: loop vectorized using 16 byte vectors

vs.

/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:15:11:
optimized: loop vectorized using 32 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:15:11:
optimized: loop vectorized using 16 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:28:11:
optimized: loop vectorized using 32 byte vectors
/home/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-simd-8.c:28:11:
optimized: loop vectorized using 16 byte vectors

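Note the lack of vectorization of the main loops in the second dump (the
entries for 17:5 and 30:5 are gone; we only vectorize the initialization
loops).  The testcase doesn't notice because it scans for
"vectorized \[1-3] loops", which still matches when fewer loops are vectorized.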

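For reference, the loops affected have roughly the following shape.  This is
a minimal sketch of an OpenMP inscan reduction, not the contents of
vect-simd-8.c or the simd-*.cc tests; the function names prefix_sum_inclusive
and prefix_sum_exclusive are made up for illustration.

```c
/* Inclusive scan: b[i] receives the sum of a[0..i].
   Hypothetical example, not taken from the GCC testsuite.  */
void
prefix_sum_inclusive (const int *a, int *b, int n)
{
  int r = 0;
#pragma omp simd reduction (inscan, +:r)
  for (int i = 0; i < n; i++)
    {
      r += a[i];                 /* input phase: update the reduction */
#pragma omp scan inclusive (r)
      b[i] = r;                  /* scan phase: store the running value */
    }
}

/* Exclusive scan: b[i] receives the sum of a[0..i-1].  */
void
prefix_sum_exclusive (const int *a, int *b, int n)
{
  int r = 0;
#pragma omp simd reduction (inscan, +:r)
  for (int i = 0; i < n; i++)
    {
      b[i] = r;                  /* scan phase: value before a[i] is added */
#pragma omp scan exclusive (r)
      r += a[i];                 /* input phase: update the reduction */
    }
}
```

Compiling something like this with -O2 -fopenmp-simd -fopt-info-vec, with and
without --param vect-force-slp=1, should allow the "loop vectorized" notes to
be compared; the exact output depends on the target and the GCC revision.
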