Hi,

PR112694 shows that we try to create sub-vectors of single-element
vectors because can_duplicate_and_interleave_p returns true.
The problem resurfaced in PR116611.

This patch makes can_duplicate_and_interleave_p return false
if count / nvectors == 0, i.e. when nvectors > count, and removes the
corresponding check in the riscv backend.
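
As a rough illustration of the new guard (not GCC code): for unsigned
values with nvectors >= 1, count / nvectors is zero exactly when
nvectors > count.  The helper names below are hypothetical, and the
sketch assumes only that nvectors starts at 1 and doubles per
iteration; it leaves out every mode and permute check of the real
function.

```cpp
#include <cassert>

// Hypothetical helper, not part of GCC: how many elements would be
// fused into each sub-vector for a given split (integer division,
// NVECTORS must be at least 1).
static unsigned elements_per_vector (unsigned count, unsigned nvectors)
{
  return count / nvectors;
}

// Hypothetical helper that models only the new guard.  NVECTORS starts
// at 1 and doubles each iteration; the function returns the first
// value for which "nvectors > count" holds.  Intended for small COUNT
// values only (no overflow handling).
static unsigned first_rejected_nvectors (unsigned count)
{
  unsigned nvectors = 1;
  while (!(nvectors > count))
    nvectors *= 2;
  return nvectors;
}

// Small self-check: at the rejected split there is nothing left to fuse.
static void check_examples ()
{
  assert (first_rejected_nvectors (1) == 2);
  assert (elements_per_vector (1, 2) == 0);
  assert (first_rejected_nvectors (4) == 8);
  assert (elements_per_vector (4, 4) == 1);
}
```

In this model a single-element group (count == 1) is rejected as soon
as nvectors reaches 2, which corresponds to the sub-vector-of-a-single-
element case described above.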

This partially gets rid of the FAIL in slp-19a.c.  When built with the
cost model we at least no longer use LOAD_LANES.  Without the cost
model, as in the test suite, we choose a different path and still end
up with LOAD_LANES.

Bootstrapped and regtested on x86 and power10, regtested on
rv64gcv_zvfh_zvbb.  Still waiting for the aarch64 results.

Regards
 Robin

gcc/ChangeLog:

        PR target/112694
        PR target/116611

        * config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
        return.
        * tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
        false when we cannot create sub-elements.
---
 gcc/config/riscv/riscv-v.cc | 9 ---------
 gcc/tree-vect-slp.cc        | 4 ++++
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9b6c3a21e2d..5c5ed63d22e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3709,15 +3709,6 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
      mask to do the iteration loop control. Just disable it directly.  */
   if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
     return false;
-  /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
-     may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
-     Ideally, middle-end loop vectorizer should be able to disable it
-     itself, We can remove the codes here when middle-end code is able
-     to disable VLA SLP vectorization for poly size (1, 1) VF.  */
-  if (!BYTES_PER_RISCV_VECTOR.is_constant ()
-      && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
-                  poly_int64 (16, 16)))
-    return false;
 
   struct expand_vec_perm_d d;
 
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3d2973698e2..17b59870c69 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -434,6 +434,10 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
   unsigned int nvectors = 1;
   for (;;)
     {
+      /* We need to be able to fuse COUNT / NVECTORS elements together,
+        so no point in continuing if there are none.  */
+      if (nvectors > count)
+       return false;
       scalar_int_mode int_mode;
       poly_int64 elt_bits = elt_bytes * BITS_PER_UNIT;
       if (int_mode_for_size (elt_bits, 1).exists (&int_mode))
-- 
2.46.0
