Hi,

In PR117682 we build an interleaved pattern

  { 1, 201, 209, 25, 161, 105, 113, 185, 65, 9,
    17, 89, 225, 169, 177, 249, 129, 73, 81, 153,
    33, 233, 241, 57, 193, 137, 145, 217, 97, 41,
    49, 121 };

with a negative step, expecting wraparound semantics due to -fwrapv.

For building interleaved patterns we have an optimization that
builds each sequence with twice the EEW, e.g. (viewed with the
original EEW)
  {1, 209, ...}        = { 1, 0, 209, 0, ... }
and, shifted into the high half,
  {201, 25, ...} << 8  = { 0, 201, 0, 25, ... },
and IORs those.
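
To make the trick concrete, here is a rough host-side sketch (plain C,
little-endian byte order assumed; arrays and names are illustrative
only, not the actual GCC implementation):

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    /* First and second sequence of the pattern above.  */
    uint8_t even[4] = { 1, 209, 161, 113 };
    uint8_t odd[4] = { 201, 25, 105, 185 };
    uint16_t wide[4];
    uint8_t bytes[8];

    for (int i = 0; i < 4; i++)
      /* First sequence in the low byte, second sequence shifted into
         the high byte, combined with an IOR.  */
      wide[i] = (uint16_t) even[i] | (uint16_t) (odd[i] << 8);

    /* Reinterpreting the EEW = 16 elements as EEW = 8 yields the
       interleaved sequence 1 201 209 25 161 105 113 185.  */
    memcpy (bytes, wide, sizeof (wide));
    for (int i = 0; i < 8; i++)
      printf ("%d ", bytes[i]);
    printf ("\n");
    return 0;
  }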

The optimization only works if, in each doubled element, the bits
belonging to the other sequence are zero.  When the sequence
overflows, e.g. with a negative step, we cannot guarantee this.
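
For the pattern above the first sequence, taken as signed char, has
base 1 and step -48 (inferred from the constants, so treat the exact
numbers as an assumption).  Built with EEW = 16 the sign bits spill
into the high bytes, as this little sketch shows:

  #include <stdint.h>
  #include <stdio.h>

  int
  main (void)
  {
    int16_t base = 1, step = -48;
    /* With 8-bit wraparound the sequence would be 1, 209, 161, ...
       but the 16-bit elements are 0x0001, 0xffd1, 0xffa1, ...:
       the high bytes are 0xff rather than 0, so the lanes meant for
       the second sequence are already non-zero before the IOR.  */
    for (int i = 0; i < 4; i++)
      {
        uint16_t wide = (uint16_t) (base + i * step);
        printf ("elt %d: 0x%04x  low %3u  high 0x%02x\n", i,
                (unsigned) wide, (unsigned) (wide & 0xff),
                (unsigned) (wide >> 8));
      }
    return 0;
  }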

This patch makes us fall back to the generic merge handling for negative
steps.

I'm not 100% certain we're good even for positive steps.  If the
step or the vector length is large enough we'd still overflow and
end up with non-zero bits in the other sequence's lanes.  I haven't
seen this happen during my testing, though, and the patch doesn't
make things worse, so...
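
As an illustration (the numbers are made up, not taken from the PR):
with EEW = 8, base 1 and a positive step of 80, the 16-bit build
already spills into the high byte at element 4 because
1 + 4 * 80 = 321 = 0x141:

  #include <stdint.h>
  #include <stdio.h>

  int
  main (void)
  {
    uint16_t base = 1, step = 80;
    for (int i = 0; i < 8; i++)
      {
        uint16_t wide = base + i * step;
        if (wide >> 8)
          printf ("elt %d: 0x%04x spills into the high byte\n", i,
                  (unsigned) wide);
      }
    return 0;
  }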

Regtested on rv64gcv_zvl512b.  Let's see what the CI says.

Regards
 Robin

        PR target/117682

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (expand_const_vector): Fall back to
        merging if either step is negative.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/pr117682.c: New test.
---
 gcc/config/riscv/riscv-v.cc                       | 11 +++++++++--
 .../gcc.target/riscv/rvv/autovec/pr117682.c       | 15 +++++++++++++++
 2 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b0de4c52b83..93888c4fac0 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1478,13 +1478,20 @@ expand_const_vector (rtx target, rtx src)
 
             can be interpreted into:
 
-                 EEW = 32, { 2, 4, ... }  */
+                 EEW = 32, { 2, 4, ... }.
+
+            This only works as long as the larger type does not overflow,
+            as we can't guarantee a zero value for every second element
+            of the sequence with the smaller EEW.
+            ??? For now we assume that no overflow happens with positive
+            steps and forbid negative steps altogether.  */
          unsigned int new_smode_bitsize = builder.inner_bits_size () * 2;
          scalar_int_mode new_smode;
          machine_mode new_mode;
          poly_uint64 new_nunits
            = exact_div (GET_MODE_NUNITS (builder.mode ()), 2);
-         if (int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
+         if (known_ge (step1, 0) && known_ge (step2, 0)
+             && int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
              && get_vector_mode (new_smode, new_nunits).exists (&new_mode))
            {
              rtx tmp1 = gen_reg_rtx (new_mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c
new file mode 100644
index 00000000000..bbbcfcce626
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v } */
+/* { dg-require-effective-target rvv_zvl256b_ok } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -mrvv-vector-bits=zvl -fwrapv" } */
+
+signed char a = 9;
+int main() {
+  for (char e = 0; e < 20; e++)
+    for (char f = 0; f < 7; f++)
+      a *= 57;
+
+  if (a != 41)
+    __builtin_abort ();
+}
+
-- 
2.47.1
