Hi,

in PR117682 we build an interleaving pattern

  { 1, 201, 209, 25, 161, 105, 113, 185, 65, 9, 17, 89, 225, 169, 177, 249,
    129, 73, 81, 153, 33, 233, 241, 57, 193, 137, 145, 217, 97, 41, 49, 121 }

with a negative step, expecting wraparound semantics due to -fwrapv.

For building interleaved patterns we have an optimization that does e.g.

  { 1, 209, ... }       = { 1, 0, 209, 0, ... }
  { 201, 25, ... } << 8 = { 0, 201, 0, 25, ... }

and IORs those.  The optimization only works if the low-part bits are
zero.  When overflowing, e.g. with a negative step, we cannot guarantee
this.

This patch makes us fall back to the generic merge handling for negative
steps.

I'm not 100% certain we're good even for positive steps: if the step or
the vector length is large enough we'd still overflow and have nonzero
lower bits.  I haven't seen this happen during my testing, though, and
the patch doesn't make things worse, so...

Regtested on rv64gcv_zvl512b.  Let's see what the CI says.

Regards
 Robin

	PR target/117682

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Fall back to
	merging if either step is negative.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr117682.c: New test.
---
 gcc/config/riscv/riscv-v.cc                        | 11 +++++++++--
 .../gcc.target/riscv/rvv/autovec/pr117682.c        | 15 +++++++++++++++
 2 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b0de4c52b83..93888c4fac0 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1478,13 +1478,20 @@ expand_const_vector (rtx target, rtx src)
 	 can be interpreted into:
 
-	   EEW = 32, { 2, 4, ... }  */
+	   EEW = 32, { 2, 4, ... }.
+
+	 This only works as long as the larger type does not overflow
+	 as we can't guarantee a zero value for each second element
+	 of the sequence with smaller EEW.
+	 ??? For now we assume that no overflow happens with positive
+	 steps and forbid negative steps altogether.  */
       unsigned int new_smode_bitsize = builder.inner_bits_size () * 2;
       scalar_int_mode new_smode;
       machine_mode new_mode;
       poly_uint64 new_nunits = exact_div (GET_MODE_NUNITS (builder.mode ()), 2);
-      if (int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
+      if (known_ge (step1, 0) && known_ge (step2, 0)
+	  && int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
 	  && get_vector_mode (new_smode, new_nunits).exists (&new_mode))
 	{
 	  rtx tmp1 = gen_reg_rtx (new_mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c
new file mode 100644
index 00000000000..bbbcfcce626
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117682.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v } */
+/* { dg-require-effective-target rvv_zvl256b_ok } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -mrvv-vector-bits=zvl -fwrapv" } */
+
+signed char a = 9;
+int main() {
+  for (char e = 0; e < 20; e++)
+    for (char f = 0; f < 7; f++)
+      a *= 57;
+
+  if (a != 41)
+    __builtin_abort ();
+}
+
-- 
2.47.1