On Thu, 27 Feb 2025 16:00:08 +0100, "Robin Dapp" wrote: > Hi, > > when merging two vsetvls that both only demand "SEW >= ..." we > use their maximum SEW and keep the LMUL. That may lead to invalid > vector configurations like > e64, mf4. > As we make sure that the SEW requirements overlap we can use the SEW > and LMUL of the configuration with the larger SEW. > > Ma Jin already touched this merge rule some weeks ago and fixed the > ratio calculation (r15-6873). Calculating the ratio from an invalid > SEW/LMUL combination lead to an overflow in the ratio variable, though. > I'd argue the proper fix is to update SEW and LMUL, keeping the ratio > as before. This breaks bug-10.c, though, and I'm not sure what it > really tests. SEW/LMUL actually doesn't change, we just emit a slightly > different vsetvl. Maybe it was reduced too far? Jin, any insight > there? I changed it into a run test for now. > > Regtested on rv64gcv_zvl512b. > > Regards > Robin > > PR target/117955 > > gcc/ChangeLog: > > * config/riscv/riscv-v.cc (calculate_ratio): Use LMUL of vsetvl > with larger SEW. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/base/bug-10.c: Convert to run test. > * gcc.target/riscv/rvv/base/pr117955.c: New test. > --- > gcc/config/riscv/riscv-vsetvl.cc | 8 +- > .../gcc.target/riscv/rvv/base/bug-10.c | 32 +- > .../gcc.target/riscv/rvv/base/pr117955.c | 827 ++++++++++++++++++ > 3 files changed, 861 insertions(+), 6 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr117955.c > > diff --git a/gcc/config/riscv/riscv-vsetvl.cc > b/gcc/config/riscv/riscv-vsetvl.cc > index 82284624a24..f0165f7b8c8 100644 > --- a/gcc/config/riscv/riscv-vsetvl.cc > +++ b/gcc/config/riscv/riscv-vsetvl.cc > @@ -1729,9 +1729,11 @@ private: > } > inline void use_max_sew (vsetvl_info &prev, const vsetvl_info &next) > { > - int max_sew = MAX (prev.get_sew (), next.get_sew ()); > - prev.set_sew (max_sew); > - prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ())); > + bool prev_sew_larger = prev.get_sew () >= next.get_sew (); > + const vsetvl_info from = prev_sew_larger ? prev : next; > + prev.set_sew (from.get_sew ()); > + prev.set_vlmul (from.get_vlmul ()); > + prev.set_ratio (from.get_ratio ()); > use_min_of_max_sew (prev, next); > } > inline void use_next_sew_lmul (vsetvl_info &prev, const vsetvl_info &next) > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c > b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c > index af3a8610d63..5f7490e8a3b 100644 > --- a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c > +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c > @@ -1,14 +1,40 @@ > -/* { dg-do compile { target { rv64 } } } */ > +/* { dg-do run { target { rv64 } } } */ > +/* { dg-require-effective-target rv64 } */ > +/* { dg-require-effective-target riscv_v } */ > /* { dg-options " -march=rv64gcv_zvfh -mabi=lp64d -O2 > --param=vsetvl-strategy=optim -fno-schedule-insns -fno-schedule-insns2 > -fno-schedule-fusion " } */ > > #include <riscv_vector.h> > > void > -foo (uint8_t *ptr, vfloat16m4_t *v1, vuint32m8_t *v2, vuint8m2_t *v3, size_t > vl) > +__attribute__ ((noipa)) > +foo (vfloat16m4_t *v1, vuint32m8_t *v2, vuint8m2_t *v3, size_t vl) > { > *v1 = __riscv_vfmv_s_f_f16m4 (1, vl); > *v2 = __riscv_vmv_s_x_u32m8 (2963090659u, vl); > *v3 = __riscv_vsll_vx_u8m2 (__riscv_vid_v_u8m2 (vl), 2, vl); > }
This patch modifies the sequence: vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,a4,e8,m2,ta,ma to: vsetvli zero,a4,e32,m8,ta,ma + vsetvli zero,zero,e8,m2,ta,ma Functionally, there is no difference. However, this change resolves the issue with "e64,mf4", and allows the second vsetvli to omit a4, which is beneficial. > -/* { dg-final { scan-assembler-not {vsetvli.*zero,zero} } }*/ > +int > +main () > +{ > + vfloat16m4_t v1; > + vuint32m8_t v2; > + vuint8m2_t v3; > + int vl = 4; > + foo (&v1, &v2, &v3, vl); > + > + _Float16 val1 = ((_Float16 *)&v1)[0]; > + if (val1 - 1.0000f > 0.00001f) > + __builtin_abort (); > + > + uint32_t val2 = ((uint32_t *)&v2)[0]; > + if (val2 != 2963090659u) > + __builtin_abort (); > + > + for (int i = 0; i < vl; i++) > + { > + uint8_t val = ((uint8_t *)&v3)[i]; > + if (val != i << 2) > + __builtin_abort (); > + } > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr117955.c > b/gcc/testsuite/gcc.target/riscv/rvv/base/pr117955.c > new file mode 100644 > index 00000000000..49ccb6097d0 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr117955.c > @@ -0,0 +1,827 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zvfh -O3" } */ Here are three issues with this test case: 1. The test case does not seem to take effect, as it appears to pass both before and after applying the patch for RV64. 2. Since no mabi is specified, it consistently fails for RV32 with the error: "Excess errors: cc1: error: ABI requires '-march=rv32'." 3. The test case seems to contain a lot of unnecessary code; perhaps we can streamline it. Best regards, Jin Ma > +#include <riscv_vector.h> > + > +#define dataLen 100 > +#define isNaNF16UI( a ) (((~(a) & 0x7C00) == 0) && ((a) & 0x03FF)) > +#define isNaNF32UI( a ) (((~(a) & 0x7F800000) == 0) && ((a) & 0x007FFFFF)) > +#define isNaNF64UI( a ) (((~(a) & UINT64_C( 0x7FF0000000000000 )) == 0) && > ((a) & UINT64_C( 0x000FFFFFFFFFFFFF ))) > +typedef _Float16 float16_t; > +typedef float float32_t; > +typedef double float64_t;