"Li, Pan2" <pan2...@intel.com> writes:
> Hi Richard Sandiford,
>
> Just tried the overloaded constant divisors with below print div, it works as
> you mentioned!
>
>   printf (" can_div_away_from_zero_p (mode_precision[E_%smode], "
>           "BITS_PER_UNIT, &mode_size[E_%smode]);\n", m->name, m->name);
>
> template<unsigned int N, typename Ca, typename Cb, typename Cq>
> inline typename if_nonpoly<Cb, bool>::type
> can_div_away_from_zero_p (const poly_int_pod<N, Ca> &a,
>                           Cb b,
>                           poly_int_pod<N, Cq> *quotient)
> {
>   if (!can_div_trunc_p (a, b, quotient))
>     return false;
>   if (maybe_ne (*quotient * b, a))
>     for (unsigned int i = 0; i < N; ++i)
>       quotient->coeffs[i] += (quotient->coeffs[i] < 0 ? -1 : 1);
>   return true;
> }
>
> But I may have a question about the one case as below.
>
> Assume:
> a = [4, 4], b = 8.
>
> When it reaches can_div_trunc_p, it will check if the remainder is constant
> or not, aka a.coeffs[i] % 8 == 0 (i >= 1). If the remainder is not constant,
> can_div_trunc_p will do nothing to quotient and return false.
>
> Thus, when a = [4, 4] for can_div_away_from_zero_p, the output *quotient will
> be unchanged, aka the mode_size[E_%smode] will be unchanged for this case.
> However, the underlying mode_size will adjust it to the real byte size, and I
> am not sure if it is by design or requires additional handling.
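To make the a = [4, 4], b = 8 case concrete, here is a minimal sketch of the behaviour described in the quoted message. It is a simplified model, not GCC's poly_int code: `Poly2`, `div_trunc` and `div_away_from_zero` are hypothetical names, the polynomial is fixed at two coefficients, and the truncating step follows the description in the message (non-constant coefficients must divide evenly, otherwise the quotient is left alone).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified stand-in for a two-coefficient poly_int: value = c[0] + c[1] * X.
// Poly2, div_trunc and div_away_from_zero are hypothetical illustration names,
// not part of GCC.
using Poly2 = std::array<long, 2>;

// Truncating division by a constant.  Succeeds only when every non-constant
// coefficient is a multiple of b, i.e. when the remainder is constant.
// On failure the quotient is left untouched.
inline bool div_trunc(const Poly2 &a, long b, Poly2 *quotient)
{
  for (std::size_t i = 1; i < a.size(); ++i)
    if (a[i] % b != 0)
      return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    (*quotient)[i] = a[i] / b;
  return true;
}

// Models the quoted overload: if the truncated quotient is inexact,
// move every coefficient one step away from zero.
inline bool div_away_from_zero(const Poly2 &a, long b, Poly2 *quotient)
{
  if (!div_trunc(a, b, quotient))
    return false;
  bool inexact = false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if ((*quotient)[i] * b != a[i])
      inexact = true;
  if (inexact)
    for (std::size_t i = 0; i < a.size(); ++i)
      (*quotient)[i] += ((*quotient)[i] < 0 ? -1 : 1);
  return true;
}
```

In this model, calling `div_away_from_zero` with a = {4, 4} and b = 8 returns false and leaves the quotient as it was, which is the situation the question is about; a = {16, 8} divides exactly to {2, 1}.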
Is it right that, for RVV, a load or store of [4,4] will access [8,8] bits,
even when that means accessing fully-unused bytes?  E.g. 4+4X when X=3
would be 16 bits/2 bytes of useful data, but a bitsize of 8+8X would be
32 bits/4 bytes.  So a store of [8,8] for a precision of [4,4] would store
2 bytes beyond the end of the useful data when X==3?

Richard

> Pan
>
> From: <incarnation.p....@outlook.com>
> Sent: Tuesday, February 28, 2023 5:59 PM
> To: Richard Sandiford <richard.sandif...@arm.com>; Li, Pan2 <pan2...@intel.com>
> Cc: incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de
> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>
> Understood, thanks for the explanations and suggestions. Let me have a try
> and keep you posted.
>
> Pan
> ________________________________
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: Tuesday, February 28, 2023 17:50
> To: Li, Pan2 <pan2...@intel.com>
> Cc: <incarnation.p....@outlook.com>; incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de
> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>
> "Li, Pan2" <pan2...@intel.com> writes:
>> Hi Richard Sandiford,
>>
>> After some investigation, I am not sure if it is possible to make it general
>> without any changes to exact_div. We can add one method like below to get
>> the unit poly for all possible N.
>>
>> template<unsigned int N, typename Ca>
>> inline POLY_CONST_RESULT (N, Ca, Ca)
>> normalize_to_unit (const poly_int_pod<N, Ca> &a)
>> {
>>   typedef POLY_CONST_COEFF (Ca, Ca) C;
>>
>>   poly_int<N, C> normalized = a;
>>
>>   if (normalized.is_constant())
>>     normalized.coeffs[0] = 1;
>>   else
>>     for (unsigned int i = 0; i < N; i++)
>>       POLY_SET_COEFF (C, normalized, i, 1);
>>
>>   return normalized;
>> }
>>
>> And then adjust the genmodes like below to consume the unit poly.
>>
>>   printf (" poly_uint16 unit_poly = "
>>           "normalize_to_unit (mode_precision[E_%smode]);\n", m->name);
>>   printf (" if (known_lt (mode_precision[E_%smode], "
>>           "unit_poly * BITS_PER_UNIT))\n", m->name);
>>   printf (" mode_size[E_%smode] = unit_poly;\n", m->name);
>>
>> I am not sure if it is a good idea to introduce above normalize code into
>> exact_div. Given the comment of the exact_div indicates that "/* Return A /
>> B, given that A is known to be a multiple of B. */".
>
> My point was that we have multiple ways of dividing poly_ints:
>
> - exact_div, for when the caller knows that the result is always exact
> - can_div_trunc_p, for truncating division (round towards 0)
> - can_div_away_from_zero_p, for rounding away from 0
> - ...
>
> This is like how we have multiple division *_EXPRs on trees.
>
> Until now, exact_div was the correct choice for modes because vector
> modes didn't have padding. We're now changing that, so my suggestion
> in the review was to change the division operation that we use.
> Rather than use exact_div, we should now use can_div_away_from_zero_p,
> which would have the effect of rounding the quotient up.
>
> Something like:
>
>   if (!can_div_away_from_zero_p (mode_precision[E_%smode], BITS_PER_UNIT,
>                                  &mode_size[E_%smode]))
>     gcc_unreachable ();
>
> But this will require a new overload of can_div_away_from_zero_p, since
> the existing one is for constant quotients rather than constant divisors.
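The difference between the division flavours listed above is easiest to see on plain integers. The following is a minimal scalar sketch only; the helper names are hypothetical and deliberately prefixed with `scalar_` so they are not mistaken for the poly_int functions of similar names.

```cpp
#include <cassert>

// Hypothetical scalar helpers for illustration; these are not GCC functions.

// Exact division: the caller promises that a is a multiple of b.
inline long scalar_exact_div(long a, long b)
{
  assert(a % b == 0);
  return a / b;
}

// Truncating division: C++ integer division already rounds towards zero.
inline long scalar_div_trunc(long a, long b)
{
  return a / b;
}

// Division rounding away from zero: bump an inexact quotient by one in the
// direction of the true result's sign.
inline long scalar_div_away_from_zero(long a, long b)
{
  long q = a / b;
  if (q * b != a)
    q += ((a < 0) != (b < 0)) ? -1 : 1;
  return q;
}
```

With a precision of 4 bits and 8 bits per unit, the truncating form gives a size of 0 while the away-from-zero form gives 1, which is why the choice of division matters once a mode can be narrower than a byte.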
>
> Thanks,
> Richard
>
>>
>> Could you please help to share your opinion about this from the expert's
>> perspective? Thank you!
>>
>> Pan
>>
>> From: <incarnation.p....@outlook.com>
>> Sent: Monday, February 27, 2023 11:13 PM
>> To: Richard Sandiford <richard.sandif...@arm.com>; incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>
>> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de; Li, Pan2 <pan2...@intel.com>
>> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>>
>> Never mind, wish you have a good holiday.
>>
>> Thanks for pointing this out, the if part cannot take care of poly_int with
>> N > 2. As I understand, we need to make it general for all the N of poly_int.
>>
>> Thus I would like to double confirm with you about how to make it general. I
>> suppose there will be a new function can_div_away_from_zero_p to replace the
>> if (known_lt(,)) part in genmodes.cc, and leave exact_div unchanged (consider
>> the word exact, I suppose we should not touch here), right? Then we still
>> need one poly_int with all 1 for N as the return if can_div_away_from_zero_p
>> is true.
>>
>> Thanks again for your professional suggestion, have a nice day!
>>
>> Pan
>> ________________________________
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: Monday, February 27, 2023 22:24
>> To: incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>
>> Cc: incarnation.p....@outlook.com; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de; pan2...@intel.com
>> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>>
>> Sorry for the slow reply, been away for a couple of weeks.
>>
>> "incarnation.p.lee--- via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>>> From: Pan Li <pan2...@intel.com>
>>>
>>> Fix the bug of the rvv bool mode precision with the adjustment.
>>> The bit size of vbool*_t will be adjusted to
>>> [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The
>>> adjusted mode precision of vbool*_t will help underlying pass to
>>> make the right decision for both the correctness and optimization.
>>>
>>> Given below sample code:
>>> void test_1(int8_t * restrict in, int8_t * restrict out)
>>> {
>>>   vbool8_t v2 = *(vbool8_t*)in;
>>>   vbool16_t v5 = *(vbool16_t*)in;
>>>   *(vbool16_t*)(out + 200) = v5;
>>>   *(vbool8_t*)(out + 100) = v2;
>>> }
>>>
>>> Before the precision adjustment:
>>>   addi a4,a1,100
>>>   vsetvli a5,zero,e8,m1,ta,ma
>>>   addi a1,a1,200
>>>   vlm.v v24,0(a0)
>>>   vsm.v v24,0(a4)
>>>   // Need one vsetvli and vlm.v for correctness here.
>>>   vsm.v v24,0(a1)
>>>
>>> After the precision adjustment:
>>>   csrr t0,vlenb
>>>   slli t1,t0,1
>>>   csrr a3,vlenb
>>>   sub sp,sp,t1
>>>   slli a4,a3,1
>>>   add a4,a4,sp
>>>   sub a3,a4,a3
>>>   vsetvli a5,zero,e8,m1,ta,ma
>>>   addi a2,a1,200
>>>   vlm.v v24,0(a0)
>>>   vsm.v v24,0(a3)
>>>   addi a1,a1,100
>>>   vsetvli a4,zero,e8,mf2,ta,ma
>>>   csrr t0,vlenb
>>>   vlm.v v25,0(a3)
>>>   vsm.v v25,0(a2)
>>>   slli t1,t0,1
>>>   vsetvli a5,zero,e8,m1,ta,ma
>>>   vsm.v v24,0(a1)
>>>   add sp,sp,t1
>>>   jr ra
>>>
>>> However, there may be some optimization opportunities after
>>> the mode precision adjustment. It can be taken care of in
>>> the RISC-V backend in the underlying separate PR(s).
>>> >>> PR 108185 >>> PR 108654 >>> >>> gcc/ChangeLog: >>> >>> * config/riscv/riscv-modes.def (ADJUST_PRECISION): >>> * config/riscv/riscv.cc (riscv_v_adjust_precision): >>> * config/riscv/riscv.h (riscv_v_adjust_precision): >>> * genmodes.cc (ADJUST_PRECISION): >>> (emit_mode_adjustments): >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/riscv/pr108185-1.c: New test. >>> * gcc.target/riscv/pr108185-2.c: New test. >>> * gcc.target/riscv/pr108185-3.c: New test. >>> * gcc.target/riscv/pr108185-4.c: New test. >>> * gcc.target/riscv/pr108185-5.c: New test. >>> * gcc.target/riscv/pr108185-6.c: New test. >>> * gcc.target/riscv/pr108185-7.c: New test. >>> * gcc.target/riscv/pr108185-8.c: New test. >>> >>> Signed-off-by: Pan Li >>> <pan2...@intel.com<mailto:pan2...@intel.com<mailto:pan2...@intel.com%3cmailto:pan2...@intel.com>>> >>> --- >>> gcc/config/riscv/riscv-modes.def | 8 +++ >>> gcc/config/riscv/riscv.cc | 12 ++++ >>> gcc/config/riscv/riscv.h | 1 + >>> gcc/genmodes.cc | 25 ++++++- >>> gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++++++++++++++++++ >>> gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +++++++++++++++++++++ >>> 12 files changed, 598 insertions(+), 1 deletion(-) >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-3.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-4.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-5.c >>> create mode 100644 
gcc/testsuite/gcc.target/riscv/pr108185-6.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-7.c >>> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-8.c >>> >>> diff --git a/gcc/config/riscv/riscv-modes.def >>> b/gcc/config/riscv/riscv-modes.def >>> index d5305efa8a6..110bddce851 100644 >>> --- a/gcc/config/riscv/riscv-modes.def >>> +++ b/gcc/config/riscv/riscv-modes.def >>> @@ -72,6 +72,14 @@ ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * >>> riscv_bytes_per_vector_chunk); >>> ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * >>> riscv_bytes_per_vector_chunk); >>> ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8)); >>> >>> +ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1)); >>> +ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2)); >>> +ADJUST_PRECISION (VNx4BI, riscv_v_adjust_precision (VNx4BImode, 4)); >>> +ADJUST_PRECISION (VNx8BI, riscv_v_adjust_precision (VNx8BImode, 8)); >>> +ADJUST_PRECISION (VNx16BI, riscv_v_adjust_precision (VNx16BImode, 16)); >>> +ADJUST_PRECISION (VNx32BI, riscv_v_adjust_precision (VNx32BImode, 32)); >>> +ADJUST_PRECISION (VNx64BI, riscv_v_adjust_precision (VNx64BImode, 64)); >>> + >>> /* >>> | Mode | MIN_VLEN=32 | MIN_VLEN=32 | MIN_VLEN=64 | MIN_VLEN=64 | >>> | | LMUL | SEW/LMUL | LMUL | SEW/LMUL | >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >>> index de3e1f903c7..cbe66c0e35b 100644 >>> --- a/gcc/config/riscv/riscv.cc >>> +++ b/gcc/config/riscv/riscv.cc >>> @@ -1003,6 +1003,18 @@ riscv_v_adjust_nunits (machine_mode mode, int scale) >>> return scale; >>> } >>> >>> +/* Call from ADJUST_PRECISION in riscv-modes.def. Return the correct >>> + PRECISION size for corresponding machine_mode. 
*/ >>> + >>> +poly_int64 >>> +riscv_v_adjust_precision (machine_mode mode, int scale) >>> +{ >>> + if (riscv_v_ext_vector_mode_p (mode)) >>> + return riscv_vector_chunks * scale; >>> + >>> + return scale; >>> +} >>> + >>> /* Return true if X is a valid address for machine mode MODE. If it is, >>> fill in INFO appropriately. STRICT_P is true if REG_OK_STRICT is in >>> effect. */ >>> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h >>> index 5bc7f2f467d..15b9317a8ce 100644 >>> --- a/gcc/config/riscv/riscv.h >>> +++ b/gcc/config/riscv/riscv.h >>> @@ -1025,6 +1025,7 @@ extern unsigned riscv_stack_boundary; >>> extern unsigned riscv_bytes_per_vector_chunk; >>> extern poly_uint16 riscv_vector_chunks; >>> extern poly_int64 riscv_v_adjust_nunits (enum machine_mode, int); >>> +extern poly_int64 riscv_v_adjust_precision (enum machine_mode, int); >>> /* The number of bits and bytes in a RVV vector. */ >>> #define BITS_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * >>> riscv_bytes_per_vector_chunk * 8)) >>> #define BYTES_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * >>> riscv_bytes_per_vector_chunk)) >>> diff --git a/gcc/genmodes.cc b/gcc/genmodes.cc >>> index 2d418f09aab..12f4e6335e6 100644 >>> --- a/gcc/genmodes.cc >>> +++ b/gcc/genmodes.cc >>> @@ -114,6 +114,7 @@ static struct mode_adjust *adj_alignment; >>> static struct mode_adjust *adj_format; >>> static struct mode_adjust *adj_ibit; >>> static struct mode_adjust *adj_fbit; >>> +static struct mode_adjust *adj_precision; >>> >>> /* Mode class operations. 
*/ >>> static enum mode_class >>> @@ -819,6 +820,7 @@ make_vector_mode (enum mode_class bclass, >>> #define ADJUST_NUNITS(M, X) _ADD_ADJUST (nunits, M, X, RANDOM, RANDOM) >>> #define ADJUST_BYTESIZE(M, X) _ADD_ADJUST (bytesize, M, X, RANDOM, RANDOM) >>> #define ADJUST_ALIGNMENT(M, X) _ADD_ADJUST (alignment, M, X, RANDOM, >>> RANDOM) >>> +#define ADJUST_PRECISION(M, X) _ADD_ADJUST (precision, M, X, RANDOM, >>> RANDOM) >>> #define ADJUST_FLOAT_FORMAT(M, X) _ADD_ADJUST (format, M, X, FLOAT, >>> FLOAT) >>> #define ADJUST_IBIT(M, X) _ADD_ADJUST (ibit, M, X, ACCUM, UACCUM) >>> #define ADJUST_FBIT(M, X) _ADD_ADJUST (fbit, M, X, FRACT, UACCUM) >>> @@ -1829,7 +1831,15 @@ emit_mode_adjustments (void) >>> " (mode_precision[E_%smode], mode_nunits[E_%smode]);\n", >>> m->name, m->name); >>> printf (" mode_precision[E_%smode] = ps * old_factor;\n", >>> m->name); >>> - printf (" mode_size[E_%smode] = exact_div >>> (mode_precision[E_%smode]," >>> + /* Normalize the size to 1 if precison is less than BITS_PER_UNIT. >>> */ >>> + printf (" poly_uint16 size_one = " >>> + "mode_precision[E_%smode].is_constant ()\n", m->name); >>> + printf (" ? poly_uint16 (1, 0) : poly_uint16 (1, 1);\n"); >> >> Have you tried this on an x86_64 system? I wouldn't expect it to work >> because of the: >> >> STATIC_ASSERT (N >= 2); >> >> in the poly_uint16 constructor. >> >>> + printf (" if (known_lt (mode_precision[E_%smode], " >>> + "size_one * BITS_PER_UNIT))\n", m->name); >>> + printf (" mode_size[E_%smode] = size_one;\n", m->name); >>> + printf (" else\n"); >>> + printf (" mode_size[E_%smode] = exact_div >>> (mode_precision[E_%smode]," >> >> Now that the assert implicit in the original exact_div no longer holds, >> I think we should instead generalise it to can_div_away_from_zero_p >> (which will involve defining a new overload of can_div_away_from_zero_p). >> I think that will give the same result as the code above for the cases >> that the code above handles. But it should be more general too. 
>> >> TBH, I'm still sceptical that this is all that is needed. It seems >> unlikely that we've been so good at writing vector support code that >> we've made it work for precision < bitsize, despite that being an >> unsupported combination until now. But I guess we can fix problems >> on a case-by-case basis. >> >> Thanks, >> Richard >> >>> " BITS_PER_UNIT);\n", m->name, m->name); >>> printf (" mode_nunits[E_%smode] = ps;\n", m->name); >>> printf (" adjust_mode_mask (E_%smode);\n", m->name); >>> @@ -1963,6 +1973,19 @@ emit_mode_adjustments (void) >>> printf ("\n /* %s:%d */\n REAL_MODE_FORMAT (E_%smode) = %s;\n", >>> a->file, a->line, a->mode->name, a->adjustment); >>> >>> + /* Adjust precision to the actual bits size. */ >>> + for (a = adj_precision; a; a = a->next) >>> + switch (a->mode->cl) >>> + { >>> + case MODE_VECTOR_BOOL: >>> + printf ("\n /* %s:%d. */\n ps = %s;\n", a->file, a->line, >>> + a->adjustment); >>> + printf (" mode_precision[E_%smode] = ps;\n", a->mode->name); >>> + break; >>> + default: >>> + break; >>> + } >>> + >>> puts ("}"); >>> } >>> >>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-1.c >>> b/gcc/testsuite/gcc.target/riscv/pr108185-1.c >>> new file mode 100644 >>> index 00000000000..e70960c5b6d >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-1.c >>> @@ -0,0 +1,68 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */ >>> + >>> +#include "riscv_vector.h" >>> + >>> +void >>> +test_vbool1_then_vbool2(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool2_t v2 = *(vbool2_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool2_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool1_then_vbool4(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool4_t v2 = *(vbool4_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool4_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> 
+test_vbool1_then_vbool8(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool8_t v2 = *(vbool8_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool8_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool1_then_vbool16(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool16_t v2 = *(vbool16_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool16_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool1_then_vbool32(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool32_t v2 = *(vbool32_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool32_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool1_then_vbool64(int8_t * restrict in, int8_t * restrict out) { >>> + vbool1_t v1 = *(vbool1_t*)in; >>> + vbool64_t v2 = *(vbool64_t*)in; >>> + >>> + *(vbool1_t*)(out + 100) = v1; >>> + *(vbool64_t*)(out + 200) = v2; >>> +} >>> + >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 6 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 18 } } */ >>> diff --git 
a/gcc/testsuite/gcc.target/riscv/pr108185-2.c >>> b/gcc/testsuite/gcc.target/riscv/pr108185-2.c >>> new file mode 100644 >>> index 00000000000..dcc7a644a88 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-2.c >>> @@ -0,0 +1,68 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */ >>> + >>> +#include "riscv_vector.h" >>> + >>> +void >>> +test_vbool2_then_vbool1(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool1_t v2 = *(vbool1_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool1_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool2_then_vbool4(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool4_t v2 = *(vbool4_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool4_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool2_then_vbool8(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool8_t v2 = *(vbool8_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool8_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool2_then_vbool16(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool16_t v2 = *(vbool16_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool16_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool2_then_vbool32(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool32_t v2 = *(vbool32_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool32_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool2_then_vbool64(int8_t * restrict in, int8_t * restrict out) { >>> + vbool2_t v1 = *(vbool2_t*)in; >>> + vbool64_t v2 = *(vbool64_t*)in; >>> + >>> + *(vbool2_t*)(out + 100) = v1; >>> + *(vbool64_t*)(out + 200) = v2; >>> +} >>> + >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 6 } } */ >>> 
+/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 17 } } */ >>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-3.c >>> b/gcc/testsuite/gcc.target/riscv/pr108185-3.c >>> new file mode 100644 >>> index 00000000000..3af0513e006 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-3.c >>> @@ -0,0 +1,68 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */ >>> + >>> +#include "riscv_vector.h" >>> + >>> +void >>> +test_vbool4_then_vbool1(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool1_t v2 = *(vbool1_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool1_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool4_then_vbool2(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool2_t v2 = *(vbool2_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool2_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool4_then_vbool8(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool8_t v2 = *(vbool8_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool8_t*)(out + 200) = v2; >>> +} >>> + >>> 
+void >>> +test_vbool4_then_vbool16(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool16_t v2 = *(vbool16_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool16_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool4_then_vbool32(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool32_t v2 = *(vbool32_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool32_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool4_then_vbool64(int8_t * restrict in, int8_t * restrict out) { >>> + vbool4_t v1 = *(vbool4_t*)in; >>> + vbool64_t v2 = *(vbool64_t*)in; >>> + >>> + *(vbool4_t*)(out + 100) = v1; >>> + *(vbool64_t*)(out + 200) = v2; >>> +} >>> + >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 6 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 16 } } */ >>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-4.c >>> b/gcc/testsuite/gcc.target/riscv/pr108185-4.c >>> new file mode 100644 >>> index 00000000000..ea3c360d756 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-4.c >>> @@ -0,0 +1,68 @@ >>> +/* { 
dg-do compile } */ >>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */ >>> + >>> +#include "riscv_vector.h" >>> + >>> +void >>> +test_vbool8_then_vbool1(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool1_t v2 = *(vbool1_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool1_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool8_then_vbool2(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool2_t v2 = *(vbool2_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool2_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool8_then_vbool4(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool4_t v2 = *(vbool4_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool4_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool16_t v2 = *(vbool16_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool16_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool8_then_vbool32(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool32_t v2 = *(vbool32_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool32_t*)(out + 200) = v2; >>> +} >>> + >>> +void >>> +test_vbool8_then_vbool64(int8_t * restrict in, int8_t * restrict out) { >>> + vbool8_t v1 = *(vbool8_t*)in; >>> + vbool64_t v2 = *(vbool64_t*)in; >>> + >>> + *(vbool8_t*)(out + 100) = v1; >>> + *(vbool64_t*)(out + 200) = v2; >>> +} >>> + >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 6 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { scan-assembler-times >>> {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */ >>> +/* { dg-final { 
scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-5.c b/gcc/testsuite/gcc.target/riscv/pr108185-5.c
>>> new file mode 100644
>>> index 00000000000..9fc659d2402
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-5.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool16_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-6.c b/gcc/testsuite/gcc.target/riscv/pr108185-6.c
>>> new file mode 100644
>>> index 00000000000..98275e5267d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-6.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool32_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 13 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-7.c b/gcc/testsuite/gcc.target/riscv/pr108185-7.c
>>> new file mode 100644
>>> index 00000000000..8f6f0b11f09
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-7.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool64_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-8.c b/gcc/testsuite/gcc.target/riscv/pr108185-8.c
>>> new file mode 100644
>>> index 00000000000..d96959dd064
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-8.c
>>> @@ -0,0 +1,77 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool1_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool1_t v1 = *(vbool1_t*)in;
>>> +  vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +  *(vbool1_t*)(out + 100) = v1;
>>> +  *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool2_t v1 = *(vbool2_t*)in;
>>> +  vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +  *(vbool2_t*)(out + 100) = v1;
>>> +  *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool4_t v1 = *(vbool4_t*)in;
>>> +  vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +  *(vbool4_t*)(out + 100) = v1;
>>> +  *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool8_t v1 = *(vbool8_t*)in;
>>> +  vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +  *(vbool8_t*)(out + 100) = v1;
>>> +  *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool16_t v1 = *(vbool16_t*)in;
>>> +  vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +  *(vbool16_t*)(out + 100) = v1;
>>> +  *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool32_t v1 = *(vbool32_t*)in;
>>> +  vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +  *(vbool32_t*)(out + 100) = v1;
>>> +  *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +  vbool64_t v1 = *(vbool64_t*)in;
>>> +  vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +  *(vbool64_t*)(out + 100) = v1;
>>> +  *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 7 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */