On Fri, 28 Feb 2025 12:48:36 +0100, "Robin Dapp" wrote:
> > Okay, let me explain the background of my previous patch.
> >
> > Prior to applying my patch, for the test case bug-10.c (a reduced example
> > of a larger program with incorrect runtime results), the vsetvli sequence
> > compiled with --param=vsetvl-strategy=simple was as follows:
> > 1. vsetvli zero,a4,e16,m4,ta,ma + vsetvli zero,a4,e32,m8,ta,ma +
> >    vsetvli zero,a4,e8,m2,ta,ma
> >
> > The vsetvli sequence compiled with --param=vsetvl-strategy=optim was as
> > follows:
> > 2. vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
> >
> > Although vl remains unchanged, the SEW/LMUL ratio in sequence 2 changes, 
> > leading to undefined behavior.
> 
> The only difference I see with your patch vs without is
> 
>     <       vsetvli zero,zero,e8,m2,ta,ma
>     ---
>     >       vsetvli zero,a3,e8,m2,ta,ma
> 
> and we ensure the former doesn't occur in the test.
> 
> But that difference doesn't matter because the ratio is the same before and
> after.  That's why I'm asking.  bug-10.c as is doesn't test anything
> reasonable IMHO.  Right, the ratio (or rather the associated LMUL) was wrong
> but the current test doesn't make sure it isn't.  Can you share the
> non-reduced (or less reduced) case?

Hi Robin,

I apologize for the delayed response. I spent quite a bit of time trying to
reproduce the case, and after so much time it wasn't easy to refine the test.
Fortunately, you can see the results here:

https://godbolt.org/z/Mc8veW7oT

GCC 14.2.0 should reproduce the issue. If all goes as expected, running the
program ends with "Segmentation fault (core dumped)". Disassembling the binary
shows "vsetvli zero,zero,e32,m4,ta,ma", which is where the problem lies, just
as I mentioned previously.

Best regards,
Jin Ma

> > /* { dg-do compile { target { rv64 } } } */
> > /* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3" } */
> >
> > #include <riscv_vector.h>
> >
> > _Float16 a (uint64_t);
> > int8_t b () {
> >   int c = 100;
> >   double *d;
> >   _Float16 *e;
> >   for (size_t f;; c -= f)
> >     {
> >       f = c;
> >       __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 2, f);
> >       vfloat16mf4_t g;
> >       a (1);
> >       g = __riscv_vfmv_s_f_f16mf4 (2, f);
> >       vfloat64m1_t i = __riscv_vfmv_s_f_f64m1 (30491, f);
> >       vuint16mf4_t j;
> >       __riscv_vsoxei16_v_f16mf4 (e, j, g, f);
> >       vuint8mf8_t k = __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 3, f);
> >       __riscv_vsoxei8_v_f64m1 (d, k, i, f);
> >     }
> > }
> >
> > /* { dg-final { scan-assembler-not "e64,mf4" } } */
> 
> That works, thanks.
> 
> -- 
> Regards
>  Robin
