https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092
--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
To demonstrate the idea, here is a simple example that should make it easier
to understand:

https://godbolt.org/z/Gxzjv48Ec

#include "riscv_vector.h"

void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n,
         int cond, int avl) {
  size_t vl = __riscv_vsetvl_e16mf2(avl >> 2);
  vint32m1_t a = __riscv_vle32_v_i32m1(in1, vl);
  vint32m1_t b = __riscv_vle32_v_i32m1_tu(a, in2, vl);
  vint32m1_t c = __riscv_vle32_v_i32m1_tu(b, in3, vl);
  __riscv_vse32_v_i32m1(out, c, vl);
}

LLVM:

        srai    a4, a6, 2
        vsetvli zero, a4, e16, mf2, ta, ma
        vle32.v v8, (a0)
        vsetvli zero, zero, e32, m1, tu, ma
        vle32.v v8, (a1)
        vle32.v v8, (a2)
        vse32.v v8, (a3)
        ret

LLVM is generating the naive code according to the intrinsics; as you said,
the first vsetvli keeps e16mf2 unchanged.

Here is the codegen of GCC:

GCC:

        srai    a6,a6,2
        vsetvli a6,a6,e32,m1,tu,ma
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        ret

Since e16 mf2 has the same SEW/LMUL ratio as e32 m1, we change the first
vsetvl from e16 mf2 to e32 m1 TU. Then we can eliminate the second vsetvl.

That is what we call "local fusion" here.

The case you mentioned is "global fusion", but they are the same thing:
fusing vsetvl according to the RVV ISA.

So, for the example you mentioned, GCC is generating correct code.
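
The "same ratio" argument can be checked with a little arithmetic. The sketch below is only an illustration and is not taken from GCC's vsetvl pass: the struct and function names (vtype_cfg, sew_lmul_ratio, vlmax, same_ratio) are hypothetical. It relies on the RVV definition VLMAX = LMUL * VLEN / SEW, so two configurations with equal SEW/LMUL have the same VLMAX for any VLEN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: LMUL is stored as a fraction num/den,
   e.g. mf2 = 1/2, m1 = 1/1, m4 = 4/1.  */
struct vtype_cfg {
  unsigned sew;      /* element width in bits: 8, 16, 32 or 64 */
  unsigned lmul_num;
  unsigned lmul_den;
};

/* SEW/LMUL ratio, computed as sew * den / num to stay in integers.  */
static unsigned sew_lmul_ratio(struct vtype_cfg c)
{
  assert(c.lmul_num != 0 && c.lmul_den != 0);
  return c.sew * c.lmul_den / c.lmul_num;
}

/* VLMAX = LMUL * VLEN / SEW = VLEN / (SEW/LMUL).  */
static unsigned vlmax(unsigned vlen_bits, struct vtype_cfg c)
{
  return vlen_bits / sew_lmul_ratio(c);
}

/* Two configurations with the same ratio have the same VLMAX.  */
static bool same_ratio(struct vtype_cfg a, struct vtype_cfg b)
{
  return sew_lmul_ratio(a) == sew_lmul_ratio(b);
}
```

For e16 mf2 the ratio is 16 / (1/2) = 32, and for e32 m1 it is 32 / 1 = 32. With VLEN = 128, for instance, both give VLMAX = 4, which is why rewriting the first vsetvli from e16 mf2 to e32 m1 should not change the resulting vl.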