https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113112
--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <pa...@gcc.gnu.org>:

https://gcc.gnu.org/g:290230034092898981488d0716ddae43bd36c09f

commit r14-6810-g290230034092898981488d0716ddae43bd36c09f
Author: Juzhe-Zhong <juzhe.zh...@rivai.ai>
Date:   Sat Dec 23 07:07:42 2023 +0800

    RISC-V: Make PHI initial value occupy live V_REG in dynamic LMUL cost model analysis

    Consider the following case:

    foo:
            ble     a0,zero,.L11
            lui     a2,%hi(.LANCHOR0)
            addi    sp,sp,-128
            addi    a2,a2,%lo(.LANCHOR0)
            mv      a1,a0
            vsetvli a6,zero,e32,m8,ta,ma
            vid.v   v8
            vs8r.v  v8,0(sp)                     ---> spill
    .L3:
            vl8re32.v       v16,0(sp)            ---> reload
            vsetvli a4,a1,e8,m2,ta,ma
            li      a3,0
            vsetvli a5,zero,e32,m8,ta,ma
            vmv8r.v v0,v16
            vmv.v.x v8,a4
            vmv.v.i v24,0
            vadd.vv v8,v16,v8
            vmv8r.v v16,v24
            vs8r.v  v8,0(sp)                     ---> spill
    .L4:
            addiw   a3,a3,1
            vadd.vv v8,v0,v16
            vadd.vi v16,v16,1
            vadd.vv v24,v24,v8
            bne     a0,a3,.L4
            vsetvli zero,a4,e32,m8,ta,ma
            sub     a1,a1,a4
            vse32.v v24,0(a2)
            slli    a4,a4,2
            add     a2,a2,a4
            bne     a1,zero,.L3
            li      a0,0
            addi    sp,sp,128
            jr      ra
    .L11:
            li      a0,0
            ret

    This picks an unexpected LMUL = 8.

    The root cause is that we didn't include the PHI initial value in the
    dynamic LMUL calculation:

      # j_17 = PHI <j_11(9), 0(5)>
        ---> # vect_vec_iv_.8_24 = PHI <_25(9), { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }(5)>

    We didn't count
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
    as consuming vector registers, but a vector register group is allocated
    for it.

    This patch fixes this missing count.
    After this patch we pick the ideal LMUL (LMUL = M4):

    foo:
            ble     a0,zero,.L9
            lui     a4,%hi(.LANCHOR0)
            addi    a4,a4,%lo(.LANCHOR0)
            mv      a2,a0
            vsetivli        zero,16,e32,m4,ta,ma
            vid.v   v20
    .L3:
            vsetvli a3,a2,e8,m1,ta,ma
            li      a5,0
            vsetivli        zero,16,e32,m4,ta,ma
            vmv4r.v v16,v20
            vmv.v.i v12,0
            vmv.v.x v4,a3
            vmv4r.v v8,v12
            vadd.vv v20,v20,v4
    .L4:
            addiw   a5,a5,1
            vmv4r.v v4,v8
            vadd.vi v8,v8,1
            vadd.vv v4,v16,v4
            vadd.vv v12,v12,v4
            bne     a0,a5,.L4
            slli    a5,a3,2
            vsetvli zero,a3,e32,m4,ta,ma
            sub     a2,a2,a3
            vse32.v v12,0(a4)
            add     a4,a4,a5
            bne     a2,zero,.L3
    .L9:
            li      a0,0
            ret

    Tested with --with-arch=gcv, no regression.

            PR target/113112

    gcc/ChangeLog:

            * config/riscv/riscv-vector-costs.cc (max_number_of_live_regs): Refine
            dump information.
            (preferred_new_lmul_p): Make PHI initial value into live regs calculation.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/costmodel/riscv/rvv/pr113112-1.c: New test.
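
The effect of the missing count described in the commit message can be sketched with a toy register-pressure model. This is only an illustration, not the logic in riscv-vector-costs.cc: the function pick_lmul, the constant NUM_VECTOR_REGS, and the live-value counts (4 and 5) are assumptions made up for this sketch, and it assumes every live vector value occupies one register group of LMUL registers out of the 32 RVV vector registers.

```python
# Simplified, hypothetical model of a register-pressure check; this is not
# the code in riscv-vector-costs.cc.
NUM_VECTOR_REGS = 32  # RVV provides 32 architectural vector registers (v0-v31)


def pick_lmul(live_values: int) -> int:
    """Return the largest LMUL in {8, 4, 2, 1} whose register demand fits.

    Assumes every live vector value occupies one group of `lmul` registers.
    """
    for lmul in (8, 4, 2, 1):
        if live_values * lmul <= NUM_VECTOR_REGS:
            return lmul
    return 1  # fall back to the smallest LMUL modeled here


# Illustrative counts (assumed, not taken from a GCC dump): four values are
# live inside the loop, plus the PHI initial value { 0, ..., 0 }.
live_without_phi_init = 4
live_with_phi_init = live_without_phi_init + 1

print(pick_lmul(live_without_phi_init))  # 4 * 8 = 32 fits, so LMUL 8 is chosen
print(pick_lmul(live_with_phi_init))     # 5 * 8 = 40 does not fit; 5 * 4 = 20 does
```

Under these assumed counts, leaving out the PHI initial value makes LMUL = 8 look affordable even though a fifth register group is needed, which is consistent with the spill and reload in the first listing; counting it drops the choice to LMUL = 4.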