https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113112

--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <pa...@gcc.gnu.org>:

https://gcc.gnu.org/g:290230034092898981488d0716ddae43bd36c09f

commit r14-6810-g290230034092898981488d0716ddae43bd36c09f
Author: Juzhe-Zhong <juzhe.zh...@rivai.ai>
Date:   Sat Dec 23 07:07:42 2023 +0800

    RISC-V: Make PHI initial value occupy live V_REG in dynamic LMUL cost model analysis

    Consider the following case:

    foo:
            ble     a0,zero,.L11
            lui     a2,%hi(.LANCHOR0)
            addi    sp,sp,-128
            addi    a2,a2,%lo(.LANCHOR0)
            mv      a1,a0
            vsetvli a6,zero,e32,m8,ta,ma
            vid.v   v8
            vs8r.v  v8,0(sp)                     ---> spill
    .L3:
            vl8re32.v       v16,0(sp)            ---> reload
            vsetvli a4,a1,e8,m2,ta,ma
            li      a3,0
            vsetvli a5,zero,e32,m8,ta,ma
            vmv8r.v v0,v16
            vmv.v.x v8,a4
            vmv.v.i v24,0
            vadd.vv v8,v16,v8
            vmv8r.v v16,v24
            vs8r.v  v8,0(sp)                    ---> spill
    .L4:
            addiw   a3,a3,1
            vadd.vv v8,v0,v16
            vadd.vi v16,v16,1
            vadd.vv v24,v24,v8
            bne     a0,a3,.L4
            vsetvli zero,a4,e32,m8,ta,ma
            sub     a1,a1,a4
            vse32.v v24,0(a2)
            slli    a4,a4,2
            add     a2,a2,a4
            bne     a1,zero,.L3
            li      a0,0
            addi    sp,sp,128
            jr      ra
    .L11:
            li      a0,0
            ret

    This picks an unexpected LMUL = 8, which leads to the spills and reload marked above.

    The root cause is that we did not include the PHI initial value in the
    dynamic LMUL calculation:

      # j_17 = PHI <j_11(9), 0(5)>
        ---> # vect_vec_iv_.8_24 = PHI <_25(9),
               { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }(5)>

    We did not count the initial value
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
    as consuming vector registers, even though a vector register group is
    allocated for it.

    This patch adds the missing count. With the patch applied, we pick the
    ideal LMUL (LMUL = M4):
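
    The effect of the missing count can be illustrated with a small
    sketch. It is not the logic in riscv-vector-costs.cc; it only assumes
    that the model picks the largest LMUL for which (live register groups
    x LMUL) fits in the 32 RVV vector registers. The function name
    pick_lmul and the live-group counts (4 without the PHI initial value,
    5 with it) are assumptions inferred from the assembly listings, not
    values taken from the patch.

```python
# Illustrative sketch only; not the actual GCC cost model.
NUM_VECTOR_REGS = 32  # RVV provides v0..v31


def pick_lmul(live_groups, candidates=(8, 4, 2, 1)):
    """Return the largest candidate LMUL whose register demand fits.

    Each live value occupies one register group of `lmul` registers,
    so the demand is live_groups * lmul.
    """
    for lmul in candidates:
        if live_groups * lmul <= NUM_VECTOR_REGS:
            return lmul
    # Nothing fits: fall back to the smallest candidate.
    return candidates[-1]


# Assumed counts, inferred from the listings rather than from the patch.
live_without_phi_init = 4                      # PHI initial value not counted
live_with_phi_init = live_without_phi_init + 1  # PHI initial value counted

print(pick_lmul(live_without_phi_init))  # 8: 4 * 8 = 32 appears to fit
print(pick_lmul(live_with_phi_init))     # 4: 5 * 8 = 40 does not fit
```

    Under these assumptions, leaving out one register group is enough to
    make LMUL = 8 look affordable even though the allocator then has to
    spill.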

    foo:
            ble     a0,zero,.L9
            lui     a4,%hi(.LANCHOR0)
            addi    a4,a4,%lo(.LANCHOR0)
            mv      a2,a0
            vsetivli        zero,16,e32,m4,ta,ma
            vid.v   v20
    .L3:
            vsetvli a3,a2,e8,m1,ta,ma
            li      a5,0
            vsetivli        zero,16,e32,m4,ta,ma
            vmv4r.v v16,v20
            vmv.v.i v12,0
            vmv.v.x v4,a3
            vmv4r.v v8,v12
            vadd.vv v20,v20,v4
    .L4:
            addiw   a5,a5,1
            vmv4r.v v4,v8
            vadd.vi v8,v8,1
            vadd.vv v4,v16,v4
            vadd.vv v12,v12,v4
            bne     a0,a5,.L4
            slli    a5,a3,2
            vsetvli zero,a3,e32,m4,ta,ma
            sub     a2,a2,a3
            vse32.v v12,0(a4)
            add     a4,a4,a5
            bne     a2,zero,.L3
    .L9:
            li      a0,0
            ret

    Tested with --with-arch=gcv; no regressions.

            PR target/113112

    gcc/ChangeLog:

            * config/riscv/riscv-vector-costs.cc (max_number_of_live_regs):
            Refine dump information.
            (preferred_new_lmul_p): Make PHI initial value into live regs
            calculation.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/costmodel/riscv/rvv/pr113112-1.c: New test.
