On 2/24/2025 4:34 PM, Jeff Law wrote:
On 2/24/25 5:07 PM, Edwin Lu wrote:
See [1] thread for original patch which spawned this one.
We are currently seeing the following code where we perform a vsetvl
before a branching instruction against the avl.
vsetvli a5,a1,e32,m1,tu,ma
vle32.v v2,0(a0)
sub a1,a1,a5 <-- a1 potentially set to 0
sh2add a0,a5,a0
vfmacc.vv v1,v2,v2
vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl
to 0
beq a1,zero,.L12 <-- check if avl is 0
Since we are branching off of the avl, we don't need to update vl until
after the branch is taken. Search the ready queue for vsetvls scheduled
before branching instructions that branch off of the same regno and
promote the branches to execute first. This can improve performancy by
potentially avoiding setting VL=0 which may be expensive on some
uarches.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html
PR/117974
gcc/ChangeLog:
* config/riscv/riscv.cc (vsetvl_avl_regno): New helper function.
(insn_increases_zeroness_p): Ditto.
(riscv_promote_ready): Ditto.
(riscv_sched_reorder): Implement hook.
(TARGET_SCHED_REORDER): Define Hook.
* config/riscv/riscv.opt: New flag.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.
So I preferred the earlier approach of disabling speculation of the
vsetvls, though I'm guessing you're looking at this approach because
that was insufficient?
I don't think that it was because it was insufficient, but that it might
be too constraining. I'm not exactly certain if there is the possibility
where speculatively issuing some vsetvls in some situations could
improve perf. I think with this patch, it targets the specific issue
we're facing directly which is updating vl=0 when it's unnecessary. I
don't know which patch would be better so I'm putting this out there as
a potential alternative.
Regardless, like the disabling of speculation I tend to think this
should be controlled by an entry in the tuning structure rather than
flags. Flags are fine for internal testing, but I don't see a
compelling need for a flag to control this in general.
Jeff
That makes sense to me. I completely forgot that we had talked about
putting it into the tuning structure when writing this. I'll update the
tunes respectively in a later version of the patches depending on which
one moves forward.
Edwin