On Wed, 12 Feb 2025 16:47:07 PST (-0800), jeffreya...@gmail.com wrote:
On 2/12/25 4:27 PM, Edwin Lu wrote:
The instruction scheduler appears to be speculatively hoisting vsetvl
insns outside of their basic block without checking for data
dependencies. This resulted in a situation where the following occurs
vsetvli a5,a1,e32,m1,tu,ma
vle32.v v2,0(a0)
sub a1,a1,a5 <-- a1 potentially set to 0
sh2add a0,a5,a0
vfmacc.vv v1,v2,v2
vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
beq a1,zero,.L12 <-- check if avl is 0
This patch would essentially delay the vsetvl update to after the branch
to prevent unnecessarily updating the vinfo at the end of a basic block.
PR/117974
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_sched_can_speculate_insn):
(TARGET_SCHED_CAN_SPECULATE_INSN): Implement.
Correct me if I'm wrong, there's not anything inherently wrong with the
speculation from a correctness standpoint. This is "just" a performance
issue, right?
Hopefully, we didn't go into this looking for functional bugs.
We were seeing some odd behavior from the scheduler: it only moves the
first vsetvli before the branch, the second crosses the branch while
merging. That tripped up a "maybe there's a functional bug lurking
here", but thinking about it again it seems more likely we're just
missing an opportunity to schedule that's getting made irrelevant by the
vsetvli elimination pass.
And from a performance standpoint speculation of the vsetvl could vary
pretty wildly based on uarch characteristics. I can easily see cases
where it it wildly bad, wildly good and don't really care.
Point being it seems like it should be controlled by a uarch setting
rather than always avoiding or always enabling.
Ya, very much seems like a uarch thing.
We kind of just threw this one together as we're still experimenting
with this. The goal was to avoid the VL=0 cases, but I think that's
even just sort of a side effect of avoiding speculative scheduling here.
So I think we need to go benchmark this before we can really get a feel
for what's going on, as it might not be direct enough to catch the
interesting cases.
Other thoughts?
The docs seem to hint TARGET_SCHED_CAN_SPECULATE_INSN is meant for stuff
we can't/don't model in the pipeline, but I have no idea how to model
the VL=0 case there.