On Wed, 12 Feb 2025 16:47:07 PST (-0800), jeffreya...@gmail.com wrote:


On 2/12/25 4:27 PM, Edwin Lu wrote:
The instruction scheduler appears to be speculatively hoisting vsetvl
insns outside of their basic block without checking for data
dependencies. This resulted in a situation where the following occurs

         vsetvli a5,a1,e32,m1,tu,ma
         vle32.v v2,0(a0)
         sub     a1,a1,a5 <-- a1 potentially set to 0
         sh2add  a0,a5,a0
         vfmacc.vv       v1,v2,v2
         vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
         beq     a1,zero,.L12 <-- check if avl is 0

This patch would essentially delay the vsetvl update to after the branch
to prevent unnecessarily updating the vinfo at the end of a basic block.

        PR/117974

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_sched_can_speculate_insn):
        (TARGET_SCHED_CAN_SPECULATE_INSN): Implement.
Correct me if I'm wrong, there's not anything inherently wrong with the
speculation from a correctness standpoint.  This is "just" a performance
issue, right?

Hopefully, we didn't go into this looking for functional bugs.

We were seeing some odd behavior from the scheduler: it only moves the first vsetvli before the branch, the second crosses the branch while merging. That tripped up a "maybe there's a functional bug lurking here", but thinking about it again it seems more likely we're just missing an opportunity to schedule that's getting made irrelevant by the vsetvli elimination pass.

And from a performance standpoint speculation of the vsetvl could vary
pretty wildly based on uarch characteristics.   I can easily see cases
where it it wildly bad, wildly good and don't really care.

Point being it seems like it should be controlled by a uarch setting
rather than always avoiding or always enabling.

Ya, very much seems like a uarch thing.

We kind of just threw this one together as we're still experimenting with this. The goal was to avoid the VL=0 cases, but I think that's even just sort of a side effect of avoiding speculative scheduling here. So I think we need to go benchmark this before we can really get a feel for what's going on, as it might not be direct enough to catch the interesting cases.

Other thoughts?

The docs seem to hint TARGET_SCHED_CAN_SPECULATE_INSN is meant for stuff we can't/don't model in the pipeline, but I have no idea how to model the VL=0 case there.

Reply via email to