Re: [PATCH] RISC-V: Prevent speculative vsetvl insn scheduling

Palmer Dabbelt Thu, 13 Feb 2025 08:50:46 -0800

On Thu, 13 Feb 2025 07:38:13 PST (-0800), jeffreya...@gmail.com wrote:



On 2/13/25 8:19 AM, Robin Dapp wrote:

The vsetvl pass is LCM based.  So it's not allowed to add a vsetvl on a
path that didn't have a vsetvl before.  Consider this simple graph.

      0
     / \
    2-->3

If we have need for a vsetvl in bb2, but not bb0 or bb3, then the vsetvl
will land in bb4.  bb0 is not a valid insertion point for the vsetvl
pass because the path 0->3 doesn't strictly need a vsetvl.  That's
inherent in the LCM algorithm (anticipatable).
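
To make the anticipatability point concrete, here is a minimal sketch of the backward dataflow on the three-block graph above. It is not the GCC implementation: the `SUCCS` table, the `anticipatable_in` helper and the `uses` set are hypothetical names made up for this illustration, and the vsetvl demand is reduced to a single boolean per block.

```python
# Hypothetical CFG from the example above: edges 0->2, 0->3, 2->3.
SUCCS = {0: [2, 3], 2: [3], 3: []}

def anticipatable_in(succs, uses, kills=frozenset()):
    """Backward dataflow for one expression (here: "a vsetvl is needed").

    ANTOUT[b] = AND of ANTIN[s] over successors s (False for exit blocks)
    ANTIN[b]  = (b in uses) or (ANTOUT[b] and b not in kills)
    """
    antin = {b: True for b in succs}  # optimistic start, iterate to a fixpoint
    changed = True
    while changed:
        changed = False
        for b in succs:
            antout = bool(succs[b]) and all(antin[s] for s in succs[b])
            new = (b in uses) or (antout and b not in kills)
            if new != antin[b]:
                antin[b] = new
                changed = True
    return antin

# Only bb2 needs a vsetvl (assumption taken from the example above).
ANTIN = anticipatable_in(SUCCS, uses={2})
print(ANTIN)
```

Under these assumptions bb0 comes out as not anticipatable, because the path 0->3 never needs the vsetvl, so an LCM-style placement may not insert there.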


Yeah, I remember the same issue with the rounding-mode setter placement.

Yes.  For VXRM placement, under the right circumstances we pretend there
is a need for the VXRM state at the first instruction in the first BB.
That enables very aggressive hoisting by LCM in those limited cases.


Wouldn't that be fixable by requiring a dummy/wildcard/dontcare vsetvl in bb3
(or any other block that doesn't require one)?  Such a dummy vsetvl would be
fusible with every other vsetvl.  If there are dummy vsetvls remaining after
LCM just delete them?

Just thinking out loud, the devil will be in the details.

But in Vineet's case they want to avoid speculation as that can result
in a vl=0 case.  If we had a dummy fusible vsetvl in bb3, then that
would allow movement into bb0 which is undesirable.
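
A rough path-based sketch of why the dummy vsetvl changes the answer, again on the three-block graph from earlier in the thread. The helper names (`paths_from`, `hoistable_to`) and the `demands` sets are hypothetical, and the model assumes an acyclic CFG; it only illustrates the "every path must need it" rule, not what the vsetvl pass actually does.

```python
# Hypothetical CFG from the earlier example: edges 0->2, 0->3, 2->3.
SUCCS = {0: [2, 3], 2: [3], 3: []}

def paths_from(block, succs):
    """Yield every path from `block` to an exit block (acyclic CFG only)."""
    if not succs[block]:
        yield [block]
        return
    for s in succs[block]:
        for rest in paths_from(s, succs):
            yield [block] + rest

def hoistable_to(block, succs, demands):
    """True if every path leaving `block` reaches a block in `demands`,
    i.e. inserting at `block` would not be speculative on any path."""
    return all(any(b in demands for b in path)
               for path in paths_from(block, succs))

# Real demand only in bb2: the path 0->3 has no demand.
REAL_ONLY = hoistable_to(0, SUCCS, demands={2})
# Add a dummy/wildcard demand in bb3: now every path has one.
WITH_DUMMY = hoistable_to(0, SUCCS, demands={2, 3})
print(REAL_ONLY, WITH_DUMMY)
```

In this toy model the dummy demand in bb3 is exactly what makes bb0 look like a legal insertion point, which is the movement the patch is trying to prevent.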

Ya, I think we confused everyone because there's really two vsetvli/branch movement things we've been talking about and they're kind of the opposite.

There's the issue this patch works around, where we found some vsetvli instances that set VL=0 in unrolled loops. That makes some of our hardware people upset. Turns out the reduced test case has the branches to early-out of the unrolled loop when VL would be 0, so just banning vsetvli speculation fixes the issue. It's kind of an indirect way to solve a uarch-specific problem, so who knows if it'll be worth doing.

Then there's the vsetvli loop-invariant hoisting / vector tail generation thing we were talking about in the meeting this week. Having the vsetvli in the loop made a different subset of our hardware people upset. That's kind of the opposite optimization, though we'd want to avoid the VL=0 case.

They're both "Vineet's bug", the hardware people tend to call Vineet when they get upset ;)

WRT a question Palmer asked earlier in the thread.  I went back and
reviewed the code/docs around the hook Edwin is using.  My reading is a
bit different and that what Edwin is doing is perfectly fine.

Awesome, thanks. So I think if this is sane enough to run experiments we can at least try that out and see what happens.

Jeff
