Re: [PATCH v2] Target-independent store forwarding avoidance.

Jeff Law Wed, 12 Jun 2024 07:19:12 -0700



On 6/12/24 12:47 AM, Richard Biener wrote:


One of the points I wanted to make is that sched1 can make quite a
difference as to the relative distance of the store and load and
we have the instruction window the pass considers when scanning
(possibly driven by target uarch details).  So doing the rewriting
before sched1 might not be ideal (but I don't know how much cleanup
work the pass leaves behind - there's nothing between sched1 and RA).

ACK. I guess I'm just skeptical about how much separation we can get in practice from scheduling.

As for cleanup opportunities, it likely comes down to how clean the initial codegen is for the bitfield insertion step.
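For readers following along, here is a minimal source-level sketch of the kind of sequence under discussion. It is not taken from the patch: the function names are hypothetical, the pass works on RTL rather than C, and the shift in the second function assumes a little-endian target. The first function does a narrow store followed by a wider overlapping load, which is the pattern that can defeat store forwarding. The second does the wide load first and then inserts the stored byte into the loaded value, which is roughly what the bitfield insertion step amounts to.

```c
#include <stdint.h>
#include <string.h>

/* Narrow store followed by a wider load that overlaps it.  On many
   uarchs the 8-byte load cannot be forwarded from the 1-byte store.  */
uint64_t store_then_wide_load(unsigned char *p, uint8_t b)
{
    uint64_t v;
    p[1] = b;                   /* 1-byte store */
    memcpy(&v, p, sizeof v);    /* 8-byte load covering that byte */
    return v;
}

/* Hand-written equivalent of the rewrite: load first, keep the store,
   then patch the loaded value with a mask/shift/or (a bitfield insert).
   ASSUMPTION: little-endian, so byte 1 of memory is bits 8..15.  */
uint64_t wide_load_then_insert(unsigned char *p, uint8_t b)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);    /* 8-byte load, no store in flight */
    p[1] = b;                   /* the store still has to happen */
    v &= ~((uint64_t)0xff << 8);        /* clear the target byte */
    v |= (uint64_t)b << 8;              /* insert the stored value */
    return v;
}
```

How much is left to clean up afterwards depends on how compact that mask/shift/or sequence ends up on a given target, for example whether it collapses into a single insert instruction.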


On the hardware side I always wondered whether a failed store-to-load
forward results in the load uop stalling (because the hardware actually
_did_ see the conflict with an in-flight store) or whether this gets
caught later as the hardware speculates a load from L1 (with the
wrong value) but has to roll back because of the conflict.  I would
imagine the latter is cheaper to implement but worse in case of
conflict.

I wouldn't be surprised to see both approaches being used and I suspect it really depends on how speculative your uarch is. At some point there's enough speculation going on that you can't detect the violation early enough and you have to implement a replay/rollback scheme.


jeff
