On 03/07/2025 10:24 am, Jan Beulich wrote: > On 02.07.2025 16:41, Andrew Cooper wrote: >> With the recent simplifications, it becomes obvious that smp_mb() isn't the >> right barrier; all we need is a compiler barrier. >> >> Include this in monitor() itself, along with an explantion. > Ah, here we go. As per my comment on patch 4, would this perhaps better move > ahead (which however would require a bit of an adjustment to the description)?
As said, it's not necessary in practice. > >> + * monitored cacheline must not be hoisted over MONITOR. >> + */ >> asm volatile ( "monitor" >> - :: "a" (addr), "c" (ecx), "d" (edx) ); >> + :: "a" (addr), "c" (ecx), "d" (edx) : "memory" ); >> } > That's heavier than we need, though. Can't we simply have a fake output > "+m" (irq_stat[cpu])? No. That would be wrong for one of the two callers. Also we don't have cpu available. The correct value would be a round-down on addr and a cacheline-sized sized type, but we can't do that because of -Wvla. Nothing good can come of anything crossing the MONITOR, and ... > Downside being that the compiler may then set up > addressing of that operand, when the operand isn't really referenced. (As > long as __softirq_pending is the first field there, there may not be any > extra overhead, though, as %rax then would also address the unused operand.) ... nothing good is going to come from trying to get clever at optimising a constraint that doesn't actually improve code generation in the first place. > > Yet then, is it really only reads from that cacheline that are of concern? > Isn't it - strictly speaking - also necessary that any (hypothetical) reads > done by the NOW() at the end of the function have to occur only afterwards > (and independent of there being a LOCK-ed access in cpumask_clear_cpu())? The NOW() and cpumask_clear_cpu() are gone, and not going to be returning. I did put a compiler barrier in mwait() originally too, but dropped it because I couldn't reason about it easily. Nothing good can come of having any loads hoisted above MWAIT, but from a programmers point of view, it's indistinguishable from e.g. taking an SMI. If there's a correctness issue, it's not MWAIT's fault. ~Andrew