On 02.07.2025 16:41, Andrew Cooper wrote:
> With the recent simplifications, it becomes obvious that smp_mb() isn't the
> right barrier; all we need is a compiler barrier.
>
> Include this in monitor() itself, along with an explantion.

Ah, here we go. As per my comment on patch 4, would this perhaps better move
ahead (which however would require a bit of an adjustment to the description)?
(Nit: explanation)

> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -66,8 +66,12 @@ static always_inline void monitor(
>      alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR,
>                        [addr] "a" (addr));
>
> +    /*
> +     * The memory clobber is a compiler barrier. Subseqeunt reads from the

Nit: Subsequent

> +     * monitored cacheline must not be hoisted over MONITOR.
> +     */
>      asm volatile ( "monitor"
> -                   :: "a" (addr), "c" (ecx), "d" (edx) );
> +                   :: "a" (addr), "c" (ecx), "d" (edx) : "memory" );
>  }

That's heavier than we need, though. Can't we simply have a fake output
"+m" (irq_stat[cpu])? Downside being that the compiler may then set up
addressing of that operand, when the operand isn't really referenced. (As
long as __softirq_pending is the first field there, there may not be any
extra overhead, though, as %rax then would also address the unused operand.)

Yet then, is it really only reads from that cacheline that are of concern?
Isn't it - strictly speaking - also necessary that any (hypothetical) reads
done by the NOW() at the end of the function have to occur only afterwards
(and independent of there being a LOCK-ed access in cpumask_clear_cpu())?

Jan
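
For illustration only, the two constraint styles discussed above could be
sketched as follows. This is a standalone sketch, not the Xen code: the asm
template is left empty (MONITOR cannot be executed outside ring 0), and
pending, arm_full(), arm_narrow() and the poll_*() helpers are hypothetical
names standing in for the softirq-pending word and monitor().

```c
/* Hypothetical stand-in for the monitored word (not Xen's irq_stat[]). */
static unsigned int pending;

/*
 * Option 1: "memory" clobber. The compiler must assume the asm may read or
 * write any memory, so no memory access is moved across it or cached in a
 * register over it.
 */
static inline void arm_full(const void *addr)
{
    __asm__ __volatile__ ( "" :: "r" (addr) : "memory" );
}

/*
 * Option 2: fake "+m" operand. Only *word is declared as read and modified,
 * so only accesses to that object are ordered against the asm. Accesses to
 * unrelated memory may still be moved across it.
 */
static inline void arm_narrow(unsigned int *word)
{
    __asm__ __volatile__ ( "" : "+m" (*word) : "r" (word) );
}

/* The read of pending after arming must be reissued in both variants. */
static unsigned int poll_full(void)
{
    pending = 1;
    arm_full(&pending);
    return pending;
}

static unsigned int poll_narrow(void)
{
    pending = 2;
    arm_narrow(&pending);
    return pending;
}
```

With the narrow variant, a read of some other object (such as whatever a
time source reads) is not ordered by the asm, which is the concern raised in
the last paragraph above.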