On 2026/2/2 21:42, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 09:23:07PM +0800, Lance Yang wrote:
>>
>> Hmm... we need MB rather than RMB on the sync side. Is that correct?
>>
>> Walker:
>> [W]active_lockless_pt_walk_mm = mm -> MB -> [L]page-tables
>>
>> Sync:
>> [W]page-tables -> MB -> [L]active_lockless_pt_walk_mm
>>
>
> This can work -- but only if the walker and sync touch the same
> page-table address.
>
> Now, typically I would imagine they both share the p4d/pud address at
> the very least, right?

Thanks. I think I see the confusion ...

To be clear, the goal is not to make the walker see page-table writes
through the MB pairing, but to wait for any concurrent lockless
page-table walkers to finish.

The flow is:

1) Page tables are modified
2) TLB flush is done
3) Read active_lockless_pt_walk_mm (with MB to order page-table writes before
   this read) to find which CPUs are locklessly walking this mm
4) IPI those CPUs
5) The IPI forces those CPUs to synchronize, so once the IPI returns,
   any in-flight lockless page-table walk has either finished or will
   restart and see the new page tables

The synchronization relies on the IPI to ensure that those walkers have
finished before the sync side continues.
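
To make steps 3 and 4 concrete, here is a rough userspace sketch using
C11 atomics rather than the kernel primitives. It is only an
illustration: the fixed NR_CPUS, the stand-in struct mm, and the helper
names walk_begin(), walk_end() and sync_collect_targets() are all
hypothetical, and the per-CPU slot layout is an assumption. The actual
IPI (step 4) is not modelled; the function only returns the set of CPUs
that would be targeted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4               /* hypothetical fixed CPU count */

struct mm { int id; };          /* stand-in for struct mm_struct */

/* One slot per CPU: the mm that CPU is walking locklessly, or NULL. */
static _Atomic(struct mm *) active_lockless_pt_walk_mm[NR_CPUS];

/* Walker: [W]active_lockless_pt_walk_mm = mm -> MB -> [L]page-tables */
static void walk_begin(int cpu, struct mm *mm)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_store_explicit(&active_lockless_pt_walk_mm[cpu], mm,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);      /* MB */
	/* ... page-table loads would follow here ... */
}

/* Walker is done: clear the slot after all page-table loads. */
static void walk_end(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_store_explicit(&active_lockless_pt_walk_mm[cpu], NULL,
			      memory_order_release);
}

/*
 * Sync: [W]page-tables -> MB -> [L]active_lockless_pt_walk_mm
 * Called after the page-table writes and the TLB flush. Returns a
 * bitmask of CPUs that are walking @mm and would need an IPI.
 */
static unsigned int sync_collect_targets(struct mm *mm)
{
	unsigned int mask = 0;

	atomic_thread_fence(memory_order_seq_cst);      /* MB */
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (atomic_load_explicit(&active_lockless_pt_walk_mm[cpu],
					 memory_order_relaxed) == mm)
			mask |= 1u << cpu;
	}
	return mask;
}
```

With the full barrier on both sides, either the sync side sees the
walker's slot and IPIs that CPU, or the walker started late enough to
see the new page tables. CPUs walking a different mm are skipped.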

I would assume the TLB flush (step 2) should imply some barrier.

Does that clarify?
