On Thu, Feb 01, 2018 at 01:32:30PM +0000, Will Deacon wrote:
> On Thu, Feb 01, 2018 at 02:29:09PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 01, 2018 at 09:27:50PM +0900, Stafford Horne wrote:
> > > I tried to clarify some of this in the spec v1.2 [0] which help formalize
> > > some of the techniques we used for the SMP implementation. Its probably
> > > not perfect, but I added a section "10. Multicore support" and tried to
> > > clarify some things in section 7 on Atomicity. But it seems I dont cover
> > > exactly what are are mentioning here. In general:
> > >
> > > 1 Secondary cores have memory snooping enabled meaning that any write
> > >   to a cached address will cause the cache line to be invalidated.
> > > 2 l.swa (store atomic word) implies a store buffer flush.
> >
> > What about l.lwa? Can that observe 'old' values, or rather, miss values
> > stuck in a remote store buffer?
> >
> > This will then cause the first l.swa to fail, which, per the above,
> > would then sync things up? Which means you get that one extra
> > merry-go-round.
>
> That's ok from a correctness perspective, though, as long as store buffers
> are guaranteed to drain.

Depends a bit on whether you can build control dependencies off of l.swa
succeeding or not, I think :-) Otherwise you get into that dodgy state you
suffer from, where bits can leak right through.

That is, I was thinking about what we need for smp_mb__before_atomic.

I could've gotten my brain in a twist of course, which isn't _that_ unusual.
I never seem to be able to quite remember the holes you have with ll/sc on
arm64 :-)
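
For reference, the "extra merry-go-round" can be sketched roughly as below. This is only an illustration under assumptions: it uses C11 atomics rather than OpenRISC assembly, with a relaxed load standing in for l.lwa and a weak compare-exchange standing in for l.swa. The function name fetch_add_llsc and the retries counter are made up for the example. It shows the retry only; it says nothing about store buffer draining or about whether a control dependency on the l.swa result orders later accesses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for an l.lwa/l.swa loop (not OpenRISC code).
 * The weak compare-exchange is allowed to fail, much like a failed
 * l.swa, and each failure sends us round the loop once more.
 */
static int fetch_add_llsc(atomic_int *v, int delta, int *retries)
{
	/* ~ l.lwa: may observe a stale value */
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/*
	 * ~ l.swa: stores only if *v still equals 'old'. On failure,
	 * 'old' is reloaded with the current value and we try again.
	 */
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + delta,
						      memory_order_relaxed,
						      memory_order_relaxed))
		(*retries)++;

	return old;	/* value seen by the attempt that succeeded */
}
```

A stale first load costs one failed store and one more trip round the loop, after which the update goes through with the fresh value. Everything here is relaxed on purpose, so any ordering against surrounding accesses would have to come from somewhere else, which is the smp_mb__before_atomic question.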