On Wed, Aug 09, 2017 at 05:15:33PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2017 at 05:06:03PM +0200, Peter Zijlstra wrote:
> > Now, ARM64 for instance plays funny games, it does something along the
> > lines of:
> >
> > cmpxchg(ptr, old, new)
> > {
> >         do {
> >                 r = LL(ptr);
> >                 if (r != old)
> >                         return r; /* no barriers */
> >                 r = new
> >         } while (SC_release(ptr, r));
> >         smp_mb();
> >         return r;
> > }
> >
> > Thereby ordering things relative to the store on ptr, but the load can
> > very much escape. The thinking is that if success, we must observe the
> > latest value of ptr, but even in that case the load is not ordered and
> > could happen before.
> >
> > However, since we're guaranteed to observe the latest value of ptr (on
> > success) it doesn't matter if we reordered the load, there is no newer
> > value possible.
> >
> > So heaps of tricky, but correct afaict. Will?
>
> And could not PPC do something similar:
>
> cmpxchg(ptr, old, new)
> {
>         lwsync();
>         dp {
>                 r = LL(ptr);
>                 if (r != old)
>                         return;
>                 r = new;
>         } while (SC(ptr, r));
>         sync();
>         return r;
> }
>
> ?
>
> the lwsync would make it store-release on SC with similar reasoning as
> above.
>
> And lwsync allows 'stores reordered after loads', which allows the prior
> smp_store_release() to leak past.
>
> Or is the reason this doesn't work on PPC that its RCpc?

Here is an example of why PPC needs a sync() before the cmpxchg():

        https://marc.info/?l=linux-kernel&m=144485396224519&w=2

and Paul McKenney's detailed explanation of why this could happen:

        https://marc.info/?l=linux-kernel&m=144485909826241&w=2

(Somehow, I feel like he was answering a question similar to the one you
ask here ;-))

And I think aarch64 doesn't have a problem here because it is "(other)
multi-copy atomic". Will?

Regards,
Boqun