On Mon, Sep 14, 2015 at 01:35:20PM +0200, Peter Zijlstra wrote:
>
> Sorry for being tardy, I had a wee spell of feeling horrible and then I
> procrastinated longer than I should have.
>
> On Fri, Sep 11, 2015 at 01:45:07PM +0100, Will Deacon wrote:
>
> > Peter, any thoughts? I'm not au fait with the x86 memory model, but what
> > Paul's saying is worrying.
>
> Right, so Paul is right -- and I completely forgot (I used to know about
> that).
>
> So all the TSO archs (SPARC-TSO, x86 (!OOSTORE) and s390) can do
> smp_load_acquire()/smp_store_release() with just barrier(), and while:
>
>         smp_store_release(&x);
>         smp_load_acquire(&x);
>
> will provide full order by means of the address dependency,
>
>         smp_store_release(&x);
>         smp_load_acquire(&y);
>
> will not. Because the one reorder TSO allows is exactly that one.
>
> > Peter -- if the above reordering can happen on x86, then moving away
> > from RCpc is going to be less popular than I hoped...
>
> Sadly yes.. We could of course try and split LOCK from ACQUIRE again,
> but I'm not sure that's going to help anything except confusion.

This of course also means we need something like:

        smp_mb__release_acquire()

which cannot be a no-op for TSO archs. And it might even mean it needs
to be the same as smp_mb__unlock_lock(), but I need to think more on
this.

The scenario is:

        CPU0                            CPU1

                                        unlock(x)
                                          smp_store_release(&x->lock, 0);

        unlock(y)
          smp_store_release(&next->lock, 1); /* next == &y */

                                        lock(y)
                                          while (!smp_load_acquire(&y->lock))
                                                cpu_relax();

Where the lock does _NOT_ issue a store to acquire the lock at all. Now
I don't think any of our current primitives manage this, so we should be
good, but it might just be possible.

And at the same time, having both:

        smp_mb__release_acquire()
        smp_mb__unlock_lock()

is quite horrible, for it clearly shows a LOCK isn't quite the same as
ACQUIRE :/
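
For anyone wanting to poke at the store->load reorder from userspace,
here is a rough C11/pthreads sketch of the classic store-buffering test.
It is only an analogue, not kernel code: a release store and an acquire
load from <stdatomic.h> stand in for smp_store_release() and
smp_load_acquire(), a seq_cst fence stands in for whatever
smp_mb__release_acquire() would end up being, and struct sb_arg,
sb_thread() and sb_count_both_zero() are names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace analogue; none of these names exist in the kernel. */
struct sb_arg {
	int id;		/* 0 or 1: which side of the test this thread runs */
	int fence;	/* non-zero: put a full fence between store and load */
};

static atomic_int x, y, start;
static int result[2];	/* value each thread loaded; read only after join */

static void *sb_thread(void *p)
{
	struct sb_arg *a = p;
	atomic_int *mine  = a->id ? &y : &x;
	atomic_int *other = a->id ? &x : &y;

	/* Spin until the parent releases both threads. */
	while (!atomic_load_explicit(&start, memory_order_acquire))
		;

	/* Release store to one variable ... */
	atomic_store_explicit(mine, 1, memory_order_release);

	/* ... optionally a full fence (stand-in for the proposed barrier) ... */
	if (a->fence)
		atomic_thread_fence(memory_order_seq_cst);

	/* ... then an acquire load from a different variable. */
	result[a->id] = atomic_load_explicit(other, memory_order_acquire);
	return NULL;
}

/* Run the test iters times; count runs in which both loads returned 0. */
long sb_count_both_zero(long iters, int fence)
{
	long hits = 0;

	for (long i = 0; i < iters; i++) {
		pthread_t t[2];
		struct sb_arg a[2] = { { 0, fence }, { 1, fence } };

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		atomic_store(&start, 0);

		for (int k = 0; k < 2; k++) {
			int rc = pthread_create(&t[k], NULL, sb_thread, &a[k]);
			assert(rc == 0);
			(void)rc;
		}

		atomic_store_explicit(&start, 1, memory_order_release);
		pthread_join(t[0], NULL);
		pthread_join(t[1], NULL);

		if (result[0] == 0 && result[1] == 0)
			hits++;
	}
	return hits;
}
```

With fence != 0 the C11 rules forbid both loads returning 0, so
sb_count_both_zero(n, 1) has to come back as 0. With fence == 0 a
non-zero count is permitted, and may show up on x86 given enough
iterations, but nothing guarantees you will see it in any given run.
Build with -pthread.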