On Tue, Jul 17, 2018 at 11:44 AM Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
>  (a) lwsync is a memory barrier for all the "easy" cases (ie
> load->store, load->load, and store->load).

That last one should have been "store->store", of course.

So 'lwsync' gives smp_rmb(), smp_wmb(), and smp_load_acquire()
semantics (which are the usual "no barrier needed at all" suspects for
things like x86).

What lwsync lacks is store->load ordering. So:

>  (b) lwsync is *not* a memory barrier for the store->load case.

BUT, this is where isync comes in:

>  (c) isync *is* (when in that *sequence*) a memory barrier for a
> store->load case (and has to be: loads inside a spinlocked region MUST
> NOT be done earlier than stores outside of it!).

which is why I think that a spinlock implementation that uses isync
would give us the semantics we want, without the use of the crazy
expensive 'sync' that Michael tested (and which apparently gets
horrible 10% scheduler performance regressions at least on some
powerpc CPU's).

          Linus
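
For reference, the ordering cases discussed above can be written down as
a small table in C. This is only a sketch for illustration: the enum and
function names are made up here and are not kernel identifiers, 'sync'
is assumed to be a full barrier (it is what powerpc uses for smp_mb()),
and isync is deliberately left out because its effect depends on the
instruction sequence it appears in.

```c
#include <assert.h>
#include <stdbool.h>

/* The two kinds of memory access a barrier can separate. */
enum access { LOAD, STORE };

/* Hypothetical names for this sketch; not kernel identifiers. */
enum barrier { BARRIER_LWSYNC, BARRIER_SYNC };

/*
 * Return true if barrier 'b' orders an earlier access of kind 'first'
 * against a later access of kind 'second'.
 */
static bool barrier_orders(enum barrier b, enum access first,
			   enum access second)
{
	switch (b) {
	case BARRIER_SYNC:
		/* Assumption: sync is a full barrier, all four cases. */
		return true;
	case BARRIER_LWSYNC:
		/* Everything except the store->load case. */
		return !(first == STORE && second == LOAD);
	}
	return false;
}

/* Check the table against the cases (a) and (b) from the mail. */
static void check_barrier_model(void)
{
	/* (a) the "easy" cases, with store->store as corrected */
	assert(barrier_orders(BARRIER_LWSYNC, LOAD, STORE));
	assert(barrier_orders(BARRIER_LWSYNC, LOAD, LOAD));
	assert(barrier_orders(BARRIER_LWSYNC, STORE, STORE));

	/* (b) lwsync does not order store->load */
	assert(!barrier_orders(BARRIER_LWSYNC, STORE, LOAD));

	/* sync, under the full-barrier assumption, does */
	assert(barrier_orders(BARRIER_SYNC, STORE, LOAD));
}
```

Calling check_barrier_model() from a test program exercises all of the
cases; the only entry that differs between the two barriers is
store->load, which is exactly the case the isync sequence has to cover
for a spinlock.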