On Thu, Oct 10, 2019 at 02:13:47PM +0200, Manfred Spraul wrote: > Hi Peter, > > On 10/10/19 1:42 PM, Peter Zijlstra wrote: > > On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote: > > > Hi, > > > > > > Waiman Long noticed that the memory barriers in sem_lock() are not really > > > documented, and while adding documentation, I ended up with one case where > > > I'm not certain about the wake_q code: > > > > > > Questions: > > > - Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an > > > ordering guarantee? > > Yep. Either the atomic instruction implies ordering (eg. x86 LOCK > > prefix) or it doesn't (most RISC LL/SC), if it does, > > smp_mb__{before,after}_atomic() are a NO-OP and the ordering is > > unconditinoal, if it does not, then smp_mb__{before,after}_atomic() are > > unconditional barriers. > > And _relaxed() differs from "normal" cmpxchg only for LL/SC architectures, > correct?
Indeed. > Therefore smp_mb__{before,after}_atomic() may be combined with > cmpxchg_relaxed, to form a full memory barrier, on all archs. Just so. > > > - Is it ok that wake_up_q just writes wake_q->next, shouldn't > > > smp_store_acquire() be used? I.e.: guarantee that wake_up_process() > > > happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed > > > provides any ordering. > > There is no such thing as store_acquire, it is either load_acquire or > > store_release. But just like how we can write load-aquire like > > load+smp_mb(), so too I suppose we could write store-acquire like > > store+smp_mb(), and that is exactly what is there (through the implied > > barrier of wake_up_process()). > > Thanks for confirming my assumption: > The code is correct, due to the implied barrier inside wake_up_process(). It has a comment there, trying to state this. task->wake_q.next = NULL; /* * wake_up_process() executes a full barrier, which pairs with * the queueing in wake_q_add() so as not to miss wakeups. */ wake_up_process(task);