On Sat, May 21, 2016 at 03:49:20PM +0200, Manfred Spraul wrote:
> >I'm tempted to put that trailing smp_rmb() in spin_unlock_wait() too;
> >because I suspect the netfilter code is broken without it.
> >
> >And it seems intuitive to assume that if we return from unlock_wait() we
> >can indeed observe the critical section we waited on.
> Then !spin_is_locked() and spin_unlock_wait() would be different with
> regards to memory barriers.
> Would that really help?

We could fix that I think; something horrible like:

static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
{
	int locked;

	smp_mb();
	locked = atomic_read(&lock->val) & _Q_LOCKED_MASK;
	smp_acquire__after_ctrl_dep();
	return locked;
}

Which if used in a conditional like:

	spin_lock(A);
	if (spin_is_locked(B)) {
		spin_unlock(A);
		spin_lock(B);
		...
	}

would still provide the ACQUIRE semantics required. The only difference
is that it would provide it to _both_ branches, which might be a little
more expensive.

> My old plan was to document the rules, and define a generic
> smp_acquire__after_spin_is_unlocked.
> https://lkml.org/lkml/2015/3/1/153

Yeah; I more or less forgot all that.

Now, I too think having the thing documented is good; _however_ I also
think having primitives that actually do what you assume them to is a
good thing.

spin_unlock_wait() not actually serializing against the spin_unlock()
is really surprising and subtle.