On Wed, Sep 26, 2018 at 01:01:19PM +0200, Peter Zijlstra wrote:
> While working my way through the code again; I felt the comments could
> use help.
> 
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> ---
>  kernel/locking/qspinlock.c | 40 ++++++++++++++++++++++++++++------------
>  1 file changed, 28 insertions(+), 12 deletions(-)
> 
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -326,16 +326,23 @@ void queued_spin_lock_slowpath(struct qs
>  	/*
>  	 * trylock || pending
>  	 *
> -	 * 0,0,0 -> 0,0,1 ; trylock
> -	 * 0,0,1 -> 0,1,1 ; pending
> +	 * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock
>  	 */
>  	val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
> +
>  	/*
> -	 * If we observe any contention; undo and queue.
> +	 * If we observe contention, there was a concurrent lock.

Nit: I think "concurrent lock" is confusing here, because that implies
to me that the lock was actually taken behind our back, which isn't
necessarily the case. How about "there is a concurrent locker"?

> +	 *
> +	 * Undo and queue; our setting of PENDING might have made the
> +	 * n,0,0 -> 0,0,0 transition fail and it will now be waiting
> +	 * on @next to become !NULL.
>  	 */

Hmm, but it could also fail another concurrent set of PENDING (and the
lock could just be held the entire time).

>  	if (unlikely(val & ~_Q_LOCKED_MASK)) {
> +
> +		/* Undo PENDING if we set it. */
>  		if (!(val & _Q_PENDING_MASK))
>  			clear_pending(lock);
> +
>  		goto queue;
>  	}
> 
> @@ -466,7 +473,7 @@ void queued_spin_lock_slowpath(struct qs
>  	 * claim the lock:
>  	 *
>  	 * n,0,0 -> 0,0,1 : lock, uncontended
> -	 * *,*,0 -> *,*,1 : lock, contended
> +	 * *,0,0 -> *,0,1 : lock, contended

Pending can be set behind our back in the contended case, in which case
we take the lock with a single byte store and don't clear pending. You
mention this in the updated comment below, but I think we should leave
this comment alone.

Will

>  	 *
>  	 * If the queue head is the only one in the queue (lock value == tail)
>  	 * and nobody is pending, clear the tail code and grab the lock.
> @@ -474,16 +481,25 @@ void queued_spin_lock_slowpath(struct qs
>  	 */
> 
>  	/*
> -	 * In the PV case we might already have _Q_LOCKED_VAL set.
> +	 * In the PV case we might already have _Q_LOCKED_VAL set, because
> +	 * of lock stealing; therefore we must also allow:
>  	 *
> -	 * The atomic_cond_read_acquire() call above has provided the
> -	 * necessary acquire semantics required for locking.
> -	 */
> -	if (((val & _Q_TAIL_MASK) == tail) &&
> -	    atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
> -		goto release; /* No contention */
> +	 * n,0,1 -> 0,0,1
> +	 *
> +	 * Note: at this point: (val & _Q_PENDING_MASK) == 0, because of the
> +	 * above wait condition, therefore any concurrent setting of
> +	 * PENDING will make the uncontended transition fail.
> +	 */
> +	if ((val & _Q_TAIL_MASK) == tail) {
> +		if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
> +			goto release; /* No contention */
> +	}
> 
> -	/* Either somebody is queued behind us or _Q_PENDING_VAL is set */
> +	/*
> +	 * Either somebody is queued behind us or _Q_PENDING_VAL got set
> +	 * which will then detect the remaining tail and queue behind us
> +	 * ensuring we'll see a @next.
> +	 */
>  	set_locked(lock);
> 
>  	/*
> 
> 
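In case it helps to see the transitions being discussed spelled out
outside the kernel, here is a toy user-space sketch of just the
trylock || pending path, written with C11 atomics. The names, the
field layout and the absence of the MCS-queue fallback are all made up
for illustration; the real code is kernel/locking/qspinlock.c and
differs in detail.

/* toy model: (tail, pending, locked) packed into one 32-bit word */
#include <stdatomic.h>
#include <stdbool.h>

#define Q_LOCKED_VAL	(1u << 0)	/* locked byte */
#define Q_PENDING_VAL	(1u << 8)	/* pending bit */
#define Q_LOCKED_MASK	0x000000ffu
#define Q_PENDING_MASK	0x0000ff00u
#define Q_TAIL_MASK	0xffff0000u

struct toy_qspinlock {
	_Atomic unsigned int val;
};

/*
 * Returns true if the lock was taken via the pending path, false if
 * the caller must fall back to queueing (the "goto queue" above).
 */
static bool toy_trylock_or_pending(struct toy_qspinlock *lock)
{
	/* 0,0,* -> 0,1,* : set PENDING and observe the old value */
	unsigned int val = atomic_fetch_or_explicit(&lock->val, Q_PENDING_VAL,
						    memory_order_acquire);

	/*
	 * Tail or pending bits in the old value mean a concurrent locker:
	 * undo PENDING, but only if we were the one who set it, and queue.
	 */
	if (val & ~Q_LOCKED_MASK) {
		if (!(val & Q_PENDING_MASK))
			atomic_fetch_and_explicit(&lock->val, ~Q_PENDING_VAL,
						  memory_order_relaxed);
		return false;
	}

	/* we own PENDING; wait for the current owner to drop LOCKED */
	while (atomic_load_explicit(&lock->val, memory_order_acquire) &
	       Q_LOCKED_MASK)
		;

	/*
	 * 0,1,0 -> 0,0,1 : clear PENDING, set LOCKED. Done as an add so
	 * that a tail which appeared behind our back is left untouched.
	 */
	atomic_fetch_add_explicit(&lock->val, Q_LOCKED_VAL - Q_PENDING_VAL,
				  memory_order_relaxed);
	return true;
}

The add in the last step is meant to mirror what the kernel's
clear_pending_set_locked() does in the generic (non-byte-store)
configuration: it flips pending to locked without touching a tail
installed by a concurrent queuer.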