On 12/26/2016 12:50 AM, Boqun Feng wrote:
> Hi Waiman,
>
> On Sun, Dec 25, 2016 at 03:26:01PM -0500, Waiman Long wrote:
>> A number of cmpxchg calls in qspinlock_paravirt.h were replaced by more
>> relaxed versions to improve performance on architectures that use LL/SC.
>>
>> All the locking related cmpxchg's are replaced with the _acquire
>> variants:
>>  - pv_queued_spin_steal_lock()
>>  - trylock_clear_pending()
>>
>> The cmpxchg's related to hashing are replaced by either the _release
>> or the _relaxed variants. See the inline comment for details.
>>
>> Signed-off-by: Waiman Long <long...@redhat.com>
>>
>> v1->v2:
>>  - Add comments in changelog and code for the rationale of the change.
>>
>> ---
>>  kernel/locking/qspinlock_paravirt.h | 50 +++++++++++++++++++++++++++++++++-----------------
>>  1 file changed, 33 insertions(+), 17 deletions(-)
>>
>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
>> index e3b5520..c31d1ab 100644
>> --- a/kernel/locking/qspinlock_paravirt.h
>> +++ b/kernel/locking/qspinlock_paravirt.h
>> @@ -72,7 +72,7 @@ static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock)
>>  	struct __qspinlock *l = (void *)lock;
>>
>>  	if (!(atomic_read(&lock->val) & _Q_LOCKED_PENDING_MASK) &&
>> -	    (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
>> +	    (cmpxchg_acquire(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
>>  		qstat_inc(qstat_pv_lock_stealing, true);
>>  		return true;
>>  	}
>> @@ -101,16 +101,16 @@ static __always_inline void clear_pending(struct qspinlock *lock)
>>
>>  /*
>>   * The pending bit check in pv_queued_spin_steal_lock() isn't a memory
>> - * barrier. Therefore, an atomic cmpxchg() is used to acquire the lock
>> - * just to be sure that it will get it.
>> + * barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the
>> + * lock to provide the proper memory barrier.
>>   */
>>  static __always_inline int trylock_clear_pending(struct qspinlock *lock)
>>  {
>>  	struct __qspinlock *l = (void *)lock;
>>
>>  	return !READ_ONCE(l->locked) &&
>> -	       (cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL)
>> -			== _Q_PENDING_VAL);
>> +	       (cmpxchg_acquire(&l->locked_pending, _Q_PENDING_VAL,
>> +				_Q_LOCKED_VAL) == _Q_PENDING_VAL);
>>  }
>>  #else /* _Q_PENDING_BITS == 8 */
>>  static __always_inline void set_pending(struct qspinlock *lock)
>> @@ -138,7 +138,7 @@ static __always_inline int trylock_clear_pending(struct qspinlock *lock)
>>  	 */
>>  	old = val;
>>  	new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
>> -	val = atomic_cmpxchg(&lock->val, old, new);
>> +	val = atomic_cmpxchg_acquire(&lock->val, old, new);
>>
>>  	if (val == old)
>>  		return 1;
>> @@ -209,9 +209,15 @@ static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node)
>>  	struct pv_hash_entry *he;
>>  	int hopcnt = 0;
>>
>> +	/*
>> +	 * Synchronizing with the node state variable will control who does
>> +	 * the hashing - the lock holder or lock waiter. The control
>> +	 * dependency will ensure that the node value is written after the
>> +	 * lock value. So we don't need any other ordering guarantee.
>> +	 */
>
> By this comment, you mean that
>
> 	cmpxchg_relaxed(&he->lock, NULL, lock);
> 	  r1 = ll he->lock;
> 	  <compare part>
> 	  sc he->lock, lock	// succeeded
>
> 	if (!r1)
> 		WRITE_ONCE(he->node, node);
>
> the sc and the WRITE_ONCE() cannot be reordered because of the control
> dependency? I don't think this is true. Yes, the sc must execute before
> the WRITE_ONCE(), but the memory/cache effects may be reordered.
> IOW, the following may happen:
>
>     CPU 0                              CPU 1
>     ===================                =======================
>     {x = 0, y = 0}
>
>     if (!cmpxchg_relaxed(&y, 0, 1))
>         WRITE_ONCE(x, 1);
>                                        r1 = READ_ONCE(x);
>
>                                        smp_rmb();
>
>                                        r2 = READ_ONCE(y);
>
> The following result is possible:
>
>     y = 1 && r1 = 1 && r2 = 0
>
> Or am I missing your point here? ;-)
>
> Regards,
> Boqun
>

You are probably right. I know the code is somewhat risky. That is why I
am waiting for experts like you to check whether this is really the case.
Now it seems that it may not be, so I will revise the patch to take that
out.
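For my own understanding, I tried to write your scenario down in a form
that a memory-model checker could look at. The rough sketch below uses
the herd-style C litmus syntax; the test name and register names are
just illustrative, and I may well have the syntax slightly off:

	C pv-hash-ctrl-dep

	(*
	 * Can the control-dependent store to x become visible to another
	 * CPU before the relaxed RMW's store to y?
	 *)

	{}

	P0(int *x, int *y)
	{
		if (!cmpxchg_relaxed(y, 0, 1))
			WRITE_ONCE(*x, 1);
	}

	P1(int *x, int *y)
	{
		int r1;
		int r2;

		r1 = READ_ONCE(*x);
		smp_rmb();
		r2 = READ_ONCE(*y);
	}

	exists (y=1 /\ 1:r1=1 /\ 1:r2=0)

As far as I can tell, nothing on the P0 side forbids that final state
without at least a _release on the cmpxchg, which matches your concern.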
Cheers,
Longman