Sorry, please ignore this. I see now that spin_lock() reaches the slowpath only after preempt_disable(), so a queued waiter can never be switched out and replaced on the same CPU. Sorry for the noise.
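For the archive: every spin_lock() variant disables preemption before it can ever enter queued_spin_lock_slowpath(). Roughly, from include/linux/spinlock_api_smp.h (exact details differ between kernel versions and configs):

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
        preempt_disable();      /* no task switch on this CPU from here on */
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}

So the NTHREAD scenario below cannot occur; the qnode index can only become non-zero when the same CPU re-enters the slowpath from interrupt or NMI context.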
On Fri, Jan 15, 2021 at 11:03 AM Yun Levi <ppbuk5...@gmail.com> wrote:
>
> Hi Peter, Ingo, Will and linux-kernel.
>
> While reading the code of the queued_spin_lock_slowpath function,
> I have some questions about an unrelated-lock nesting case while a
> qspinlock is being waited on.
>
> Suppose there are CPU1 to CPU8, and two locks named lock1 and lock2
> which are not related to each other.
>
> At first, Thread 1 takes lock1, and Thread 2 and Thread 3 are
> contending for lock1, waiting on CPU2 and CPU3 respectively.
>
> Next, Thread 5 takes lock2, and Thread 6 and Thread 7 are contending
> for lock2, waiting on CPU6 and CPU7 respectively.
>
> Now suppose Thread 2 uses up its quantum and CPU2 switches to a new
> thread named "NTHREAD". NTHREAD tries to take lock2, sees that someone
> is already waiting in the queue, and tries to add itself to the queue
> as well.
>
> My questions are:
> 1. When this happens, the qnode idx on CPU2 will not be 0, so it looks
>    like nesting. Can this situation actually happen, or is this just my
>    lack of understanding of the code?
>
> 2. If the situation in (1) cannot happen, why not?
>
> 3. If the situation in (1) can happen, the lock could be corrupted when
>    more than three threads are waiting, one of them is unlocked, and
>    another starts waiting again on the same CPU. So how about making a
>    waiter take the same fallback path as idx >= MAX_NODES when it tries
>    to queue for a lock different from the one the qnodes already queued
>    on that CPU are waiting for?
>
> My idea is:
>
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index b9515fcc9b29..fbb6e2effb59 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -106,6 +106,7 @@ struct qnode {
>   * PV doubles the storage and uses the second cacheline for PV state.
>   */
>  static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[MAX_NODES]);
> +static DEFINE_PER_CPU(struct qspinlock *, curr_lock);
>
>  /*
>   * We must be able to distinguish between no-tail and the tail at 0:0,
> @@ -315,6 +316,7 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
>  void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>  {
>         struct mcs_spinlock *prev, *next, *node;
> +       struct qspinlock *saved_lock = NULL;
>         u32 old, tail;
>         int idx;
>
> @@ -401,6 +403,13 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>         idx = node->count++;
>         tail = encode_tail(smp_processor_id(), idx);
>
> +       if (likely(!idx)) {
> +               *this_cpu_ptr(&curr_lock) = lock;
> +               saved_lock = lock;
> +       } else {
> +               saved_lock = *this_cpu_ptr(&curr_lock);
> +       }
> +
>         /*
>          * 4 nodes are allocated based on the assumption that there will
>          * not be nested NMIs taking spinlocks. That may not be true in
> @@ -410,7 +419,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>          * any MCS node. This is not the most elegant solution, but is
>          * simple enough.
>          */
> -       if (unlikely(idx >= MAX_NODES)) {
> +       if (unlikely((idx >= MAX_NODES) || lock != saved_lock)) {
>                 lockevent_inc(lock_no_node);
>                 while (!queued_spin_trylock(lock))
>                         cpu_relax();
> @@ -557,7 +566,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>         /*
>          * release the node
>          */
> -       __this_cpu_dec(qnodes[0].mcs.count);
> +       if (unlikely(saved_lock == lock))
> +               __this_cpu_dec(qnodes[0].mcs.count);
>  }
>  EXPORT_SYMBOL(queued_spin_lock_slowpath);
>
> Thank you.
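To close the loop for the archive: with preemption disabled, the per-CPU qnode index in the code quoted above can only grow when the same CPU re-enters the slowpath from a nested interrupt context (softirq, hardirq, NMI), which is why MAX_NODES is 4. A rough, annotated excerpt of kernel/locking/qspinlock.c (comments are mine, details vary by version):

        node = this_cpu_ptr(&qnodes[0].mcs);
        idx = node->count++;    /* 0 in task context; 1..3 only when this CPU
                                   nests softirq/hardirq/NMI inside the slowpath */
        tail = encode_tail(smp_processor_id(), idx);

Since two task-level waiters can never share one CPU's qnodes[], the unrelated-lock nesting I was worried about cannot happen.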