On 4/26/2018 1:39 AM, Peter Zijlstra wrote:

On Wed, Apr 25, 2018 at 02:03:19PM +0530, Gaurav Kohli wrote:
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 5043e74..c5c5184 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -122,7 +122,45 @@ static int smpboot_thread_fn(void *data)
                }
if (kthread_should_park()) {
+                       /*
+                        * Serialize against wakeup.
                         *
                         * Prior wakeups must complete and later wakeups
                         * will observe TASK_RUNNING.
                         *
                         * This avoids the case where the TASK_RUNNING
                         * store from ttwu() competes with the
                         * TASK_PARKED store from kthread_parkme().
                         *
                         * If the TASK_PARKED store looses that
                         * competition, kthread_unpark() will go wobbly.
+                        */
+                       raw_spin_lock(&current->pi_lock);
                        __set_current_state(TASK_RUNNING);
+                       raw_spin_unlock(&current->pi_lock);
                        preempt_enable();
                        if (ht->park && td->status == HP_THREAD_ACTIVE) {
                                BUG_ON(td->cpu != smp_processor_id());
Does that work for you?

We have given patch for testing, usually it takes around 2-3 days for 
reproduction(we will update for the same).


But looking at this a bit more; don't we have the exact same problem
with the TASK_RUNNING store in the !ht->thread_should_run() case?
Suppose a ttwu() happens concurrently there, it can end up competing
against the TASK_INTERRUPTIBLE store, no?

Of course, that race is not fatal, we'll just end up going around the
loop once again I suppose. Maybe a comment there too?

                        /*
                         * A similar race is possible here, but loosing
                         * the TASK_INTERRUPTIBLE store is harmless and
                         * will make us go around the loop once more.
                         */

Actually instead of race, i am seeing wakeup miss problem which is very rare, 
if we take case of hotplug thread

Controller                                           Hotplug

                                                             Loop start

                                                             
set_current_state(TASK_INTERRUPTIBLE);

                                                             if 
(kthread_should_park()) { -> fails

Set Should_park

then wake_up

                                                            if 
(!ht->thread_should_run(td->cpu)) {

                                                            
preempt_enable_no_resched();

                                                            schedule(); Again 
went to schedule(which is very rare to occur,not sure whether it hits)


And of course, I suspect we actually want to use TASK_IDLE, smpboot
threads don't want signals do they? But that probably ought to be a
separate patch.

Yes I agree, we can control race from here as well,  Please suggest would below 
change be any help here:

 } else {

                        __set_current_state(TASK_RUNNING);

                        preempt_enable();

                        ht->thread_fn(td->cpu);

                       + set_current_state(TASK_INTERRUPTIBLE);

                       + schedule();

                }


--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. 
is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

Reply via email to