Hi Thomas,

So there have been some reports on hitting:

  BUG_ON(td->cpu != smp_processor_id());

in smpboot_thread_fn.

Now I've been staring at this for a wee bit today and I've found two
issues, but I'm not sure either are enough to explain the observed.

1) smpboot_register_percpu_thread() seems to lack serialization against
   hotplug. It has a for_each_online() loop, but no get_online_cpus() --
   unlike smpboot_unregister_percpu_thread, which does.

   Typical usage like spawn_ksoftirqd() should be fine, they're early
   init calls and those run before we bring up the other CPUs. Therefore
   this does not explain the observation that its ksoftirqd/n triggering
   the BUG.

   However, the usage in proc_dowatchdog() is susceptible to this race
   and its entirely possible to go wrong there.


2) the usage of __set_current_state(TASK_PARKED) in __kthread_parkme()
   is wrong AFAICT, one should always use set_current_state() for
   setting !TASK_RUNNING state. The comment with set_current_state()
   explains why.

   This would've allowed the test_bit(KTHREAD_SHOULD_PARK) load to have
   been satisfied before the store of TASK_PARKED.


In any case, I'm not sure either of these are enough, I'll go stare at
it a bit more I suppose.

---
 kernel/kthread.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 10e489c448fe..9787244d43ec 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -156,12 +156,12 @@ void *probe_kthread_data(struct task_struct *task)
 
 static void __kthread_parkme(struct kthread *self)
 {
-       __set_current_state(TASK_PARKED);
+       set_current_state(TASK_PARKED);
        while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
                if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
                        complete(&self->parked);
                schedule();
-               __set_current_state(TASK_PARKED);
+               set_current_state(TASK_PARKED);
        }
        clear_bit(KTHREAD_IS_PARKED, &self->flags);
        __set_current_state(TASK_RUNNING);



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to