Hi Tejun, On Thu, Jun 16, 2016 at 03:35:04PM -0400, Tejun Heo wrote: > Hello, > > So, the issue of the initial worker not having its affinity set > correctly wasn't caused by the order of the operations. Reordering > just made set_cpus_allowed tried one more time late enough so that it > hides the race condition most of the time. The problem is that > CPU_ONLINE callbacks are called while the cpu being onlined is online > but not active and select_fallback_rq() only considers active cpus, so > if a kthread gets scheduled in the meantime and it doesn't have any > cpu which is active in its allowed mask, it's allowed mask gets reset > to cpu_possible_mask. > > Would something like the following make sense? > > Thanks. > ------ 8< ------ > Subject: [PATCH] sched: allow kthreads to fallback to online && !active cpus > > During CPU hotplug, CPU_ONLINE callbacks are run while the CPU is > online but not active. A CPU_ONLINE callback may create or bind a > kthread so that its cpus_allowed mask only allows the CPU which is > being brought online. The kthread may start executing before the CPU > is made active and can end up in select_fallback_rq(). > > In such cases, the expected behavior is selecting the CPU which is > coming online; however, because select_fallback_rq() only chooses from > active CPUs, it determines that the task doesn't have any viable CPU > in its allowed mask and ends up overriding it to cpu_possible_mask. > > CPU_ONLINE callbacks should be able to put kthreads on the CPU which > is coming online. Update select_fallback_rq() so that it follows > cpu_online() rather than cpu_active() for kthreads. > > Signed-off-by: Tejun Heo <t...@kernel.org> > Reported-by: Gautham R Shenoy <e...@linux.vnet.ibm.com>
Hi Tejun, This patch fixes the issue on POWER. I am able to see the worker threads of the unbound workqueues of the newly onlined node with this. Tested-by: Gautham R. Shenoy <e...@linux.vnet.ibm.com> > --- > kernel/sched/core.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 017d539..a12e3db 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -1536,7 +1536,9 @@ static int select_fallback_rq(int cpu, struct > task_struct *p) > for (;;) { > /* Any allowed, online CPU? */ > for_each_cpu(dest_cpu, tsk_cpus_allowed(p)) { > - if (!cpu_active(dest_cpu)) > + if (!(p->flags & PF_KTHREAD) && !cpu_active(dest_cpu)) > + continue; > + if (!cpu_online(dest_cpu)) > continue; > goto out; > } > _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev