On 06/11/2013 12:30 PM, Benjamin Herrenschmidt wrote: > On Thu, 2013-05-16 at 18:20 +0800, Li Zhong wrote: >> It seems following race is possible: >> > > .../... > >> vdso_getcpu_init(); >> #endif >> - notify_cpu_starting(cpu); >> - set_cpu_online(cpu, true); >> /* Update sibling maps */ >> base = cpu_first_thread_sibling(cpu); >> for (i = 0; i < threads_per_core; i++) { >> - if (cpu_is_offline(base + i)) >> + if (cpu_is_offline(base + i) && (cpu != base + i)) >> continue; >> cpumask_set_cpu(cpu, cpu_sibling_mask(base + i)); >> cpumask_set_cpu(base + i, cpu_sibling_mask(cpu)); >> @@ -667,6 +665,10 @@ __cpuinit void start_secondary(void *unused) >> } >> of_node_put(l2_cache); >> >> + smp_wmb(); >> + notify_cpu_starting(cpu); >> + set_cpu_online(cpu, true); >> + > > So we could have an online CPU with an empty sibling mask. Now we can > have a sibling that isn't online ... Is that ok ?
I think it is OK. We do the same thing on x86 as well - we set up the sibling links before calling notify_cpu_starting() and setting the cpu in the cpu_online_mask. In fact, there is even a comment explicitly noting that order: arch/x86/kernel/smpboot.c: 220 /* 221 * This must be done before setting cpu_online_mask 222 * or calling notify_cpu_starting. 223 */ 224 set_cpu_sibling_map(raw_smp_processor_id()); 225 wmb(); 226 227 notify_cpu_starting(cpuid); 228 229 /* 230 * Allow the master to continue. 231 */ 232 cpumask_set_cpu(cpuid, cpu_callin_mask); So I agree with Li Zhong's solution. [Arch-specific CPU hotplug code consolidation efforts such as [1] would have weeded out such nasty bugs.. I guess we should revive that patchset sometime soon.] Regards, Srivatsa S. Bhat [1]. https://lwn.net/Articles/500185/ _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev