Nathan Lynch <nath...@linux.ibm.com> writes: > Michael Ellerman <m...@ellerman.id.au> writes: >> Daniel Henrique Barboza <danielhb...@gmail.com> writes: >>> This is enough to say that we can't easily see the history behind this >>> comment. >>> I also believe that we're better of without it since it doesn't make sense >>> with the current codebase. >> >> It was added by the original CPU hotplug commit for ppc64:: >> >> https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b >> >> >> The code was fairly similar: >> >> void __cpu_die(unsigned int cpu) >> { >> int tries; >> int cpu_status; >> unsigned int pcpu = get_hard_smp_processor_id(cpu); >> >> for (tries = 0; tries < 5; tries++) { >> cpu_status = query_cpu_stopped(pcpu); >> >> if (cpu_status == 0) >> break; >> set_current_state(TASK_UNINTERRUPTIBLE); >> schedule_timeout(HZ); >> } >> if (cpu_status != 0) { >> printk("Querying DEAD? cpu %i (%i) shows %i\n", >> cpu, pcpu, cpu_status); >> } >> >> /* Isolation and deallocation are definatly done by >> * drslot_chrp_cpu. If they were not they would be >> * done here. Change isolate state to Isolate and >> * change allocation-state to Unusable. >> */ >> paca[cpu].xProcStart = 0; >> >> /* So we can recognize if it fails to come up next time. */ >> cpu_callin_map[cpu] = 0; >> } >> >> >> drslot_chrp_cpu() still exists in drmgr: >> >> >> https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406 >> >> >> I agree the comment is no longer meaningful and can be removed. > > Thanks for providing this background. > >> It might be good to then add a comment explaining why we need to set >> cpu_start = 0. > > Sure, I can take that as a follow-up. Or perhaps it should be moved to > the online path.
Yeah possibly. >> It's not immediately clear why we need to. When we bring a CPU back >> online in smp_pSeries_kick_cpu() we ask RTAS to start it and then >> immediately set cpu_start = 1, ie. there isn't a separate step that sets >> cpu_start = 1 for hotplugged CPUs. > > Hmm I'm not following the distinction you seem to be drawing between > bringing a CPU back online and a hotplugged CPU. kick_cpu is used in all > cases AFAIK. Yeah that wasn't very clear, reading it back I have half confused myself. At boot we need the paca->cpu_start flag because some CPUs can be spinning in generic_secondary_common_init, in head_64.S. ie. they're not in RTAS, they're spinning in kernel code, and the only thing that stops them coming "online" in the Linux sense is paca->cpu_start. You can see that in pseries/smp.c: static inline int smp_startup_cpu(unsigned int lcpu) { ... if (cpumask_test_cpu(lcpu, of_spin_mask)) /* Already started by OF and sitting in spin loop */ return 1; We also hit that case when kexec'ing, where all the secondaries come in that way. On the other hand when we offline a CPU, we set paca->cpu_start = 0, in pseries_cpu_die(), and then we return the CPU to RTAS. The only way it *should* come back online is via smp_pSeries_kick_cpu(), which calls smp_startup_cpu() to bring the CPU out of RTAS, and then smp_pSeries_kick_cpu() immediately sets cpu_start = 1. So the sequence is: CPU goes offline from Linux POV paca->cpu_start = 0; CPU offline in RTAS ... CPU brought out of RTAS paca->cpu_start = 1; CPU comes back online from Linux POV But I guess I kind of answered my own question above, where I said *should*. Clearing paca->cpu_start when we offline the CPU gives us a little bit of backup if the CPU comes out of RTAS unexpectedly. ie. it would then spin on paca->cpu_start, rather than spontaneously coming back into Linux when we weren't expecting it. cheers