On Thu, 15 Oct 2015 21:00:58 +0200 Laurent Vivier <lviv...@redhat.com> wrote:
> On kexec, all secondary offline CPUs are onlined before
> starting the new kernel, this is not done in the case of kdump.
>
> If kdump is configured and a kernel crash occurs whereas
> some secondaries CPUs are offline (SMT=off),
> the new kernel is not able to start them and displays some
> "Processor X is stuck.".
>
> Starting with POWER8, subcore logic relies on all threads of
> core being booted. So, on startup kernel tries to start all
> threads, and asks OPAL (or RTAS) to start all CPUs (including
> threads). If a CPU has been offlined by the previous kernel,
> it has not been returned to OPAL, and thus OPAL cannot restart
> it: this CPU has been lost...
>
> Signed-off-by: Laurent Vivier <lviv...@redhat.com>

Nice analysis of the problem. But I'm a bit uneasy about this
approach to fixing it: onlining potentially hundreds of CPU threads
seems like a risky operation in a kernel that's already crashed.

I don't have a terribly clear idea of the best way to address this.
Here are a few ideas in the right general direction:

  * I'm already looking into a kdump userspace fix to stop it
    attempting to bring up secondary CPUs

  * A working kernel option to say "only allow this many online cpus
    ever", which we could pass to the kdump kernel, would be nice

  * Paulus had an idea about offline threads returning themselves
    directly to OPAL by kicking a flag at kdump/kexec time.

BenH, Paulus,

OPAL <-> kernel cpu transitions don't seem to work quite how I thought
they would. IIUC there's a register we can use to directly control
which threads on a core are active. Given that, I would have thought
cpu "ownership" (OPAL vs. kernel) would be on a per-core, rather than
per-thread, basis.

Is there some way we can change the CPU onlining / offlining code so
that if threads aren't in OPAL, we directly enable them, rather than
just hoping they're in a nap loop somewhere?

-- 
David Gibson <dgib...@redhat.com>
Senior Software Engineer, Virtualization, Red Hat
_______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev