In panic path, fadump is triggered via a panic notifier function. Before calling panic notifier functions, smp_send_stop() gets called, which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc ("powerpc: stop_this_cpu: remove the cpu from the online map.") and again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") started marking CPUs as offline while stopping them. So, if a kernel has either of the above commits, vmcore captured with fadump via panic path would show only the panic'ing CPU as online. Sample output of crash-utility with such vmcore:
# crash vmlinux vmcore ... KERNEL: vmlinux DUMPFILE: vmcore [PARTIAL DUMP] CPUS: 1 DATE: Wed Nov 10 09:56:34 EST 2021 UPTIME: 00:00:42 LOAD AVERAGE: 2.27, 0.69, 0.24 TASKS: 183 NODENAME: XXXXXXXXX RELEASE: 5.15.0+ VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021 MACHINE: ppc64le (2500 Mhz) MEMORY: 8 GB PANIC: "Kernel panic - not syncing: sysrq triggered crash" PID: 3394 COMMAND: "bash" TASK: c0000000150a5f80 [THREAD_INFO: c0000000150a5f80] CPU: 1 STATE: TASK_RUNNING (PANIC) crash> p -x __cpu_online_mask __cpu_online_mask = $1 = { bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> crash> crash> p -x __cpu_active_mask __cpu_active_mask = $2 = { bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> While this has been the case since fadump was introduced, the issue was not identified for two probable reasons: - In general, the bulk of the vmcores analyzed were from crash due to exception. - The above did change since commit 8341f2f222d7 ("sysrq: Use panic() to force a crash") started using panic() instead of deferencing NULL pointer to force a kernel crash. But then commit de6e5d38417e ("powerpc: smp_send_stop do not offline stopped CPUs") stopped marking CPUs as offline till kernel commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") reverted that change. To avoid vmcore from showing only one CPU as online in panic path, skip marking non panic'ing CPUs as offline while stopping them during fadump crash. Signed-off-by: Hari Bathini <hbath...@linux.ibm.com> --- arch/powerpc/kernel/smp.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index c23ee842c4c3..20555d5d5966 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -61,6 +61,7 @@ #include <asm/cpu_has_feature.h> #include <asm/ftrace.h> #include <asm/kup.h> +#include <asm/fadump.h> #ifdef DEBUG #include <asm/udbg.h> @@ -626,7 +627,8 @@ static void nmi_stop_this_cpu(struct pt_regs *regs) /* * IRQs are already hard disabled by the smp_handle_nmi_ipi. */ - set_cpu_online(smp_processor_id(), false); + if (!(oops_in_progress && should_fadump_crash())) + set_cpu_online(smp_processor_id(), false); spin_begin(); while (1) @@ -650,7 +652,8 @@ static void stop_this_cpu(void *dummy) * to know other CPUs are offline before it breaks locks to flush * printk buffers, in case we panic()ed while holding the lock. */ - set_cpu_online(smp_processor_id(), false); + if (!(oops_in_progress && should_fadump_crash())) + set_cpu_online(smp_processor_id(), false); spin_begin(); while (1) -- 2.31.1