On Wed, 13 Dec 2017 20:51:01 +1000 Nicholas Piggin <npig...@gmail.com> wrote:
> This is looking pretty nice now...
>
> On Wed, 13 Dec 2017 19:08:28 +1100
> Balbir Singh <bsinghar...@gmail.com> wrote:
>
> > @@ -543,7 +543,25 @@ void smp_send_debugger_break(void)
> >  #ifdef CONFIG_KEXEC_CORE
> >  void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
> >  {
> > +	int cpu;
> > +
> >  	smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_ipi_callback, 1000000);
> > +	if (kdump_in_progress() && crash_wake_offline) {
> > +		for_each_present_cpu(cpu) {
> > +			if (cpu_online(cpu))
> > +				continue;
> > +			/*
> > +			 * crash_ipi_callback will wait for
> > +			 * all cpus, including offline CPUs.
> > +			 * We don't care about nmi_ipi_function.
> > +			 * Offline cpus will jump straight into
> > +			 * crash_ipi_callback, we can skip the
> > +			 * entire NMI dance and waiting for
> > +			 * cpus to clear pending mask, etc.
> > +			 */
> > +			do_smp_send_nmi_ipi(cpu);
>
> Still a little bit concerned about using NMI IPI for this.
>

OK -- for offline CPUs you mean?

> If you take an NMI IPI from stop, the idle code should do the
> right thing and we would just return the system reset wakeup
> reason in SRR1 here (which does not need to be cleared).
>
> If you take the system reset anywhere else in the loop, it's
> going to go out via system_reset_exception. I guess that
> would end up doing the right thing, it probably gets to
> crash_ipi_callback from crash_kexec_secondary?

You mean like if we are online at the time of NMI'ing? If so
the original loop will NMI us back into crash_ipi_callback
anyway. We don't expect this to occur for offline CPUs

>
> It's just going to be a very untested code path :( What we
> gain I suppose is better ability to handle a CPU that's locked
> up somewhere in the cpu offline path. Assuming the uncommon
> case works...
>
> Actually, if you *always* go via the system reset exception
> handler, then code paths will be shared. That might be the
> way to go. So I would check for system reset wakeup SRR1 reason
> and call replay_system_reset() for it.
> What do you think?
>

We could do that, but it would call pnv_system_reset_exception and
try to call the NMI function. Since we've not used that path to
initiate the NMI, it should call the stale nmi_ipi_function, which is
crash_ipi_callback, and not go via the crash_kexec path. I can't call
smp_send_nmi_ipi due to the nmi_ipi_busy_count, and I'm worried about
calling a stale nmi_ipi_function via the system_reset_exception path.
If we are OK with it, I can revisit the code path.

Thanks,
Balbir Singh.
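
To make the stale nmi_ipi_function concern concrete, here is a rough
userspace model, not kernel code. The model_* helpers, the struct
pt_regs stand-in and the counter are hypothetical names made up for
illustration, and the model assumes the handler slot stays set after
the first send, which is what "stale" refers to above.

```c
#include <stdio.h>
#include <stddef.h>

/* Userspace model only; none of this is the kernel implementation. */
struct pt_regs { int cpu; };            /* stand-in for the real pt_regs */

typedef void (*nmi_fn_t)(struct pt_regs *);

static nmi_fn_t nmi_ipi_function;       /* models the shared handler slot */
static int crash_callback_calls;        /* counts entries into the callback */

static void crash_ipi_callback(struct pt_regs *regs)
{
	crash_callback_calls++;
	printf("cpu %d: in crash_ipi_callback\n", regs->cpu);
}

/*
 * Models the first smp_send_nmi_ipi() call: it installs the handler.
 * Assumption: the slot is not cleared once the send completes.
 */
static void model_send_nmi_ipi(nmi_fn_t fn)
{
	nmi_ipi_function = fn;
}

/*
 * Models going through the system reset path: it runs whatever handler
 * is currently installed, whether or not this wakeup was initiated via
 * model_send_nmi_ipi(). Returns 1 if a handler ran, 0 otherwise.
 */
static int model_system_reset(struct pt_regs *regs)
{
	if (nmi_ipi_function == NULL)
		return 0;
	nmi_ipi_function(regs);
	return 1;
}
```

Calling model_send_nmi_ipi(crash_ipi_callback) once and then
model_system_reset() for an offline CPU runs crash_ipi_callback only
because the slot still holds the earlier value. That happens to be the
function we want here, but nothing in the path guarantees it.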