Hello Michael,

On Fri, 2020-03-27 at 14:50 +1100, Michael Ellerman wrote:
> Hi Leonardo,
>
> Leonardo Bras <leona...@linux.ibm.com> writes:
> > During a crash, there is chance that the cpus that handle the NMI IPI
> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> > will cause a deadlock. (rtas_lock and printk logbuf_log as of today)
>
> Please give us more detail on how those locks are causing you trouble, a
> stack trace would be good if you have it.

Sure, I have hit it in printk and rtas_call, as said before.

After crash_send_ipi(), the code checks how many CPUs are counted in
cpus_in_crash, and once that count reaches the total, it prints
"IPI complete". This printk call itself already got stuck in a spin_lock,
for example.

Here are the stack traces:

#0  arch_spin_lock
#1  do_raw_spin_lock
#2  __raw_spin_lock
#3  _raw_spin_lock
#4  vprintk_emit
#5  vprintk_func
#7  crash_kexec_prepare_cpus
#8  default_machine_crash_shutdown
#9  machine_crash_shutdown
#10 __crash_kexec
#11 crash_kexec
#12 oops_end

#0  arch_spin_lock
#1  lock_rtas ()
#2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3  ics_rtas_mask_real_irq (hw_irq=4100)
#4  machine_kexec_mask_interrupts
#5  default_machine_crash_shutdown
#6  machine_crash_shutdown
#7  __crash_kexec
#8  crash_kexec
#9  oops_end

> > This is a problem if the system has kdump set up, given if it crashes
> > for any reason kdump may not be saved for crash analysis.
> >
> > Skip spinlocks after NMI IPI is sent to all other cpus.
>
> We don't want to add overhead to all spinlocks for the life of the
> system, just to handle this one case.

I understand.
Other than this patch, I would propose doing something uglier, like
forcing the said locks to the unlocked state when cpus_in_crash hits its
maximum value, before printing "IPI complete".
Creating similar functions that don't lock, just for this case, looks
like overkill to me.

Do you have any other suggestion?

> There's already a flag that is set when the system is crashing,
> "oops_in_progress", maybe we need to use that somewhere to skip a lock
> or do an early return.

I think that would not work, because oops_in_progress should be 0 here:
oops_end() calls bust_spinlocks(0) before calling crash_kexec(), and
bust_spinlocks(0) will decrement oops_in_progress.
(I just verified: it is 0 before printing "IPI complete".)

Thank you for the feedback, :)

Best regards,
Leonardo
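
P.S. To make the oops_in_progress point concrete, here is a rough
user-space model of the ordering described above. It is only a sketch,
not kernel code: bust_spinlocks() is reduced to the counter update (the
real one also deals with the console and klogd), I am assuming that
oops_begin() is where bust_spinlocks(1) gets called, and
oops_in_progress_at_ipi_complete() is a made-up helper name used only
for this illustration.

```c
#include <assert.h>

/* User-space stand-in for the kernel's oops_in_progress counter. */
int oops_in_progress;

/* Simplified model: only the counter update of bust_spinlocks() is kept. */
void bust_spinlocks(int yes)
{
	if (yes)
		++oops_in_progress;
	else
		--oops_in_progress;
}

/*
 * Hypothetical helper: returns the value a lock-skipping check would see
 * at the point where "IPI complete" is printed.
 */
int oops_in_progress_at_ipi_complete(void)
{
	oops_in_progress = 0;

	bust_spinlocks(1);	/* assumed: done when oops handling begins */
	assert(oops_in_progress == 1);

	bust_spinlocks(0);	/* oops_end() does this before crash_kexec() */

	/* crash_kexec() -> ... -> crash_kexec_prepare_cpus() -> printk() */
	return oops_in_progress;
}
```

In this model the helper returns 0, so a check like
"if (oops_in_progress) skip the lock" placed in the printk or rtas path
would not trigger by the time crash_kexec_prepare_cpus() runs.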