On Tue, Feb 9, 2021 at 5:19 PM Petr Mladek <pmla...@suse.com> wrote: > > On Sat 2021-02-06 13:41:24, Muchun Song wrote: > > We found a deadlock bug on our server when the kernel panic. It can be > > described in the following diagram. > > > > CPU0: CPU1: > > panic rcu_dump_cpu_stacks > > kdump_nmi_shootdown_cpus > > nmi_trigger_cpumask_backtrace > > register_nmi_handler(crash_nmi_callback) printk_safe_flush > > __printk_safe_flush > > > > raw_spin_lock_irqsave(&read_lock) > > // send NMI to other processors > > apic_send_IPI_allbutself(NMI_VECTOR) > > // NMI interrupt, > > dead loop > > crash_nmi_callback > > printk_safe_flush_on_panic > > printk_safe_flush > > __printk_safe_flush > > // deadlock > > raw_spin_lock_irqsave(&read_lock) > > > > The register_nmi_handler() can be called in the __crash_kexec() or the > > crash_smp_send_stop() on the x86-64. Because CPU1 is interrupted by the > > NMI with holding the read_lock and crash_nmi_callback() never returns, > > CPU0 can deadlock when printk_safe_flush_on_panic() is called. > > > > When we hold the read_lock and then interrupted by the NMI, if the NMI > > handler call nmi_panic(), it is also can lead to deadlock. > > > > In order to fix it, we make read_lock global and rename it to > > safe_read_lock. And we handle safe_read_lock the same way in > > printk_safe_flush_on_panic() as we handle logbuf_lock there. > > What about the following commit message? It uses imperative language > and explains that the patch just prevents the deadlock. It removes > some details. The diagram is better than many words. > > <commit message> > printk_safe_flush_on_panic() caused the following deadlock on our server: > > CPU0: CPU1: > panic rcu_dump_cpu_stacks > kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace > register_nmi_handler(crash_nmi_callback) printk_safe_flush > __printk_safe_flush > > raw_spin_lock_irqsave(&read_lock) > // send NMI to other processors > apic_send_IPI_allbutself(NMI_VECTOR) > // NMI interrupt, > dead loop > crash_nmi_callback > printk_safe_flush_on_panic > printk_safe_flush > __printk_safe_flush > // deadlock > raw_spin_lock_irqsave(&read_lock) > > DEADLOCK: read_lock is taken on CPU1 and will never get released. > > It happens when panic() stops a CPU by NMI while it has been in > the middle of printk_safe_flush(). > > Handle the lock the same way as logbuf_lock. The printk_safe buffers > are flushed only when both locks can be safely taken. > > Note: It would actually be safe to re-init the locks when all CPUs were > stopped by NMI. But it would require passing this information > from arch-specific code. It is not worth the complexity. > Especially because logbuf_lock and printk_safe buffers have been > obsoleted by the lockless ring buffer. > </commit message>
Many thanks. It is clear. > > > Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic") > > Signed-off-by: Muchun Song <songmuc...@bytedance.com> > > With an updated commit message: > > Reviewed-by: Petr Mladek <pmla...@suse.com> Thanks. I will update the commit message. > > Best Regards, > Petr