Re: [PATCH v2] printk/nmi: avoid direct printk()-s from __printk_nmi_flush()

Sergey Senozhatsky Tue, 30 Aug 2016 18:45:02 -0700

Hello,

On (08/30/16 15:03), Andrew Morton wrote:
> > __printk_nmi_flush() can be called from nmi_panic(), therefore it has to
> > test whether it's executed in NMI context and thus must route the messages
> > through deferred printk() or via direct printk().
> 
> Why?  What misbehaviour does the current code cause?


the reasoning behind the `if in_nmi()' in print_nmi_seq_line()

       if (in_nmi())
               printk_deferred("%.*s", (end - start) + 1, buf);
       else
               printk("%.*s", (end - start) + 1, buf);

was as follows (per Petr's commit message)

:    In NMI context, printk() messages are stored into per-CPU buffers to
:    avoid a possible deadlock.  They are normally flushed to the main ring
:    buffer via an IRQ work.  But the work is never called when the system
:    calls panic() in the very same NMI handler.
:
:    This patch tries to flush NMI buffers before the crash dump is
:    generated.  In this case it does not risk a double release and bails out
:    when the logbuf_lock is already taken.  The aim is to get the messages
:    into the main ring buffer when possible.  It makes them better
:    accessible in the vmcore.
:
:    Then the patch tries to flush the buffers second time when other CPUs
:    are down.  It might be more aggressive and reset logbuf_lock.  The aim
:    is to get the messages available for the consequent kmsg_dump() and
:    console_flush_on_panic() calls.
:
:    The patch causes vprintk_emit() to be called even in NMI context again.
:    But it is done via printk_deferred() so that the console handling is
:    skipped.  Consoles use internal locks and we could not prevent a
:    deadlock easily.  They are explicitly called later when the crash dump
:    is not generated, see console_flush_on_panic().

doing pr_err() and pr_cont() defeats the purpose of print_nmi_seq_line(),
because in the worst case it can do something like this:

  vprintk_emit()
   /* console_trylock() */
   console_unlock()
    call_console_drivers()
     foo_write()   // possibly locked


the tricky moment here is -- if the console semaphore is unlocked and
NMI can successfully console_trylock(), then how any of underlying serial
console driver's locks can be taken? one possibility is:

        CPU0

SyS_ioctl
 do_vfs_ioctl
  tty_ioctl
   n_tty_ioctl
    tty_mode_ioctl
     set_termios
      tty_set_termios
       uart_set_termios
        uart_change_speed
         FOO_serial_set_termios
          spin_lock_irqsave(&port->lock)     // lock the output port
           ...
--> NMI
       nmi_panic()
        printk_nmi_flush_on_panic()
         __printk_nmi_flush()
          pr_cont()
           /* console_trylock() */
           console_unlock()
            call_console_drivers()
             foo_write()
              spin_lock_irqsave(&port->lock)  // already locked


as far as I can see, tty_ioctl() path does not lock console semaphore,
but grabs some tty locks instead.

        -ss

Re: [PATCH v2] printk/nmi: avoid direct printk()-s from __printk_nmi_flush()

Reply via email to