On (12/15/17 14:06), Sergey Senozhatsky wrote:
[..]
> > Where do we do the above? And has this been proven to be an issue?
> 
> um... hundreds of cases.
> 
> deep-stack spin_lock_irqsave() lockup reports from multiple CPUs (3 cpus)
> happening at the same moment + NMI backtraces from all the CPUs (more
> than 3 cpus) that follows the lockups, over not-so-fast serial console.
> exactly the bug report I received two days ago. so which one of the CPUs
> here is a good candidate to successfully emit all of the pending logbuf
> entries? none. all of them either have local IRQs disabled, or dump_stack()
> from either backtrace IPI or backtrace NMI (depending on the configuration).


and, Steven, one more thing. wondering what's your opinion.


suppose we have console_owner hand off enabled, 1 non-atomic CPU doing
printk-s and several atomic CPUs doing printk-s. is the proposed hand off
scheme really useful in this case? CPUs will now

a) print their lines (a potentially slow call_console_drivers())

and

b) spin in vprintk_emit on console_owner with local IRQs disabled
   waiting for either non-atomic printk CPU or another atomic CPU
   to finish printing its line (call_console_drivers()) and to hand
   off printing. so the current CPU, after busy-waiting for a foreign CPU's
   call_console_drivers(), will go and do its own call_console_drivers().
   which, time-wise, simply doubles (roughly) the amount of time that
   CPU spends in printk()->console_unlock(). agreed?

   if we previously could have a case when the non-atomic printk CPU would
   grab the console_sem and print all atomic printk CPUs' messages first,
   and then its own messages, so that atomic printk CPUs would have just
   log_store(), now we will have CPUs that call_console_drivers() and
   spin on console_owner waiting for call_console_drivers() on a foreign
   CPU  [not all of them at once: it's one CPU doing the print out and one
   CPU spinning on console_owner. but overall I think all CPUs will
   experience that spin on console_owner waiting for call_console_drivers()
   and then do their own call_console_drivers()].
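
to put rough numbers on (b), here is a minimal cost model. it is only a
sketch: the struct, the function names and the microsecond units are all
hypothetical, and it assumes the worst case where the spinning CPU waits
for one full foreign line before printing its own.

```c
#include <assert.h>

/* Hypothetical per-printk() costs; not taken from the kernel sources. */
struct printk_cost {
	unsigned long store_us;   /* assumed cost of log_store() */
	unsigned long console_us; /* assumed cost of one call_console_drivers() */
};

/*
 * No hand off: an atomic CPU only stores its message, the console_sem
 * owner prints it later.
 */
static unsigned long cost_without_handoff(const struct printk_cost *c)
{
	return c->store_us;
}

/*
 * Hand off, worst case: store the message, spin for one foreign
 * call_console_drivers(), then run our own call_console_drivers().
 */
static unsigned long cost_with_handoff(const struct printk_cost *c)
{
	unsigned long spin_us = c->console_us;  /* waiting for the foreign CPU */
	unsigned long print_us = c->console_us; /* printing our own line */

	return c->store_us + spin_us + print_us;
}
```

with store_us = 1 and console_us = 1000 this gives 1 without hand off and
2001 with hand off, i.e. roughly two console lines' worth of time per
printk() on an atomic CPU.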


even two CPUs case is not so simple anymore. see below.

- first, assume one CPU is atomic and one is non-atomic.
- second, assume that both CPUs are atomic CPUs, and go through it again.


CPU0                            CPU1

printk()                        printk()
 log_store()
                                 log_store()
 console_unlock()
  set console_owner
                                 sees console_owner
                                 sets console_waiter
                                 spin
  call_console_drivers()
  sees console_waiter
   break

printk()
 log_store()
                                 console_unlock()
                                  set console_owner
 sees console_owner
 sets console_waiter
 spin
                                 call_console_drivers()
                                 sees console_waiter
                                  break

                                printk()
                                 log_store()
 console_unlock()
  set console_owner
                                 sees console_owner
                                 sets console_waiter
                                 spin
  call_console_drivers()
  sees console_waiter
   break

printk()
 log_store()
                                 console_unlock()
                                  set console_owner
 sees console_owner
 sets console_waiter
 spin

....
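
the sequence above can be replayed with a small model. this is a sketch
under simplifying assumptions, not kernel code: two CPUs only, every line
costs the same line_us, the waiter always spins for the owner's full
call_console_drivers(), and all names (handoff_stats, replay_ping_pong)
are made up for illustration.

```c
#include <assert.h>

enum { SKETCH_NR_CPUS = 2 };

/* Hypothetical accounting of where each CPU spends its time. */
struct handoff_stats {
	unsigned long spin_us[SKETCH_NR_CPUS];  /* busy-waiting on console_owner */
	unsigned long print_us[SKETCH_NR_CPUS]; /* in call_console_drivers() */
	unsigned int handoffs;
};

/*
 * Replay `rounds` steps of the diagram: the owner prints one line while
 * the other CPU spins as console_waiter, then ownership moves to the
 * spinner and the roles swap.
 */
static struct handoff_stats replay_ping_pong(unsigned int rounds,
					     unsigned long line_us)
{
	struct handoff_stats st = { {0, 0}, {0, 0}, 0 };
	int owner = 0; /* CPU0 sets console_owner first, as in the diagram */

	for (unsigned int i = 0; i < rounds; i++) {
		int waiter = 1 - owner;

		st.print_us[owner] += line_us; /* call_console_drivers() */
		st.spin_us[waiter] += line_us; /* spin until hand off */
		st.handoffs++;
		owner = waiter;                /* waiter becomes the owner */
	}
	return st;
}
```

for replay_ping_pong(4, 1000) each CPU ends up with 2000us of printing and
2000us of spinning, so half of the time a CPU spends in printk() goes to
waiting for the other CPU's line.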


that "wait for call_console_drivers() on another CPU and then do
its own call_console_drivers()" pattern does look dangerous. the
benefit of hand-off is really fragile sometimes, isn't it?

        -ss
