On (01/19/18 13:20), Steven Rostedt wrote:
[..]
> I was thinking about this a bit more, and instead of offloading a
> recursive printk, perhaps its best to simply throttle it. Because the
> problem may not go away if a printk thread takes over, because the bug
> is really the printk infrastructure filling the printk buffer keeping
> printk from ever stopping.

right. I didn't quite got it how that would help. if we would
kick_offload every time we add new printks after call_console_drivers(),
then we can just end up in a kick_offload loop traveling across all CPUs.

[..]
>  asmlinkage int vprintk_emit(int facility, int level,
>                           const char *dict, size_t dictlen,
> @@ -1849,6 +1918,17 @@ asmlinkage int vprintk_emit(int facility, int level,
>  
>       /* This stops the holder of console_sem just where we want him */
>       logbuf_lock_irqsave(flags);
> +
> +     if (recursion_check_test()) {
> +             /* A printk happened within a printk at the same context */
> +             if (this_cpu_inc_return(recursion_count) > recursion_max) {
> +                     atomic_inc(&recursion_overflow);
> +                     logbuf_unlock_irqrestore(flags);
> +                     printed_len = 0;
> +                     goto out;
> +             }
> +     }

didn't have time to look at this carefully, but is this possible?

printks from console_unlock()->call_console_drivers() are redirected
to printk_safe buffer. we need irq_work on that CPU to flush its
printk_safe buffer.

>  EXPORT_SYMBOL(vprintk_emit);
> @@ -2343,9 +2428,14 @@ void console_unlock(void)
>                       seen_seq = log_next_seq;
>               }
>  
> -             if (console_seq < log_first_seq) {
> +             if (console_seq < log_first_seq || 
> atomic_read(&recursion_overflow)) {
> +                     size_t missed;
> +
> +                     missed = atomic_xchg(&recursion_overflow, 0);
> +                     missed += log_first_seq - console_seq;
> +
>                       len = sprintf(text, "** %u printk messages dropped 
> **\n",
> -                                   (unsigned)(log_first_seq - console_seq));
> +                                   (unsigned)missed);
>  
>                       /* messages are gone, move to first one */
>                       console_seq = log_first_seq;

how are we going to distinguish between lockdep splats, for instance,
or WARNs from call_console_drivers() -> foo_write(), which are valuable,
and kmalloc() print outs, which might be less valuable? are we going to
lose all of them now? then we can do a much simpler thing - steal one
bit from `printk_context' and use if for a new PRINTK_NOOP_CONTEXT, which
will be set around call_console_drivers(). vprintk_func() would redirect
printks to vprintk_noop(fmt, args), which will do nothing.

        -ss

Reply via email to