On Fri 2018-06-01 13:40:50, Sergey Senozhatsky wrote:
> On (05/31/18 14:21), Petr Mladek wrote:
> > > 
> > > Upstream printk has no printing kthread. And we also run
> > > printk()->console_unlock() with disabled preemption.
> > 
> > Yes, the comment was wrong
> 
> Yes, that was the only thing I meant.
> I really didn't have any time to look at the patch yesterday, just
> commented on the most obvious thing.

Fair enough.

> > but the problem is real.
> 
> Yep, could be. But not exactly the way it is described in the commit
> messages and the patch does not fully address the problem.
> 
> The patch assumes that all those events happen sequentially. While
> in reality they can happen in parallel on different CPUs.
> 
> Example:
> 
>       CPU0                                    CPU1
> 
>       set console verbose
> 
>       dump_backtrace()
>       {
>               // for (;;)  print frames
>               printk("%pS\n", frame0);
>               printk("%pS\n", frame1);
>               printk("%pS\n", frame2);
>               printk("%pS\n", frame3);
>               ...                             console_loglevel = CONSOLE_LOGLEVEL_SILENT;
>               printk("%pS\n", frame12);
>               printk("%pS\n", frame13);
>       }
> 
> Part of backtrace or the entire backtrace will be missed, because
> we read the global console_loglevel. The problem is still there.

[...]

> So I'd say that most likely the following scenarios can suffer:
> 
> - NMI comes in, sets loglevel to X, printk-s some data, restores the
>   loglevel back to Y
> - IRQ comes in [like sysrq, etc] and does the same thing
> - software exception comes in and does the same thing [e.g. bust_spinlocks()
>   at arch/s390/mm/fault.c]


My view is:

The race with another printk() (console_lock owner) is much more
likely than a race between two CPUs manipulating console_loglevel.

The proposed patch seems to go in the right direction. It is supposed
to fix the most likely scenario. We could block it and request a full
solution, but I wonder if it is worth it.

I am personally fine with this partial solution for now. We could
always make it better if people hit the other scenarios.
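
To illustrate the more likely race mentioned above (messages are stored
under one loglevel, but pushed to the console later, after the loglevel
has been restored), here is a minimal userspace sketch in C. It is not
the kernel code and not the proposed patch: store_msg(), flush_console(),
replay(), the "suppressed" field and the numeric loglevel values are
hypothetical and exist only in this model. It assumes that a message is
suppressed when its level is greater than or equal to console_loglevel,
and it compares making that decision at flush time with making it at
store time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for this model only. */
#define LOGLEVEL_SILENT   0    /* console shows nothing */
#define LOGLEVEL_VERBOSE 15    /* console shows everything */
#define MAX_MSGS         16

struct msg {
	int level;        /* severity, lower means more important */
	bool suppressed;  /* decision recorded when the message was stored */
};

static int console_loglevel = LOGLEVEL_VERBOSE;
static struct msg log_buf[MAX_MSGS];
static size_t nr_msgs;

/* Models the storing half of printk(): nothing reaches the console yet. */
static void store_msg(int level)
{
	assert(nr_msgs < MAX_MSGS);
	log_buf[nr_msgs].level = level;
	log_buf[nr_msgs].suppressed = level >= console_loglevel;
	nr_msgs++;
}

/*
 * Models the console_lock owner pushing stored messages to the console.
 * Returns how many messages would be shown.
 */
static size_t flush_console(bool decide_at_store_time)
{
	size_t shown = 0;

	for (size_t i = 0; i < nr_msgs; i++) {
		bool skip = decide_at_store_time
			? log_buf[i].suppressed
			: log_buf[i].level >= console_loglevel;

		if (!skip)
			shown++;
	}
	return shown;
}

/* Replays: set verbose, store 4 frames, restore loglevel, then flush. */
static size_t replay(bool decide_at_store_time)
{
	nr_msgs = 0;
	console_loglevel = LOGLEVEL_VERBOSE;

	for (int i = 0; i < 4; i++)
		store_msg(4);   /* backtrace frames */

	/* The loglevel is restored before the messages are flushed. */
	console_loglevel = LOGLEVEL_SILENT;

	return flush_console(decide_at_store_time);
}
```

In this model, replay(false) returns 0 because the flush reads the
already restored global loglevel, while replay(true) returns 4 because
the decision was taken when the frames were stored. The model is
sequential on purpose; it does not cover the case where another CPU
changes console_loglevel in the middle of the backtrace.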

Best Regards,
Petr
