Re: [RFC][PATCH 0/4] printk: introduce printing kernel thread

Petr Mladek Fri, 24 Mar 2017 07:43:49 -0700

On Fri 2017-03-24 10:59:36, Sergey Senozhatsky wrote:
> On (03/23/17 09:51), Peter Zijlstra wrote:
> [..]
> > > > sysrq runs from interrupt context, right? Should be able to do wakeups.
> > > 
> > > what I thought about was -
> > >   what if there are 'misbehaving' higher prio tasks all the time?
> > >   the existing sysrq would attempt to do printing from irq context
> > >   so it doesn't care about run queues.
> > > 
> > > does it make sense to you?
> > 
> > Ah, that's what you meant. Yeah, dunno, I'm still unconvinced about the
> > whole printk thread thing.
> 
> I see your point.
> but I can't think of alternatives that would fix all those lockups and
> stalls and at the same time have better guarantees than printk_kthread.
> 
> 
> > Also those function names are horrifically long.
> 
> right. not happy with the naming either.
> 
> so what I'm thinking about right now is:
> 
> we have that thing which we call "old printk" mode, which is not
> really informative. and my proposal is to rename "old" mode and use
> "printk rescue" mode instead. because we switch to that mode when
> we are trying to "rescue" kernel logs. so the API can be something
> like
>               printk_rescue_on()
>               printk_rescue_off()


Sounds good to me. A slight problem is that off() does not
actually stop the mode when the calls are nested.

Just one more attempt inspired by this:

                printk_emergency_begin()
                printk_emergency_end()

Note that we also enter this mode automatically when
a pr_emerg() message is printed.

But I am fine with any of the generic names mentioned.
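
To illustrate the nesting point, here is a minimal user-space sketch.
It assumes the mode is tracked by a plain depth counter; the variable
emergency_depth and the helper printk_in_emergency() are hypothetical
names used only for this example, and nothing here is the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: number of begin() calls not yet matched by end(). */
static int emergency_depth;

void printk_emergency_begin(void)
{
	emergency_depth++;
}

void printk_emergency_end(void)
{
	/* An end() without a matching begin() would be a caller bug. */
	assert(emergency_depth > 0);
	emergency_depth--;
}

/* The mode stays active until the outermost end() has been called. */
bool printk_in_emergency(void)
{
	return emergency_depth > 0;
}
```

With this model an inner end() only drops the depth by one, which is
why begin()/end() describes the behaviour better than on()/off().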

> 
> --- random thoughts ---
> 
> another thing that bothers me a bit is that we need to place those
> printk_rescue_on/printk_rescue_off switches all over the kernel.
> sort of a root cause [in some of the cases] here is the fact that
> we don't have any feedback from printk_kthread in vprintk_emit():
>       does printk_kthread make any progress?
>       do we flush messages to the serial console?
>       etc.
> 
> and we've got everything we need to have such a feedback in
> vprintk_emit():
> 
>       a) console is not suspended so console_unlock() can call console drivers
>       b) printk_kthread != NULL
>       c) we are not in enforced rescue/emergency mode
>       d) `log_next_seq' moves forward (always `true', we are in vprintk_emit())
>       e) `console_seq' stands still
> 
> so we can have an automatic rescue mode fallback in vprintk_emit().
> if (a)-(e) are true then we give up on waking up printk_kthread,
> switch to rescue mode and attempt to console_trylock() directly from
> vprintk_emit(). the part that sucks here is that we need to give
> printk_kthread some time to catch up. for instance, if (e) is true
> for the past 50 invocations of vprintk_emit(), IOW:
> 
>       - we added 50 lines to printk
>       - none have been printed on the serial console
>
> then we
>       - declare rescue
>       - do console_trylock() instead of wake_up() //unless in deferred vprintk_emit()

I am not sure if we are able to distinguish a flood of messages
from a real emergency situation.

If we start flushing messages directly when there is a flood
of messages, we will bring back the original problem with soft
lockups.

Well, there are only a handful of annotated locations at the moment.
I would start thinking about automatic detection once we have
more of them and more data for a good heuristic.
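
To make the concern concrete, here is a rough user-space model of the
check described in (a)-(e). It is only a sketch under assumptions:
struct printk_model, model_emit(), and STALL_LIMIT are hypothetical
names, the threshold of 50 is taken from the example above, and none
of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STALL_LIMIT 50	/* the "50 invocations" figure from the example */

/* Hypothetical snapshot of the state that conditions (a)-(e) look at. */
struct printk_model {
	bool console_suspended;		/* (a) must be false */
	bool have_kthread;		/* (b) printk_kthread != NULL */
	bool forced_emergency;		/* (c) must be false */
	uint64_t log_next_seq;		/* (d) advances on every emit */
	uint64_t console_seq;		/* (e) advanced by the printing side */
	uint64_t last_console_seq;	/* console_seq seen by the last emit */
	unsigned int stalled;		/* emits in a row without progress */
};

/*
 * Model one vprintk_emit() call. Returns true when the heuristic would
 * give up on the wakeup and try to print directly.
 */
bool model_emit(struct printk_model *m)
{
	m->log_next_seq++;
	assert(m->console_seq <= m->log_next_seq);

	if (m->console_seq != m->last_console_seq) {
		/* The console made progress since the previous emit. */
		m->last_console_seq = m->console_seq;
		m->stalled = 0;
	} else {
		m->stalled++;
	}

	return !m->console_suspended && m->have_kthread &&
	       !m->forced_emergency && m->stalled >= STALL_LIMIT;
}
```

In this model a burst of 50 messages that arrives before the printing
thread gets to run trips the check exactly like a real stall does,
which is the ambiguity I mean.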

I would still like to see a kernel parameter/sysfs knob
that would allow forcing the rescue/emergency mode all
the time ;-)

Best Regards,
Petr
