Linus Torvalds wrote: > On Tue, Aug 29, 2017 at 1:41 PM, Tetsuo Handa > <penguin-ker...@i-love.sakura.ne.jp> wrote: > >> > >> A private buffer has none of those issues. > > > > Yes, I posted "[PATCH] printk: Add best-effort printk() buffering." at > > http://lkml.kernel.org/r/1493560477-3016-1-git-send-email-penguin-ker...@i-love.sakura.ne.jp > > . > > No, this is exactly what I *don't* want, because it takes over printk() > itself. > > And that's problematic, because nesting happens for various reasons. > > For example, you try to handle that nesting with printk_context(), and > nothing when an interrupt happens. > > But that is fundamentally broken. > > Just to give an example: what if an interrupt happens, it does this > buffering thing, then it gets interrupted by *another* interrupt, and > now the printk from that other interrupt gets incorrectly nested > together with the first one, because your "printk_context()" gives > them the same context?
My assumption was that (1) task context can be preempted by soft IRQ context, hard IRQ context and NMI context. (2) soft IRQ context can be preempted by hard IRQ context and NMI context. (3) hard IRQ context can be preempted by NMI context. (4) An kernel-oops event can interrupt task context, soft IRQ context, hard IRQ context and NMI context, but the interrupted context can not continue execution of vprintk_default() after returning from the kernel-oops event even if the kernel-oops event occurred in schedulable context and panic_on_oops == 0. and thus my "printk_context()" gives them different context. But my assumption was wrong that soft IRQ context can be preempted by different soft IRQ context (e.g. SoftIRQ1 can be preempted by SoftIRQ2 while running handler for SoftIRQ1, and SoftIRQ2 can be preempted by SoftIRQ3 while running handler for SoftIRQ2, and so on) hard IRQ context can be preempted by different hard IRQ context (e.g. HardIRQ1 can be preempted by HardIRQ2 while running handler for HardIRQ1, and HardIRQ2 can be preempted by HardIRQ3 while running handler for HardIRQ2, and so on) ? Then, we need to recognize how many IRQs are nested. I just tried to distinguish context using one "unsigned long" value by embedding IRQ status into lower bits of "struct task_struct *". I can change to distinguish context using multiple "unsigned long" values. > > And it really doesn't have to even be interrupts. Look at what happens > if you take a page fault in kernel space. Same exact deal. Both are > sleeping contexts. Is merging messages from outside a page fault and inside a page fault so serious? That would happen only if memory access which might cause a page fault occurs between get_printk_buffer() and put_printk_buffer(), and I think that such user is rare. > > So I really think that the only thing that knows what the "context" is > is the person who does the printing. So if you want to create a > printing buffer, it should be explicit. You allocate the buffer ahead > of time (perhaps on the stack, possibly using actual allocations), and > you use that explicit context. If my assumption was wrong, isn't it dangerous from stack usage point of view that we try to call kmalloc() (or allocate from stack memory) for prbuf_init() for each nested level because it is theoretically possible that a different IRQ jumps in while kmalloc() is in progress (or stack memory is in use)? > > Yes, it means that you don't do "printk()". You do an actual > "buf_print()" or similar. > > Linus > My worry is that there are so many functions which will need to receive "struct seq_buf *" argument (from tail of __dump_stack() to head of out_of_memory(), including e.g. cpuset_print_current_mems_allowed()) that patches for passing "struct seq_buf *" argument becomes so large and difficult to synchronize. I tried to pass such argument to relevant functions before I propose "[PATCH] printk: Add best-effort printk() buffering." patch, but I came to conclusion that passing such argument is too complicated and too much bloat compared to merit. If we teach printk subsystem that "I want to use buffering" via get_printk_buffer(), we don't need to scatter around "struct seq_buf *" argument throughout the kernel. Using kmalloc() for prbuf_init() also introduces problems such as (a) we need to care about safe GFP flags (i.e. GFP_ATOMIC or GFP_KERNEL or something else which cannot be managed by current_gfp_context()) based on locking context (b) allocations can fail, and printing allocation failure messages when printing original messages is disturbing (c) allocation stall/failure messages are printed under memory pressure, but stack memory is not large enough to store messages related allocation stall/failure messages and thus I want to use "statically allocated buffer" like workqueue's rescuer kernel thread which can be used under memory pressure. Linus Torvalds wrote at http://lkml.kernel.org/r/CA+55aFxmL4ybpz19OPn97VYqAk2ZS-tf=0W2Ff1K=-uub6m...@mail.gmail.com : > On Fri, Sep 1, 2017 at 10:32 AM, Joe Perches <j...@perches.com> wrote: > > > > Yes, it's a poor name. At least keep using a pr_ prefix. > > I'd suggest perhaps just "pr_line()". > > And instead of having those "err/info/cont" variations, the severity > level should just be set at initialization time. Not different > versions of "pr_line()". > > There's no point to having different severity variations, since the > *only* reason for this would be for buffering. So "pr_cont()" is kind > of assumed for everything but the first. But it is annoying for me that Lines1-for-event1-with-loglevel-foo Lines2-for-event1-with-loglevel-bar Lines3-for-event1-with-loglevel-baz (like OOM killer messages) are all treated as loglevel foo breaks console_loglevel filtering and > > And even if you end up doing multiple lines, if you actually do > different severities, you damn well shouldn't buffer them together. > They are clearly different things! two series of messages Line1-for-event1-with-loglevel-foo Line2-for-event1-with-loglevel-bar Line3-for-event1-with-loglevel-bar Line4-for-event1-with-loglevel-bar Line5-for-event1-with-loglevel-baz by task/1000 and Line1-for-event2-with-loglevel-foo Line2-for-event2-with-loglevel-bar Line3-for-event2-with-loglevel-bar Line4-for-event2-with-loglevel-bar Line5-for-event2-with-loglevel-baz by task/1001 are mixed due to not to buffering lines with different loglevels causes unreadable logs (unless printk() automatically inserts context identifier into each line like foo task/1000 Line1-for-event1-with-loglevel-foo foo task/1001 Line1-for-event2-with-loglevel-foo bar task/1001 Line2-for-event2-with-loglevel-bar bar task/1001 Line3-for-event2-with-loglevel-bar bar task/1001 Line4-for-event2-with-loglevel-bar bar task/1000 Line2-for-event1-with-loglevel-bar bar task/1000 Line3-for-event1-with-loglevel-bar bar task/1000 Line4-for-event1-with-loglevel-bar baz task/1000 Line5-for-event1-with-loglevel-baz baz task/1001 Line5-for-event2-with-loglevel-baz ).