On Sat 2016-08-20 14:24:30, Sergey Senozhatsky wrote:
> On (08/19/16 21:00), Jan Kara wrote:
> > > > depending on .config BUG() may never return back -- passing control
> > > > to do_exit(), so printk_deferred_exit() won't be executed. thus we
> > > > probably need to have a per-cpu variable that would indicate that
> > > > we are in deferred_bug. hm... but do we really need deferred BUG()
> > > > in the first place?
> > >
> > > Good question. I am not aware of any BUG_ON() that would be called from
> > > wake_up_process() but it is hard to check everything.
> > >
> > > A conservative approach would be to force synchronous printk from
> > > BUG_ON().
> >
> > Just a quick thought: Cannot we just do printk_deferred_enter() when we are
> > about to call into the scheduler from printk code and printk_deferred_exit()
> > when leaving it? That would look like the least error-prone way how
> > handling this kind of recursion...
>
> interesting idea.
> printk_deferred_enter() increments preempt count, so there may be additional
> obstacles and, as a result, ad-hocs, that scheduler people will sincerely
> hate.
> need to think more.
I wonder if this would be acceptable at least for wake_up_process().
It seems to be the only scheduler function that we are interested in.
And we might call it from vprintk

> > OTOH there's also the other possible direction for the recursion when we
> > are in the scheduler, holding some scheduler locks, decide to WARN which
> > enters printk, that ends up calling wake_up_process() which deadlocks
> > on scheduler locks... I don't see how to handle this type of recursion
> > inside the printk code itself easily and so far the answer was - use
> > printk_deferred() in the scheduler and don't use WARN...
>
> the recursion detection is really tricky, yes. it seems (and I haven't
> thought of it good enough) to be a bit simpler when we operate in async
> printk mode, because we remove this uncontrollable console_unlock().
> so we can do something like this:
>
> vprintk_emit(....)
> {
> 	local_irq_save();
>
> 	if (this_cpu_read(in_printk)) {
> 		log_store(BUG: printk recursion!");
> 		goto out;
> 	}

This does not guarantee that we have the logbuf_lock. We might end up
here from the raw_spin_lock() call and the lock might be owned by
another CPU. I am afraid that we could only set some global variable here.

>
> 	this_cpu_write(in_printk) = 1;
>
> 	raw_spin_lock(&logbuf_lock);
> 	log_store();
> 	raw_spin_unlock(&logbuf_lock);
>
> 	if (!in_sched) {
> 		if (console_loglevel != CONSOLE_LOGLEVEL_MOTORMOUTH &&
> 				can_printk_async()) {
> 			printk_kthread_need_flush_console = true;
> 			wake_up_process(printk_kthread);
> 		}
> 	}
>
> 	this_cpu_write(in_printk) = 0;
> out:
> 	local_irq_restore();
> }
>
> async printk mode from this point of view is sort of atomic.

This would prevent using printk_deferred() from the scheduler code.
A solution would be to set the per-CPU variable only around the
wake_up_process() call. Well, it is orthogonal to using
printk_deferred_enter() around calling wake_up_process().

Best Regards,
Petr