On Fri, Jan 29, 2016 at 09:43:37PM +0900, Byungchul Park wrote: > changes from v4 to v5 > - found out a clear scenario which make a system crazy. at least it > should not be caused by the debug code.
Hello, Andrew Please take this v5 patch instead of v2 patch, which you took. Or give your opinion. Thanks, Byungchul > > changes from v3 to v4 > - reuse a existing code as much as possible for preventing an infinite > recursive cycle. > > changes from v2 to v3 > - avoid printk() only in case of lockup suspected, not real lockup in > which case it does not help at all. > - consider not only console_sem.lock but also logbuf_lock which is used > by printk(). > > changes from v1 to v2 > - only change comment and commit message esp. replacing "deadlock" with > "infinite recursive cycle", since it is not an actual deadlock. > > thanks, > byungchul > > -----8<----- > >From eed077240e0b0d9f14d91037ef1915feab85aa4d Mon Sep 17 00:00:00 2001 > From: Byungchul Park <byungchul.p...@lge.com> > Date: Fri, 29 Jan 2016 21:23:24 +0900 > Subject: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the > debug code > > It causes an infinite recursive cycle when using CONFIG_DEBUG_SPINLOCK, > in the spin_dump(). Backtrace prints printk() -> console_trylock() -> > do_raw_spin_lock() -> spin_dump() -> printk()... infinitely. > > When the spin_dump() is called from printk(), we should prevent the > debug spinlock code from calling printk() again in that context. It's > reasonable to avoid printing "lockup suspected" which is just a warning > message but it would cause a real lockup definitely. > > The scenario is, > > cpu0 > ==== > printk > console_trylock > console_unlock > up_console_sem > up > raw_spin_lock_irqsave(&sem->lock, flags) > __up > wake_up_process > try_to_wake_up > raw_spin_lock_irqsave(&p->pi_lock) > __spin_lock_debug > spin_dump <=== the problem point! > printk > console_trylock > raw_spin_lock_irqsave(&sem->lock, flags) > > <=== DEADLOCK > > cpu1 > ==== > printk > console_trylock > raw_spin_lock_irqsave(&sem->lock, flags) > __spin_lock_debug > spin_dump > printk > ... > > <=== repeat the recursive cycle infinitely > > Signed-off-by: Byungchul Park <byungchul.p...@lge.com> > --- > kernel/locking/spinlock_debug.c | 16 +++++++++++++--- > kernel/printk/printk.c | 5 +++++ > 2 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c > index 0374a59..cf7bc96 100644 > --- a/kernel/locking/spinlock_debug.c > +++ b/kernel/locking/spinlock_debug.c > @@ -103,6 +103,8 @@ static inline void debug_spin_unlock(raw_spinlock_t *lock) > lock->owner_cpu = -1; > } > > +extern int is_console_lock(raw_spinlock_t *lock); > + > static void __spin_lock_debug(raw_spinlock_t *lock) > { > u64 i; > @@ -113,11 +115,19 @@ static void __spin_lock_debug(raw_spinlock_t *lock) > return; > __delay(1); > } > - /* lockup suspected: */ > - spin_dump(lock, "lockup suspected"); > + > + /* > + * If this function is called from printk(), then we should > + * not call printk() more. Or it will cause an infinite > + * recursive cycle! > + */ > + if (likely(!is_console_lock(lock))) { > + /* lockup suspected: */ > + spin_dump(lock, "lockup suspected"); > #ifdef CONFIG_SMP > - trigger_all_cpu_backtrace(); > + trigger_all_cpu_backtrace(); > #endif > + } > > /* > * The trylock above was causing a livelock. Give the lower level arch > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 2ce8826..568ab11 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1981,6 +1981,11 @@ asmlinkage __visible void early_printk(const char > *fmt, ...) > } > #endif > > +int is_console_lock(raw_spinlock_t *lock) > +{ > + return lock == &console_sem.lock; > +} > + > static int __add_preferred_console(char *name, int idx, char *options, > char *brl_options) > { > -- > 1.9.1