On Tue, Mar 28, 2017 at 05:07:09PM +0800, Fengguang Wu wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> commit 857811a37129f5d2ba162d7be3986eff44724014
> Author:     Boqun Feng <boqun.f...@gmail.com>
> AuthorDate: Wed Mar 1 23:01:38 2017 +0800
> Commit:     Ingo Molnar <mi...@kernel.org>
> CommitDate: Thu Mar 2 09:00:39 2017 +0100
>
>     locking/ww_mutex: Adjust the lock number for stress test
>
>     Because there are only 12 bits in held_lock::references, so we only
>     support 4095 nested lock held in the same time, adjust the lock number
>     for ww_mutex stress test to kill one lockdep splat:
>
>       [ ] [ BUG: bad unlock balance detected! ]
>       [ ] kworker/u2:0/5 is trying to release lock (ww_class_mutex) at:
>       [ ] ww_mutex_unlock()
>       [ ] but there are no more locks to release!
>       ...
>
>     Signed-off-by: Boqun Feng <boqun.f...@gmail.com>
>     Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>     Cc: Andrew Morton <a...@linux-foundation.org>
>     Cc: Chris Wilson <ch...@chris-wilson.co.uk>
>     Cc: Fengguang Wu <fengguang...@intel.com>
>     Cc: Linus Torvalds <torva...@linux-foundation.org>
>     Cc: Nicolai Hähnle <nicolai.haeh...@amd.com>
>     Cc: Paul E. McKenney <paul...@linux.vnet.ibm.com>
>     Cc: Peter Zijlstra <pet...@infradead.org>
>     Cc: Thomas Gleixner <t...@linutronix.de>
>     Link: http://lkml.kernel.org/r/20170301150138.hdixnmafzfsox...@tardis.cn.ibm.com
>     Signed-off-by: Ingo Molnar <mi...@kernel.org>
>
> 7fb4a2cea6  locking/lockdep: Add nest_lock integrity test
> 857811a371  locking/ww_mutex: Adjust the lock number for stress test
> c02ed2e75e  Linux 4.11-rc4
> 7f0c4a163a  Add linux-next specific files for 20170327
> +-----------------------------------------------------+------------+------------+-----------+---------------+
> |                                                     | 7fb4a2cea6 | 857811a371 | v4.11-rc4 | next-20170327 |
> +-----------------------------------------------------+------------+------------+-----------+---------------+
> | boot_successes                                      | 0          | 16         | 29        | 1             |
> | boot_failures                                       | 221        | 42         | 64        | 9             |
> | WARNING:at_kernel/locking/lockdep.c:#__lock_acquire | 221        |            |           |               |
> | BUG:kernel_hang_in_boot_stage                       | 0          | 42         | 60        | 9             |
> | BUG:kernel_hang_in_test_stage                       | 0          | 0          | 4         |               |
> +-----------------------------------------------------+------------+------------+-----------+---------------+
>
> [ 319.426004] CPU 0 is now offline
> [ 319.426004] CPU 0 is now offline
> [ 319.427670]
> <kernel hangs here>
>
Hmm, with the same reproduce script, I'm able to reproduce the hang even
with CONFIG_WW_MUTEX_SELFTEST=n.

Fengguang, could you try to verify whether the ww_mutex test is related
to this hang?

Regards,
Boqun