Also, you can probably 'c'ontinue at that point.

-- 
Sent from a phone, apologies for poor formatting.

On 15 September 2021 15:23:50 Martin Pieuchot <[email protected]> wrote:
> On 15/09/21(Wed) 12:06, Paul de Weerd wrote:
>> Hi all,
>>
>> After some off-list advice from Patrick to enable MP_LOCKDEBUG in
>> order to debug the hangs I reported [1], I did exactly that and was
>> running a self-built kernel for some time.
>>
>> This morning, I wanted to upgrade to the latest snapshot so I also
>> cvs up'd and rebuilt my kernel with MP_LOCKDEBUG.  However, now I
>> get __mp_lock_spin during boot:
>>
>> root on sd2a (a0b80508b6693ba1.a) swap on sd2b dump on sd2b
>> inteldrm0: 1920x1080, 32bpp
>> wsdisplay0 at inteldrm0 mux 1
>> __mp_lock_spin: 0xffffffff822d1120 lock spun out
>> Stopped at      db_enter+0x10:  popq    %rbp
>> ddb{1}> trace
>> db_enter() at db_enter+0x10
>> __mp_lock(ffffffff822d1120) at __mp_lock+0xa2
>> __mp_acquire_count(ffffffff822d1120,1) at __mp_acquire_count+0x38
>> mi_switch() at mi_switch+0x299
>> sleep_finish(ffff8000226d4f80,1) at sleep_finish+0x11c
>> msleep(ffff80000011d980,ffff80000011d998,20,ffffffff81e828e3,0) at msleep+0xcc
>> taskq_next_work(ffff80000011d980,ffff8000226d5040) at taskq_next_work+0x61
>> taskq_thread(ffff80000011d980) at taskq_thread+0x6c
>> end trace frame: 0x0, count: -8
>
> That means another CPU is holding the KERNEL_LOCK() for too long.
> When this happens, the trace of the CPU that spun out matters less
> than what the other CPUs are doing, because one of them is the
> holder.
>
> If you can reproduce this, please include the output of "ps /o" and
> the trace from all the CPUs.
>
> Note that the default value of MP_LOCKDEBUG might be too sensitive
> for some workloads.  Using WITNESS might not spot the same issue, but
> it does not produce false positives.
>
> Thanks,
> Martin
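
For reference, a rough sketch of how the requested information could be collected at the ddb prompt. It assumes an amd64 kernel, where ddb(4) provides the "machine ddbcpu" command to switch CPUs, and it assumes the other CPU is number 0; the CPU numbers here are illustrative only, and no output is shown because none was posted.

```
ddb{1}> ps /o
ddb{1}> trace
ddb{1}> machine ddbcpu 0
ddb{0}> trace
```

Repeat the last two commands for every remaining CPU, so that the trace of whichever CPU holds the KERNEL_LOCK() ends up in the report.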
