Hi Richard, On Sun, Dec 06, 2020 at 11:59:16PM +0100, Richard Weinberger wrote: > Hi! > > With both KMEMLEAK and KASAN enabled, I'm facing the following lockdep > splat at random times on Linus' tree as of today. > Sometimes it happens at bootup, sometimes much later when userspace has > started. > > Does this ring a bell? > > [ 2.298447] ============================= > [ 2.298971] [ BUG: Invalid wait context ] > [ 2.298971] 5.10.0-rc6+ #388 Not tainted > [ 2.298971] ----------------------------- > [ 2.298971] ksoftirqd/1/15 is trying to lock: > [ 2.298971] ffff888100b94598 (&n->list_lock){....}-{3:3}, at: > free_debug_processing+0x3d/0x210
I guest you also had CONFIG_PROVE_RAW_LOCK_NESTING=y, right? With that config, the wait context detetion of lockdep will treat spinlock_t as sleepable locks (considering PREEMPT_RT kernel), and here it complained about trying to acquire a sleepable lock (in PREEMPT_RT kernel) inside a irq context which cannot be threaded (in this case, it's the IPI). A proper fix will be modifying kmem_cache_node->list_lock to raw_spinlock_t. Regards, Boqun > [ 2.298971] other info that might help us debug this: > [ 2.298971] context-{2:2} > [ 2.298971] 1 lock held by ksoftirqd/1/15: > [ 2.298971] #0: ffffffff835f4140 (rcu_callback){....}-{0:0}, at: > rcu_core+0x408/0x1040 > [ 2.298971] stack backtrace: > [ 2.298971] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted 5.10.0-rc6+ #388 > [ 2.298971] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS rel-1.12.0-0-ga698c89-rebuilt.opensuse.org 04/01/2014 > [ 2.298971] Call Trace: > [ 2.298971] <IRQ> > [ 2.298971] dump_stack+0x9a/0xcc > [ 2.298971] __lock_acquire.cold+0xce/0x34b > [ 2.298971] ? lockdep_hardirqs_on_prepare+0x1f0/0x1f0 > [ 2.298971] ? rcu_read_lock_sched_held+0x9c/0xd0 > [ 2.298971] lock_acquire+0x153/0x4c0 > [ 2.298971] ? free_debug_processing+0x3d/0x210 > [ 2.298971] ? lock_release+0x690/0x690 > [ 2.298971] ? rcu_read_lock_bh_held+0xb0/0xb0 > [ 2.298971] ? pvclock_clocksource_read+0xd9/0x1a0 > [ 2.298971] _raw_spin_lock_irqsave+0x3b/0x80 > [ 2.298971] ? free_debug_processing+0x3d/0x210 > [ 2.298971] ? qlist_free_all+0x35/0xd0 > [ 2.298971] free_debug_processing+0x3d/0x210 > [ 2.298971] __slab_free+0x286/0x490 > [ 2.298971] ? lockdep_enabled+0x39/0x50 > [ 2.298971] ? rcu_read_lock_sched_held+0x9c/0xd0 > [ 2.298971] ? run_posix_cpu_timers+0x256/0x2c0 > [ 2.298971] ? rcu_read_lock_bh_held+0xb0/0xb0 > [ 2.298971] ? posix_cpu_timers_exit_group+0x30/0x30 > [ 2.298971] qlist_free_all+0x59/0xd0 > [ 2.298971] ? qlist_free_all+0xd0/0xd0 > [ 2.298971] per_cpu_remove_cache+0x47/0x50 > [ 2.298971] flush_smp_call_function_queue+0xea/0x2b0 > [ 2.298971] __sysvec_call_function+0x6c/0x250 > [ 2.298971] asm_call_irq_on_stack+0x12/0x20 > [ 2.298971] </IRQ> > [ 2.298971] sysvec_call_function+0x84/0xa0 > [ 2.298971] asm_sysvec_call_function+0x12/0x20 > [ 2.298971] RIP: 0010:__asan_load4+0x1d/0x80 > [ 2.298971] Code: 10 00 75 ee c3 0f 1f 84 00 00 00 00 00 4c 8b 04 > 24 48 83 ff fb 77 4d 48 b8 ff ff ff ff ff 7f ff ff 48 39 c7 76 3e 48 > 8d 47 03 <48> 89 c2 83 e2 07 48 83 fa 02 76 17 48 b9 00 00 00 00 00 fc > ff df > [ 2.298971] RSP: 0000:ffff888100e4f858 EFLAGS: 00000216 > [ 2.298971] RAX: ffffffff83c55773 RBX: ffffffff81002431 RCX: > dffffc0000000000 > [ 2.298971] RDX: 0000000000000001 RSI: ffffffff83ee8d78 RDI: > ffffffff83c55770 > [ 2.298971] RBP: ffffffff83c5576c R08: ffffffff81083433 R09: > fffffbfff07e333d > [ 2.298971] R10: 000000000001803d R11: fffffbfff07e333c R12: > ffffffff83c5575c > [ 2.298971] R13: ffffffff83c55774 R14: ffffffff83c55770 R15: > ffffffff83c55770 > [ 2.298971] ? ret_from_fork+0x21/0x30 > [ 2.298971] ? __orc_find+0x63/0xc0 > [ 2.298971] ? stack_access_ok+0x35/0x90 > [ 2.298971] __orc_find+0x63/0xc0 > [ 2.298971] unwind_next_frame+0x1ee/0xbd0 > [ 2.298971] ? ret_from_fork+0x22/0x30 > [ 2.298971] ? ret_from_fork+0x21/0x30 > [ 2.298971] ? deref_stack_reg+0x40/0x40 > [ 2.298971] ? __unwind_start+0x2e8/0x370 > [ 2.298971] ? create_prof_cpu_mask+0x20/0x20 > [ 2.298971] arch_stack_walk+0x83/0xf0 > [ 2.298971] ? ret_from_fork+0x22/0x30 > [ 2.298971] ? rcu_core+0x488/0x1040 > [ 2.298971] stack_trace_save+0x8c/0xc0 > [ 2.298971] ? stack_trace_consume_entry+0x80/0x80 > [ 2.298971] ? sched_clock_local+0x99/0xc0 > [ 2.298971] kasan_save_stack+0x1b/0x40 > [ 2.298971] ? kasan_save_stack+0x1b/0x40 > [ 2.298971] ? kasan_set_track+0x1c/0x30 > [ 2.298971] ? kasan_set_free_info+0x1b/0x30 > [ 2.298971] ? __kasan_slab_free+0x10f/0x150 > [ 2.298971] ? kmem_cache_free+0xa8/0x350 > [ 2.298971] ? rcu_core+0x488/0x1040 > [ 2.298971] ? __do_softirq+0x101/0x573 > [ 2.298971] ? run_ksoftirqd+0x21/0x50 > [ 2.298971] ? smpboot_thread_fn+0x1fc/0x380 > [ 2.298971] ? kthread+0x1c7/0x220 > [ 2.298971] ? ret_from_fork+0x22/0x30 > [ 2.298971] ? lockdep_hardirqs_on_prepare+0x1f0/0x1f0 > [ 2.298971] ? rcu_read_lock_sched_held+0x9c/0xd0 > [ 2.298971] ? lock_acquire+0x153/0x4c0 > [ 2.298971] ? rcu_core+0x408/0x1040 > [ 2.298971] ? lock_release+0x690/0x690 > [ 2.298971] ? lockdep_enabled+0x39/0x50 > [ 2.298971] ? mark_held_locks+0x49/0x90 > [ 2.298971] kasan_set_track+0x1c/0x30 > [ 2.298971] kasan_set_free_info+0x1b/0x30 > [ 2.298971] __kasan_slab_free+0x10f/0x150 > [ 2.298971] ? rcu_core+0x488/0x1040 > [ 2.298971] kmem_cache_free+0xa8/0x350 > [ 2.298971] ? __ia32_compat_sys_move_pages+0x130/0x130 > [ 2.298971] rcu_core+0x488/0x1040 > [ 2.298971] ? call_rcu+0x5d0/0x5d0 > [ 2.298971] ? rcu_read_lock_sched_held+0x9c/0xd0 > [ 2.298971] ? rcu_read_lock_bh_held+0xb0/0xb0 > [ 2.298971] __do_softirq+0x101/0x573 > [ 2.298971] ? trace_event_raw_event_irq_handler_exit+0x150/0x150 > [ 2.298971] run_ksoftirqd+0x21/0x50 > [ 2.298971] smpboot_thread_fn+0x1fc/0x380 > [ 2.298971] ? smpboot_register_percpu_thread+0x180/0x180 > [ 2.298971] ? __kthread_parkme+0xbb/0xd0 > [ 2.298971] ? smpboot_register_percpu_thread+0x180/0x180 > [ 2.298971] kthread+0x1c7/0x220 > [ 2.298971] ? kthread_create_on_node+0xd0/0xd0 > [ 2.298971] ret_from_fork+0x22/0x30 > > -- > Thanks, > //richard