On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:
> Hello!
>
> I get the splat below at a rate of roughly two per thirty hours when
> running rcutorture scenario TREE03 on x86 at the June 3rd mainline commit:
>
> cb8e59cc8720 ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
>
> Running 140 hours of this same scenario at the following June 2nd mainline
> commit shows no errors:
>
> d9afbb350990 ("Merge branch 'next-general' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security")
>
> I have started a bisection, but it is likely to take several days to
> complete. I am looking at ways of speeding this up, but in the meantime,
> I figured that I should check to see if others are also encountering this.
>
> Thoughts?

I think this shows there's a boo-boo with the IPI patches. I've not
managed to reproduce, but I'll give them another hard look.

Would you have a .config for me? My compiler's check_preempt_wakeup
isn't anywhere near 0x180 bytes long. I'm thinking you have
instrumentation enabled, KCSAN?

> BUG: kernel NULL pointer dereference, address: 0000000000000150
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7
> 04/01/2014
> RIP: 0010:check_preempt_wakeup+0xb1/0x180
> Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48
> 85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00
> 48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
> RSP: 0018:ffffaccdc02ecd38 EFLAGS: 00010006
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffafa0bc20
> RDX: 0000000000000000 RSI: ffff946b5df50000 RDI: ffff946b5f469340
> RBP: 0000000000000000 R08: ffff946b5df80d00 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff946b5f469300
> R13: 0000000000000001 R14: ffff946b5df80d00 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff946b5f440000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000150 CR3: 0000000016e0a000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> check_preempt_curr+0x5d/0x90
> ttwu_do_wakeup.isra.93+0xf/0x100
> sched_ttwu_pending+0x66/0x90
> smp_call_function_single_interrupt+0x2d/0xf0
> call_function_single_interrupt+0xf/0x20

Right, so I frobbed at that recently, see:

  a148866489fbe243c936fe43e4525d8dbfa0318f...19a1f5ec699954d21be10f74ff71c2a7079e99ad
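
As a cross-check on the oops quoted above, here is a minimal sketch that decodes the faulting instruction (the bytes starting at the `<48>` marker on the Code: line) and compares its effective address with CR2. The helper names are hypothetical, and the decoder deliberately handles only the single encoding seen here (REX.W `8B /r` with an rbp + disp32 operand); it is not a general x86 disassembler.

```python
# Values copied from the oops; nothing else is assumed about the kernel build.
CODE_LINE = (
    "83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 "
    "85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 "
    "48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94"
)
RBP = 0x0000000000000000
CR2 = 0x0000000000000150

# Register numbering for the ModRM reg field when REX.R is clear.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line, length):
    """Return `length` bytes starting at the <xx> marker (hypothetical helper)."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:start + length])


def decode_mov_from_rbp_disp32(insn):
    """Decode 'mov disp32(%rbp), %reg' only; reject every other encoding."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not REX.W + 8B /r")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm != 0b101:
        raise ValueError("operand is not rbp + disp32")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return REG_NAMES[reg], disp


if __name__ == "__main__":
    dest, disp = decode_mov_from_rbp_disp32(faulting_bytes(CODE_LINE, 7))
    print(f"mov {disp:#x}(%rbp), %{dest}")
    print(f"effective address {RBP + disp:#x}, CR2 {CR2:#x}")
```

With RBP reported as zero, the load at offset 0x150 from RBP lands exactly on the CR2 address, so the fault is a read through a NULL pointer held in RBP. Which structure and field sit at that offset depends on the .config, so this sketch does not try to name them.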