‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, January 13, 2019 9:33 PM, Qian Cai <c...@lca.pw> wrote:

> On 1/13/19 9:20 PM, David Lechner wrote:
>
> > On 1/11/19 8:58 PM, Michel Lespinasse wrote:
> >
> > > On Fri, Jan 11, 2019 at 3:47 PM David Lechner da...@lechnology.com wrote:
> > >
> > > > On 1/11/19 2:58 PM, Qian Cai wrote:
> > > >
> > > > > A GPF was reported,
> > > > > kasan: CONFIG_KASAN_INLINE enabled
> > > > > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > > > general protection fault: 0000 [#1] SMP KASAN
> > > > > kasan_die_handler.cold.22+0x11/0x31
> > > > > notifier_call_chain+0x17b/0x390
> > > > > atomic_notifier_call_chain+0xa7/0x1b0
> > > > > notify_die+0x1be/0x2e0
> > > > > do_general_protection+0x13e/0x330
> > > > > general_protection+0x1e/0x30
> > > > > rb_insert_color+0x189/0x1480
> > > > > create_object+0x785/0xca0
> > > > > kmemleak_alloc+0x2f/0x50
> > > > > kmem_cache_alloc+0x1b9/0x3c0
> > > > > getname_flags+0xdb/0x5d0
> > > > > getname+0x1e/0x20
> > > > > do_sys_open+0x3a1/0x7d0
> > > > > __x64_sys_open+0x7e/0xc0
> > > > > do_syscall_64+0x1b3/0x820
> > > > > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > > > It turned out,
> > > > > gparent = rb_red_parent(parent);
> > > > > tmp = gparent->rb_right; <-- GPF was triggered here.
> > > > > Apparently, "gparent" is NULL which indicates "parent" is rbtree's root
> > > > > which is red. Otherwise, it will be treated properly a few lines above.
> > > > > /*
> > > > >  * If there is a black parent, we are done.
> > > > >  * Otherwise, take some corrective action as,
> > > > >  * per 4), we don't want a red root or two
> > > > >  * consecutive red nodes.
> > > > >  */
> > > > > if(rb_is_black(parent))
> > > > >         break;
> > > > > Hence, it violates the rule #1 (the root can't be red) and need a fix
> > > > > up, and also add a regression test for it. This looks like was
> > > > > introduced by 6d58452dc06 where it no longer always paint the root as
> > > > > black.
> > > > >
> > > > > Fixes: 6d58452dc06 (rbtree: adjust root color in rb_insert_color() only when necessary)
> > > > > Reported-by: Esme espl...@protonmail.ch
> > > > > Tested-by: Joey Pabalinas joeypabali...@gmail.com
> > > > > Signed-off-by: Qian Cai c...@lca.pw
> > > > >
> > > > > ---
> > > >
> > > > Tested-by: David Lechner da...@lechnology.com
> > > > FWIW, this fixed the following crash for me:
> > > > Unable to handle kernel NULL pointer dereference at virtual address
> > > > 00000004
> > >
> > > Just to clarify, do you have a way to reproduce this crash without the
> > > fix ?
> >
> > I am starting to suspect that my crash was caused by some new code
> > in the drm-misc-next tree that might be causing a memory corruption.
> > It threw me off that the stack trace didn't contain anything related
> > to drm.
> > See: https://patchwork.freedesktop.org/patch/276719/
>
> It may be useful for those who could reproduce this issue to turn on those
> memory corruption debug options to narrow down a bit.
>
> CONFIG_DEBUG_PAGEALLOC=y
> CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
> CONFIG_KASAN=y
> CONFIG_KASAN_GENERIC=y
> CONFIG_SLUB_DEBUG_ON=y

I have been using SLAB, so I configured SLAB DEBUG on a fresh pull from GitHub:

Linux syzkaller 5.0.0-rc2 #9 SMP Sun Jan 13 21:57:40 EST 2019 x86_64 ...

In an effort to get a different stack into the kernel, I figured nothing works better than a fork bomb :) Let me know if that helps.

root@syzkaller:~# gcc -o test3 test3.c
root@syzkaller:~# while : ; do ./test3 & done
[1] 5671
[2] 5672
[3] 5673
[4] 5675
[5] 5677
[6] 5693
[7] 5699
[8] 5701
[9] 5741
[ 128.063843] INFO: trying to register non-static key.
[ 128.064903] the code is fine but needs lockdep annotation.
[ 128.066010] turning off the locking correctness validator.
[ 128.067120] CPU: 0 PID: 5719 Comm: modprobe Not tainted 5.0.0-rc2 #9
[ 128.068420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1ubuntu1 04/01/2014
[ 128.070236] Call Trace:
[ 128.070763] dump_stack+0x104/0x174
[ 128.071467] register_lock_class+0x598/0x5a0
[ 128.072326] __lock_acquire+0x84/0x16d0
[ 128.073090] ? find_held_lock+0x35/0xa0
[ 128.073876] lock_acquire+0xe7/0x200
[ 128.074599] ? acct_collect+0xd9/0x250
[ 128.075352] _raw_spin_lock_irq+0x49/0x60
[ 128.076165] ? acct_collect+0xd9/0x250
[ 128.076931] acct_collect+0xd9/0x250
[ 128.077687] do_exit+0x430/0x1370
[ 128.078373] ? task_work_run+0xb1/0x110
[ 128.079158] do_group_exit+0x79/0x130
[ 128.079904] __x64_sys_exit_group+0x1c/0x20
[ 128.080751] do_syscall_64+0x99/0x2f0
[ 128.081493] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 128.082533] RIP: 0033:0x7f7f37cc7618
[ 128.083317] Code: 00 00 be 3c 00 00 00 eb 19 66 0f 1f 84 00 00 00 00 00 48 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 21 f4 48 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb
[ 128.087116] RSP: 002b:00007ffe905975c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 128.088634] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f7f37cc7618
[ 128.090035] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
[ 128.091410] RBP: 00007f7f37fa48e0 R08: 00000000000000e7 R09: ffffffffffffff98
[ 128.092866] R10: 00007ffe90597548 R11: 0000000000000246 R12: 00007f7f37fa48e0
[ 128.094386] R13: 00007f7f37fa9c20 R14: 0000000000000000 R15: 0000000000000000
[ 128.130418] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 128.132110] #PF error: [normal kernel read fault]
[ 128.133066] PGD 0 P4D 0
[ 128.133644] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 128.134575] CPU: 0 PID: 5756 Comm: kworker/u4:6 Not tainted 5.0.0-rc2 #9
[ 128.135922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1ubuntu1 04/01/2014
[ 128.137706] RIP: 0010:rb_insert_color+0x18/0x150
[ 128.138625] Code: fd c7 43 44 00 00 00 00 e9 3b ff ff ff 90 90 90 90 90 48 8b 07 48 85 c0 0f 84 38 01 00 00 48 8b 10 f6 c2 01 0f 85 34 01 00 00 <48> 8b 4a 08 49 89 d0 48 39 c1 74 4b 48 85 c9
[ 128.142347] RSP: 0018:ffffc90001143a68 EFLAGS: 00010046
[ 128.143448] RAX: ffff8880607e28a8 RBX: 0000000000000000 RCX: 0000000000000000
[ 128.144884] RDX: 0000000000000000 RSI: ffffffff865eb010 RDI: ffff88805baa09e8
[ 128.146427] RBP: ffffc90001143ab8 R08: 0000000000000001 R09: 0000000000000001
[ 128.147889] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000282
[ 128.149375] R13: ffff88805baa09c8 R14: ffff88805baa0988 R15: ffffffff84ee2f50
[ 128.150815] FS: 0000000000000000(0000) GS:ffff88807f800000(0000) knlGS:0000000000000000
[ 128.152424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 128.153638] CR2: 0000000000000008 CR3: 000000006026a000 CR4: 00000000000006f0
[ 128.155026] Call Trace:
[ 128.155536] ? create_object+0x22d/0x2c0
[ 128.156324] kmemleak_alloc+0x2f/0x50
[ 128.157062] kmem_cache_alloc+0x1b8/0x3d0
[ 128.157865] ? __anon_vma_prepare+0x113/0x1e0
[ 128.158738] __anon_vma_prepare+0x113/0x1e0
[ 128.159559] ? __pte_alloc+0x11e/0x1e0
[ 128.160300] __handle_mm_fault+0x1f8f/0x21d0
[ 128.161162] ? touch_atime+0x5f/0x140
[ 128.161917] handle_mm_fault+0x306/0x5d0
[ 128.162719] ? handle_mm_fault+0x48/0x5d0
[ 128.163598] __get_user_pages+0x53c/0xfa0
[ 128.164498] get_user_pages_remote+0x1e8/0x350
[ 128.165525] copy_strings.isra.28+0x288/0x530
[ 128.166485] copy_strings_kernel+0x56/0x80
[ 128.167335] __do_execve_file.isra.37+0x88e/0x1020
[ 128.168316] ? __do_execve_file.isra.37+0x223/0x1020
[ 128.169341] do_execve+0x4a/0x60
[ 128.170030] call_usermodehelper_exec_async+0x1b8/0x200
[ 128.171060] ? umh_complete+0x80/0x80
[ 128.171852] ret_from_fork+0x24/0x30
[ 128.172579] Modules linked in:
[ 128.173296] CR2: 0000000000000008
[ 128.174000] ---[ end trace 5243d337fc3ae408 ]---
[ 128.174952] RIP: 0010:rb_insert_color+0x18/0x150
[ 128.175899] Code: fd c7 43 44 00 00 00 00 e9 3b ff ff ff 90 90 90 90 90 48 8b 07 48 85 c0 0f 84 38 01 00 00 48 8b 10 f6 c2 01 0f 85 34 01 00 00 <48> 8b 4a 08 49 89 d0 48 39 c1 74 4b 48 85 c9
[ 128.179890] RSP: 0018:ffffc90001143a68 EFLAGS: 00010046
[ 128.180957] RAX: ffff8880607e28a8 RBX: 0000000000000000 RCX: 0000000000000000
[ 128.182400] RDX: 0000000000000000 RSI: ffffffff865eb010 RDI: ffff88805baa09e8
[ 128.183917] RBP: ffffc90001143ab8 R08: 0000000000000001 R09: 0000000000000001
[ 128.185373] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000282
[ 128.186822] R13: ffff88805baa09c8 R14: ffff88805baa0988 R15: ffffffff84ee2f50
[ 128.188247] FS: 0000000000000000(0000) GS:ffff88807f800000(0000) knlGS:0000000000000000
[ 128.189875] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 128.191024] CR2: 0000000000000008 CR3: 000000006026a000 CR4: 00000000000006f0
[ 128.192455] Kernel panic - not syncing: Fatal exception
[ 129.266473] Shutting down cpus with NMI
[ 129.272005] Kernel Offset: disabled
[ 129.272732] Rebooting in 86400 seconds..
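
For anyone following along, the condition described in the quoted patch (a red parent with no grandparent, i.e. a red root) can be illustrated in userspace. This is only a minimal sketch and not the kernel's lib/rbtree.c: `struct node`, `enum color`, and the three helper functions are hypothetical names made up for illustration, and the kernel packs the color into `__rb_parent_color` rather than using a separate field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace node; NOT the kernel's struct rb_node. */
enum color { RED, BLACK };

struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
	enum color color;
};

/* Rule #1 from the quoted patch: the root can't be red. */
static bool root_is_black(const struct node *root)
{
	return root == NULL || root->color == BLACK;
}

/* No two consecutive red nodes anywhere below n. */
static bool no_red_red(const struct node *n)
{
	if (n == NULL)
		return true;
	if (n->color == RED &&
	    ((n->left && n->left->color == RED) ||
	     (n->right && n->right->color == RED)))
		return false;
	return no_red_red(n->left) && no_red_red(n->right);
}

/*
 * Precondition the insert fix-up relies on: if the parent of a newly
 * linked node is red, a grandparent must exist. Returns false for the
 * red-root case, where reading gparent->rb_right would go through a
 * NULL gparent.
 */
static bool fixup_precondition_holds(const struct node *new_node)
{
	const struct node *parent = new_node->parent;

	if (parent == NULL || parent->color == BLACK)
		return true;
	return parent->parent != NULL;
}
```

Running checks like these over a tree right before an insert would show whether the tree was already in a bad state (for example because of memory corruption) before rb_insert_color() was entered.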