I've been chasing a bug that shows up occasionally. We've seen it as far back as 3.10 and, more recently, on 4.0, and the affected code doesn't look to have changed much since.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
IP: [<ffffffff81442bc0>] vsnprintf+0x60/0x580
PGD 42c42a067 PUD 28a441067 PMD 0
Oops: 0000 [#1] SMP
CPU: 13 PID: 2364235 Comm: ps Not tainted 4.0.9 #1
task: ffff880445f16880 ti: ffff8801ac52c000 task.ti: ffff8801ac52c000
RIP: 0010:[<ffffffff81442bc0>]  [<ffffffff81442bc0>] vsnprintf+0x60/0x580
RSP: 0018:ffff8801ac52fc08  EFLAGS: 00010286
RAX: 0000000000000019 RBX: ffff88033a46720d RCX: ffffffff81981e31
RDX: 0000000000000025 RSI: ffff8801ac52fc40 RDI: ffffffff81981e18
RBP: ffff8801ac52fc78 R08: ffffffff81981e18 R09: 00000000ffffffff
R10: 0000000000000000 R11: ffff8801ac52fa9e R12: ffff88033a468000
R13: ffff8801ac52fcb0 R14: 000000000001f199 R15: ffffffff81981e18
FS:  00007f907481a880(0000) GS:ffff88046fba0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000019 CR3: 0000000342931000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 0000000000000017 0000000000000001 0000000000001000 ffffffff81977b7f
 ffff88033a46720d 0000000000000df3 ffffffff811d7b39 0000000000000000
 0000000000000296 ffff8801cffcc780 ffff8801cffcc780 ffff88046d7c7200
Call Trace:
 [<ffffffff811d7b39>] ? seq_vprintf+0x39/0x70
 [<ffffffff811d7b35>] seq_vprintf+0x35/0x70
 [<ffffffff811d7bad>] seq_printf+0x3d/0x40
 [<ffffffff8121b675>] proc_pid_status+0x735/0x7c0
 [<ffffffff811bdc32>] ? path_cleanup+0x42/0x60
 [<ffffffff812168f4>] proc_single_show+0x54/0xa0
 [<ffffffff811d746a>] seq_read+0xea/0x370
 [<ffffffff811b4cc0>] __vfs_read+0x20/0x60
 [<ffffffff811b4d86>] vfs_read+0x86/0x140
 [<ffffffff811b4e86>] SyS_read+0x46/0xb0
 [<ffffffff8175930e>] ? int_check_syscall_exit_work+0x34/0x3d
 [<ffffffff817590f2>] system_call_fastpath+0x12/0x17
Code: 89 cd 49 01 fc 0f 82 18 03 00 00 48 89 7d b0 41 0f b6 07 0f 1f 84 00 00 00 00 00 84 c0 74 43 48 8d 75 c8 4c 89 ff e8 30 d4 ff ff <0f> b6 55 c8 48 63 c8 4d 8d 34 0f 80 fa 07 0f 87 4c 02 00 00 ff
RIP  [<ffffffff81442bc0>] vsnprintf+0x60/0x580
 RSP <ffff8801ac52fc08>

My first thought on this was "ps is reading /proc/pid/status while the process is exiting/crashing".

Looking at proc_pid_status(), I see that a number of the functions it calls take task_lock(), but there are also a number of bare task struct dereferences. Wrapping those in task_lock()/task_unlock() pairs might narrow the window somewhat, but it still seems like there would be a small opportunity for this same scenario to occur.

What puzzles me is that I can't make this bug reproduce for the life of me, even by adding sleeps in proc_pid_status(), so I'm wondering if I'm way off base. Is there some other locking at play here, protecting the task struct, that I'm overlooking?

	Dave
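
For anyone who wants to hammer this path from userspace, here is a minimal sketch. It assumes the race really is between a reader of /proc/<pid>/status and a task that is exiting. It forks short-lived children and reads each child's status file while the child goes away. The function names read_status_once() and race_status_reads() are made up for illustration, and this is not a confirmed reproducer for the oops above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Read /proc/<pid>/status once. Returns the number of bytes read, or -1
 * if the file could not be opened or read (e.g. the task is already gone).
 */
ssize_t read_status_once(pid_t pid, char *buf, size_t len)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, len);
	close(fd);
	return n;
}

/*
 * Fork `iterations` children that exit immediately, and read each child's
 * status file several times while it is exiting. Returns the number of
 * reads that returned data, or -1 if fork() fails.
 */
int race_status_reads(int iterations)
{
	char buf[4096];
	int ok = 0;

	for (int i = 0; i < iterations; i++) {
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0)
			_exit(0);	/* child: exit right away */

		/* parent: several reads to overlap with the child's exit */
		for (int j = 0; j < 8; j++)
			if (read_status_once(pid, buf, sizeof(buf)) > 0)
				ok++;

		waitpid(pid, NULL, 0);	/* reap the child */
	}
	return ok;
}
```

Calling race_status_reads() with a large iteration count from main(), ideally in several processes at once, keeps readers and exiting tasks overlapping. Whether that is enough to hit the window in proc_pid_status() is an open question.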