On Fri, Apr 15, 2016 at 9:49 AM, Dave Jones <da...@codemonkey.org.uk> wrote:
> [<ffffffff811d7b39>] ? seq_vprintf+0x39/0x70
> [<ffffffff811d7b35>] seq_vprintf+0x35/0x70
> Code: 89 cd 49 01 fc 0f 82 18 03 00 00 48 89 7d b0 41 0f b6 07 0f 1f 84 00 00
> 00 00 00 84 c0 74 43 48 8d 75 c8 4c 89 ff e8 30 d4 ff ff <0f> b6 55 c8 48 63
> c8 4d 8d 34 0f 80 fa 07 0f 87 4c 02 00 00 ff

The code disassembles to

   0:   48 89 7d b0             mov    %rdi,-0x50(%rbp)
   4:   41 0f b6 07             movzbl (%r15),%eax
   8:   0f 1f 84 00 00 00 00    nopl
  10:   84 c0                   test   %al,%al
  12:   74 43                   je     0x57
  14:   48 8d 75 c8             lea    -0x38(%rbp),%rsi
  18:   4c 89 ff                mov    %r15,%rdi
  1b:   e8 30 d4 ff ff          callq  0xffffffffffffd450
  20:*  0f b6 55 c8             movzbl -0x38(%rbp),%edx   <-- trapping instruction
  24:   48 63 c8                movslq %eax,%rcx

which is interesting. That "-0x38(%rbp)" was passed (by reference) to
some subroutine, and now that we try to read the value, we take a
fault.

And it makes even less sense because %rbp really seems to be not a
random register, but the frame pointer:

    RBP: ffff8801ac52fc78
    RSP: ffff8801ac52fc08

So why the *hell* do we get

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000019

for that? That makes no sense.

Quite frankly, I would not attribute this to /proc/pid/status with
this kind of insane oops. Maybe I misread your oops, but that just all
looks completely bogus.

Even if the stack got corrupted and/or unmapped, how did %cr2 get that
odd "0000000000000019" fault address?

None of this makes any sense at all to me.

What CPU is this on? There was the crazy AMD microcode bug. This looks
even more random, because now the registers look fine, and the oops
just looks bad.

Do you have other versions of the oops for this same problem?

              Linus
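
For anyone following along, the arithmetic behind the argument above can be checked by hand. The sketch below is only an illustration: the register values, the displacement and the call bytes are copied from the oops and the disassembly, while the helper names `effective_address` and `call_target` are made up for this example. It computes the address that `-0x38(%rbp)` refers to, checks that it lies between RSP and RBP, and also recomputes the `callq` target from its rel32 bytes `30 d4 ff ff`.

```python
# Values copied from the oops and disassembly quoted in this mail.
RBP = 0xFFFF8801AC52FC78
RSP = 0xFFFF8801AC52FC08
CR2 = 0x0000000000000019          # reported fault address

MASK64 = (1 << 64) - 1


def effective_address(base: int, disp: int) -> int:
    """Address of disp(%base), wrapped to 64 bits (hypothetical helper)."""
    return (base + disp) & MASK64


def call_target(insn_offset: int, rel32: int, insn_len: int = 5) -> int:
    """Target of a near 'call rel32' at insn_offset (hypothetical helper)."""
    if rel32 & 0x80000000:        # sign-extend the 32-bit displacement
        rel32 -= 1 << 32
    return (insn_offset + insn_len + rel32) & MASK64


# Operand of the trapping instruction: movzbl -0x38(%rbp),%edx
operand = effective_address(RBP, -0x38)

# Is the operand inside the current stack frame (RSP <= addr < RBP)?
on_stack = RSP <= operand < RBP

# callq at offset 0x1b, bytes e8 30 d4 ff ff (rel32 is little-endian)
target = call_target(0x1B, 0xFFFFD430)

print(f"operand address: {operand:016x}")
print(f"inside frame:    {on_stack}")
print(f"matches cr2:     {operand == CR2}")
print(f"call target:     {target:016x}")
```

With these inputs the operand address falls inside the frame and is nowhere near the reported fault address, which is the mismatch the mail is pointing at. The recomputed call target agrees with the one shown in the disassembly.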