On Tue, Aug 28, 2018 at 6:25 PM Borislav Petkov <b...@suse.de> wrote:
>
> On Tue, Aug 28, 2018 at 05:49:01PM +0200, Jann Horn wrote:
> > show_opcodes() is used both for dumping kernel instructions and for dumping
> > user instructions. If userspace causes #PF by jumping to a kernel address,
> > show_opcodes() can be reached with regs->ip controlled by the user,
> > pointing to kernel code.
>
> Yap, and people keep asking how to dump the running kernel, after
> patching and jump labels and stuff... Here's how!
>
> :-))))
>
> > Make sure that userspace can't trick us into
> > dumping kernel memory into dmesg.
> >
> > Cc: sta...@vger.kernel.org
> > Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function")
>
> I think this one is more likely:
>
> ba54d856a9d8 ("x86/fault: Dump user opcode bytes on fatal faults")
>
> as it added the dumping of user opcode bytes.

No, you can also get user opcode bytes printed by WARN() and friends.
When you add a WARN() in the pagefault handler, you get something like
this. The first "Code:" line is from ba54d856a9d8, but the second one
further down is from before that.

[ 125.564041] segfault[1602]: segfault at ffffffff854340c0 ip ffffffff854340c0 sp 00007ffd4cc7a568 error 15
[ 125.569923] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <63> 6f 72 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 125.576859] ------------[ cut here ]------------
[ 125.578406] TESTING WARN()
[ 125.578439] WARNING: CPU: 6 PID: 1602 at arch/x86/mm/fault.c:894 __bad_area_nosemaphore+0x147/0x270
[ 125.582172] Modules linked in: bpfilter
[ 125.583394] CPU: 6 PID: 1602 Comm: segfault Tainted: G W 4.18.0+ #108
[ 125.585811] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 125.588410] RIP: 0010:__bad_area_nosemaphore+0x147/0x270
[ 125.590078] Code: 48 89 d9 48 89 ea 44 89 e6 48 c7 83 30 0b 00 00 0e 00 00 00 bf 0b 00 00 00 e8 f5 eb ff ff 48 c7 c7 00 61 66 84 e8 79 11 05 00 <0f> 0b 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 83 c4 28 4c
[ 125.595779] RSP: 0018:ffff8801cb3b7e18 EFLAGS: 00010286
[ 125.597426] RAX: 0000000000000000 RBX: ffff8801cbb9e000 RCX: 0000000000000000
[ 125.599605] RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffff86678ea0
[ 125.601800] RBP: ffffffff854340c0 R08: ffffed003d873ed5 R09: ffffed003d873ed5
[ 125.603935] R10: 0000000000000001 R11: ffffed003d873ed4 R12: 0000000000000001
[ 125.606113] R13: 0000000000000000 R14: 0000000000000015 R15: ffff8801cb3b7f58
[ 125.608250] FS: 00007fe30d518700(0000) GS:ffff8801ec380000(0000) knlGS:0000000000000000
[ 125.610608] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.612331] CR2: ffffffff854340c0 CR3: 00000001d563e001 CR4: 00000000003606e0
[ 125.614470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 125.616607] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 125.618736] Call Trace:
[ 125.619475] __do_page_fault+0x133/0x780
[ 125.620646] ? mm_fault_error+0x1b0/0x1b0
[ 125.622236] ? async_page_fault+0x8/0x30
[ 125.623388] async_page_fault+0x1e/0x30
[ 125.624526] RIP: 0033:core_pattern+0x0/0x880
[ 125.625786] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <63> 6f 72 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 125.631208] RSP: 002b:00007ffd4cc7a568 EFLAGS: 00010202
[ 125.632737] RAX: ffffffff854340c0 RBX: 0000000000000000 RCX: 0000000000000000
[ 125.635039] RDX: 00007ffd4cc7a678 RSI: 00007ffd4cc7a668 RDI: 0000000000000001
[ 125.637088] RBP: 00007ffd4cc7a580 R08: 0000562d395106f0 R09: 00007fe30d323cb0
[ 125.639153] R10: 0000000000000000 R11: 00007fe30d0d23c0 R12: 0000562d39510530
[ 125.641183] R13: 00007ffd4cc7a660 R14: 0000000000000000 R15: 0000000000000000
[ 125.643221] ---[ end trace fb20716f9d6369bd ]---

> > Reviewed-by: Kees Cook <keesc...@chromium.org>
> > Signed-off-by: Jann Horn <ja...@google.com>
> > ---
> > v2: Andy pointed out that I probably shouldn't be doing wrapping
> > arithmetic on pointers.
> >
> >  arch/x86/include/asm/stacktrace.h |  2 +-
> >  arch/x86/kernel/dumpstack.c       | 13 ++++++++++---
> >  arch/x86/mm/fault.c               |  2 +-
> >  3 files changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
> > index b6dc698f992a..f335aad404a4 100644
> > --- a/arch/x86/include/asm/stacktrace.h
> > +++ b/arch/x86/include/asm/stacktrace.h
> > @@ -111,6 +111,6 @@ static inline unsigned long caller_frame_pointer(void)
> >  	return (unsigned long)frame;
> >  }
> >
> > -void show_opcodes(u8 *rip, const char *loglvl);
> > +void show_opcodes(struct pt_regs *regs, const char *loglvl);
> >  void show_ip(struct pt_regs *regs, const char *loglvl);
> >  #endif /* _ASM_X86_STACKTRACE_H */
> > diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> > index 9c8652974f8e..14b337582b6f 100644
> > --- a/arch/x86/kernel/dumpstack.c
> > +++ b/arch/x86/kernel/dumpstack.c
> > @@ -89,14 +89,21 @@ static void printk_stack_address(unsigned long address, int reliable,
> >   * Thus, the 2/3rds prologue and 64 byte OPCODE_BUFSIZE is just a random
> >   * guesstimate in attempt to achieve all of the above.
> >   */
> > -void show_opcodes(u8 *rip, const char *loglvl)
> > +void show_opcodes(struct pt_regs *regs, const char *loglvl)
> >  {
> >  #define PROLOGUE_SIZE 42
> >  #define EPILOGUE_SIZE 21
> >  #define OPCODE_BUFSIZE (PROLOGUE_SIZE + 1 + EPILOGUE_SIZE)
> >  	u8 opcodes[OPCODE_BUFSIZE];
> > +	u8 *prologue = (u8 *)(regs->ip - PROLOGUE_SIZE);
> > +	/*
> > +	 * Make sure userspace isn't trying to trick us into dumping kernel
> > +	 * memory by pointing the userspace instruction pointer at it.
> > +	 */
> > +	bool bad_ip = user_mode(regs) &&
> > +		__range_not_ok(prologue, OPCODE_BUFSIZE, TASK_SIZE_MAX);
> >
>
> Ok, can we pls move the sole dumping of opcodes in a helper called,
> __show_opcodes(), for example, which the checking wrapper show_opcodes()
> - without the "__" prefix - calls?
>
> So that show_signal_msg() can call the checking variant - show_opcodes()
> - as userspace might be doing monkey business there and we definitely
> wanna check first but __show_regs() can call the non-checking variant
> __show_opcodes() because there we wanna dump whatever rIP points to
> because we wanna know if the machine has gone off into the weeds etc,
> when staring at splats.
>
> Or am I missing a security aspect here?

See above. I'm checking for user_mode(regs), so as long as CS has a
kernel segment loaded, my patch shouldn't change anything, no matter
where RIP points.
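
To illustrate, here is a minimal userspace sketch of the bad_ip logic
from the patch. It is a model, not the kernel code: the model_* names
are made up for this example, user_mode() is approximated by testing the
low two bits of CS (as on x86-64), and the limit assumes 4-level paging
with 4 KiB pages. The CS and IP values are the ones from the splat
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PROLOGUE_SIZE 42
#define EPILOGUE_SIZE 21
#define OPCODE_BUFSIZE (PROLOGUE_SIZE + 1 + EPILOGUE_SIZE)

/* Assumption: x86-64, 4-level paging, 4 KiB pages. */
#define MODEL_TASK_SIZE_MAX ((UINT64_C(1) << 47) - 4096)

/* Hypothetical stand-in for the two pt_regs fields the check reads. */
struct model_regs {
	uint64_t ip;
	uint16_t cs;
};

/* Models user_mode(): a nonzero privilege level in CS means userspace. */
static bool model_user_mode(const struct model_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/* Models __range_not_ok(): true if [addr, addr + size) wraps or exceeds limit. */
static bool model_range_not_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t end = addr + size;

	if (end < addr)		/* wrapped around the address space */
		return true;
	return end > limit;
}

/* Models bad_ip: only a user-mode IP is ever subjected to the range check. */
static bool model_bad_ip(const struct model_regs *regs)
{
	uint64_t prologue = regs->ip - PROLOGUE_SIZE;

	return model_user_mode(regs) &&
	       model_range_not_ok(prologue, OPCODE_BUFSIZE, MODEL_TASK_SIZE_MAX);
}

/* Walks through the three cases discussed in this thread. */
static void model_self_check(void)
{
	/* User CS (0x33), IP pointing at a kernel address: refuse to dump. */
	struct model_regs user_at_kernel = { .ip = UINT64_C(0xffffffff854340c0), .cs = 0x33 };
	/* Kernel CS (0x10), same IP: the check does not apply. */
	struct model_regs kernel_at_kernel = { .ip = UINT64_C(0xffffffff854340c0), .cs = 0x10 };
	/* User CS, ordinary userspace IP: dumping is allowed. */
	struct model_regs user_at_user = { .ip = UINT64_C(0x0000562d39510530), .cs = 0x33 };

	assert(model_bad_ip(&user_at_kernel));
	assert(!model_bad_ip(&kernel_at_kernel));
	assert(!model_bad_ip(&user_at_user));
}
```

The point of the sketch is the short-circuit on model_user_mode(): with
a kernel CS the range check is never reached, so kernel splats still
dump whatever RIP points to.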