On 27 February 2017 at 23:47, Josh Poimboeuf <jpoim...@redhat.com> wrote: > On Mon, Feb 27, 2017 at 12:49:59PM +0800, Daniel J Blueman wrote: >> On 4.9.13 with KASAN enabled [1], we see a number of stack unwinding >> errors reported [2,3]. >> >> This seems to occur at half of boots. >> >> Let me know for further debug info/patch testing and thanks, >> Daniel >> >> [1] https://quora.org/config >> [2] https://quora.org/dmesg.txt > > Hi Daniel, > > Can you try the following patch? It's a backport of the following > upstream commit: > > 09ae68dd0a8d ("x86/unwind: Disable KASAN checks for non-current tasks") > > If it fixes it then I'll submit it for 4.9 stable. > > --- > > From: Josh Poimboeuf <jpoim...@redhat.com> > Subject: [PATCH] x86/unwind: Disable KASAN checks for non-current tasks > > There are a handful of callers to save_stack_trace_tsk() and > show_stack() which try to unwind the stack of a task other than current. > In such cases, it's remotely possible that the task is running on one > CPU while the unwinder is reading its stack from another CPU, causing > the unwinder to see stack corruption. > > These cases seem to be mostly harmless. The unwinder has checks which > prevent it from following bad pointers beyond the bounds of the stack. > So it's not really a bug as long as the caller understands that > unwinding another task will not always succeed. > > In such cases, it's possible that the unwinder may read a KASAN-poisoned > region of the stack. Account for that by using READ_ONCE_NOCHECK() when > reading the stack of another task. > > Use READ_ONCE() when reading the stack of the current task, since KASAN > warnings can still be useful for finding bugs in that case. > > Reported-by: Dmitry Vyukov <dvyu...@google.com> > Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com> > Cc: Andy Lutomirski <l...@amacapital.net> > Cc: Andy Lutomirski <l...@kernel.org> > Cc: Borislav Petkov <b...@alien8.de> > Cc: Brian Gerst <brge...@gmail.com> > Cc: Dave Jones <da...@codemonkey.org.uk> > Cc: Denys Vlasenko <dvlas...@redhat.com> > Cc: H. Peter Anvin <h...@zytor.com> > Cc: Linus Torvalds <torva...@linux-foundation.org> > Cc: Miroslav Benes <mbe...@suse.cz> > Cc: Peter Zijlstra <pet...@infradead.org> > Cc: Thomas Gleixner <t...@linutronix.de> > Link: > http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoim...@redhat.com > Signed-off-by: Ingo Molnar <mi...@kernel.org> > --- > arch/x86/include/asm/stacktrace.h | 5 ++++- > arch/x86/kernel/unwind_frame.c | 20 ++++++++++++++++++-- > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/include/asm/stacktrace.h > b/arch/x86/include/asm/stacktrace.h > index 37f2e0b..4141ead 100644 > --- a/arch/x86/include/asm/stacktrace.h > +++ b/arch/x86/include/asm/stacktrace.h > @@ -55,13 +55,16 @@ extern int kstack_depth_to_print; > static inline unsigned long * > get_frame_pointer(struct task_struct *task, struct pt_regs *regs) > { > + struct inactive_task_frame *frame; > + > if (regs) > return (unsigned long *)regs->bp; > > if (task == current) > return __builtin_frame_address(0); > > - return (unsigned long *)((struct inactive_task_frame > *)task->thread.sp)->bp; > + frame = (struct inactive_task_frame *)task->thread.sp; > + return (unsigned long *)READ_ONCE_NOCHECK(frame->bp); > } > #else > static inline unsigned long * > diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c > index a2456d4..caff129 100644 > --- a/arch/x86/kernel/unwind_frame.c > +++ b/arch/x86/kernel/unwind_frame.c > @@ -6,6 +6,21 @@ > > #define FRAME_HEADER_SIZE (sizeof(long) * 2) > > +/* > + * This disables KASAN checking when reading a value from another task's > stack, > + * since the other task could be running on another CPU and could have > poisoned > + * the stack in the meantime. > + */ > +#define READ_ONCE_TASK_STACK(task, x) \ > +({ \ > + unsigned long val; \ > + if (task == current) \ > + val = READ_ONCE(x); \ > + else \ > + val = READ_ONCE_NOCHECK(x); \ > + val; \ > +}) > + > unsigned long unwind_get_return_address(struct unwind_state *state) > { > unsigned long addr; > @@ -14,7 +29,8 @@ unsigned long unwind_get_return_address(struct unwind_state > *state) > if (unwind_done(state)) > return 0; > > - addr = ftrace_graph_ret_addr(state->task, &state->graph_idx, *addr_p, > + addr = READ_ONCE_TASK_STACK(state->task, *addr_p); > + addr = ftrace_graph_ret_addr(state->task, &state->graph_idx, addr, > addr_p); > > return __kernel_text_address(addr) ? addr : 0; > @@ -48,7 +64,7 @@ bool unwind_next_frame(struct unwind_state *state) > if (unwind_done(state)) > return false; > > - next_bp = (unsigned long *)*state->bp; > + next_bp = (unsigned long *)READ_ONCE_TASK_STACK(state->task, > *state->bp); > > /* make sure the next frame's data is accessible */ > if (!update_stack_state(state, next_bp, FRAME_HEADER_SIZE))
Thanks Josh! With this patch, the KASAN warning still occurs, but at unwind_get_return_address+0x1d3/0x130 instead; the rest of the trace is identical. (gdb) list *(unwind_get_return_address+0x1d3) 0xffffffff8112bca3 is in unwind_get_return_address (./include/linux/compiler.h:243). 238 }) 239 240 static __always_inline 241 void __read_once_size(const volatile void *p, void *res, int size) 242 { 243 __READ_ONCE_SIZE; Thanks, Daniel -- Daniel J Blueman