On Thu, Oct 19, 2017 at 03:35:22PM -0700, Andrei Vagin wrote:
> On Thu, Oct 19, 2017 at 01:16:55PM -0500, Josh Poimboeuf wrote:
> > On Thu, Oct 19, 2017 at 09:51:04AM -0700, Andrei Vagin wrote:
> > > Hi,
> > >
> > > We run CRIU tests for tip/auto-latest regularly, and a few days ago our
> > > test job started to detect this warning in a kernel log:
> > >
> > > [   44.235786] WARNING: can't dereference iret registers at
> > > ffff8801c5f17fe0 for ip ffffffff95f0d94b
> > >
> > > What does it mean? How critical is it?
> > >
> > > Our test job fails if it detects any warning in a kernel log. Maybe we
> > > need to investigate reasons of this warning and try to eliminate it?
> > >
> > > Here are logs:
> > > https://travis-ci.org/avagin/linux/jobs/289676634
> >
> > I think it means the unwinder found some bad ORC unwinder metadata.  Any
> > chance you have access to the kernel binary?  I need to know what code
> > corresponds to that ffffffff95f0d94b address.
> >
> > Or if you can reproduce with the following patch, that should help:
> >
> >
> > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > index 570b70d3f604..95b633f0ce51 100644
> > --- a/arch/x86/kernel/unwind_orc.c
> > +++ b/arch/x86/kernel/unwind_orc.c
> > @@ -448,7 +448,7 @@ bool unwind_next_frame(struct unwind_state *state)
> >
> >  	case ORC_TYPE_REGS_IRET:
> >  		if (!deref_stack_regs(state, sp, &state->ip, &state->sp,
> > false)) {
> > -			orc_warn("can't dereference iret registers at %p for ip
> > %p\n",
> > +			orc_warn("can't dereference iret registers at %p for ip
> > %pB\n",
> >  				 (void *)sp, (void *)orig_ip);
> >  			goto done;
> >  		}
>
> I applied your patch and rerun tests.
>
> [   44.947699] WARNING: can't dereference iret registers at ffff880178f5ffe0
> for ip int3+0x5b/0x60

Thanks, that was enough for me to figure it out.  Can you test the below
fix?

> and now here is a warning from kasan:
>
> [  477.775676]
> ==================================================================
> [  477.775845] BUG: KASAN: stack-out-of-bounds in deref_stack_reg+0x11d/0x150

The KASAN warning is a known issue for which the fix is a little more
complicated.  v1 of the patch was here:

  https://lkml.kernel.org/r/cover.1507128293.git.jpoim...@redhat.com


diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 49167258d587..f6cdb7a1455e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -808,7 +808,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
-	UNWIND_HINT_IRET_REGS offset=8
+	UNWIND_HINT_IRET_REGS offset=\has_error_code*8

 	/* Sanity check */
 	.if \shift_ist != -1 && \paranoid == 0
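
For anyone who wants to see why the hint offset has to depend on
has_error_code, here is a rough userspace sketch.  It is only a model, not
kernel code: struct model_stack, push_exception() and read_saved_ip() are
made-up names, the register values are placeholders, and the 16-word stack
is arbitrary.  The one thing it takes from the patch is the offset rule
(has_error_code * 8).  On 64-bit x86 the CPU pushes ss, sp, flags, cs and
ip on exception entry, and for some vectors an error code after that, so
the IRET frame starts either at the entry stack pointer or 8 bytes above
it.  int3 is a vector without an error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16	/* tiny model stack; real stack sizes differ */
#define IRET_WORDS 5	/* ip, cs, flags, sp, ss */

/* Hypothetical model: index STACK_WORDS is the end (top) of the stack. */
struct model_stack {
	uint64_t word[STACK_WORDS];
	size_t sp;	/* index of the lowest pushed word */
};

/* Mirrors the fixed hint: offset=\has_error_code*8 (in bytes). */
static size_t iret_regs_offset(int has_error_code)
{
	return (size_t)has_error_code * 8;
}

/* Push what the CPU pushes on exception entry, starting at the stack end. */
static void push_exception(struct model_stack *s, uint64_t ip,
			   int has_error_code, uint64_t error_code)
{
	s->sp = STACK_WORDS;
	s->word[--s->sp] = 0x2b;	/* ss (placeholder) */
	s->word[--s->sp] = 0x7ffc0000;	/* sp (placeholder) */
	s->word[--s->sp] = 0x202;	/* flags (placeholder) */
	s->word[--s->sp] = 0x33;	/* cs (placeholder) */
	s->word[--s->sp] = ip;
	if (has_error_code)
		s->word[--s->sp] = error_code;
}

/*
 * Locate the IRET frame at the entry stack pointer plus a byte offset.
 * Returns 1 and stores the saved ip if the whole frame lies inside the
 * stack, 0 if the frame would run past the end of the stack.
 */
static int read_saved_ip(const struct model_stack *s, size_t offset,
			 uint64_t *ip)
{
	size_t first;

	assert(offset % 8 == 0);
	first = s->sp + offset / 8;
	if (first + IRET_WORDS > STACK_WORDS)
		return 0;
	*ip = s->word[first];
	return 1;
}
```

In this model, a frame pushed without an error code and read with the old
hard-coded offset of 8 runs one word past the end of the stack, so
read_saved_ip() fails.  With iret_regs_offset(0) it succeeds, and the
error-code case still works with iret_regs_offset(1).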