On 01/25/2018 02:38 PM, H.J. Lu wrote:
On Thu, Jan 25, 2018 at 12:32 AM, Florian Weimer <fwei...@redhat.com> wrote:
On 01/22/2018 01:21 PM, Florian Weimer wrote:

There is a different issue with the thunk itself.

__x86_indirect_thunk_rax:
.LFB2:
          .cfi_startproc
          call    .LIND5
.LIND4:
          pause
          lfence
          jmp     .LIND4
.LIND5:
          mov     %rax, (%rsp)
          ret
          .cfi_endproc

If a signal is delivered after the mov has executed, the unwinder will
eventually unwind through the signal frame and hit __x86_indirect_thunk_rax.
It does not treat it as a signal frame, so the return address on the stack
is decremented by one, in an attempt to obtain a program counter value which
is within the call instruction. However, in this scenario, the return
address is actually the start of the function, and subtracting one moves the
program counter out of the unwind region for that function.


I think it is possible to fix the second case by hiding the return
address at the top of the stack, like this:

__x86_indirect_thunk_rax:
.LFB2:
         .cfi_startproc
         call    .LIND5
.LIND4:
         pause
         lfence
         jmp     .LIND4
.LIND5:
         .cfi_def_cfa_offset 16
         mov     %rax, (%rsp)
         ret
         .cfi_endproc

The unwinder should then use the other return address, from the caller of
the thunk routine.

Can you open a GCC bug to track it?

Sure, I filed: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84039

As mentioned on the bug, we now have a report of a potential kernel issue related to retpolines and unwinding, but it's not clear that the thunk routine is at fault (which would be supplied by the kernel anyway).

Thanks,
Florian
