On Mon, Mar 26, 2018 at 11:31 PM, Florian Weimer <fwei...@redhat.com> wrote:
> On 03/27/2018 12:43 AM, H.J. Lu wrote:
>>
>> On Linux, when alternate signal stack is used with thread cancellation,
>> _Unwind_Resume fails when it tries to unwind shadow stack from signal
>> handler on alternate signal stack. The issue is that signal handler on
>> alternate signal stack uses a separate shadow stack and we must switch
>> to the original shadow stack to unwind it. But frame count will be wrong
>> in this case. For thread cancellation, there is no need to unwind shadow
>> stack since it will long jump back and exit.
>>
>> One possibility is
>>
>> 1. Add _URC_NO_REASON_CANCEL.
>> 2. unwind_stop in libpthread returns _URC_NO_REASON_CANCEL.
>> 3. _Unwind_ForcedUnwind_Phase2 sets frames to 1 for
>> _URC_NO_REASON_CANCEL
>
>
> I assume the sequence of events goes like this:
>
> 1. The program receives a signal with a SA_ONSTACK handler.
> 2. The program switches to the alternate signal stack (including an
> alternate shadow stack) and runs the handler.
> 3. The handler reaches a cancellation point.
> 4. Cancellation is acted upon.
>
> During unwinding, INCSSP is executed as needed. The switch from the
> alternate signal stack is implicit in the SP register restore. But there is
> no corresponding stack switch back to the original shadow stack. This means
> that INCSSP faults once the alternate stack is empty.
>
> Is this description accurate?
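
For reference, steps 1 and 2 of that sequence can be reproduced with a small sketch like the one below. It only shows that an SA_ONSTACK handler really runs on the alternate signal stack; it does not touch the shadow stack or cancellation. The names run_on_alt_stack, ALT_STACK_SIZE and the choice of SIGUSR1 are made up for illustration and are not taken from the patches.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size for the illustration; anything >= MINSIGSTKSZ works. */
#define ALT_STACK_SIZE (64 * 1024)

static char *alt_stack_base;
static volatile sig_atomic_t handler_on_alt_stack;

/* Records whether the handler's own frame lies inside the alternate stack. */
static void handler(int sig)
{
    char local;
    uintptr_t p = (uintptr_t)&local;
    uintptr_t lo = (uintptr_t)alt_stack_base;

    (void)sig;
    handler_on_alt_stack = (p >= lo && p < lo + ALT_STACK_SIZE);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it ran on the
   normal stack, and -1 if the setup failed. */
int run_on_alt_stack(void)
{
    stack_t ss;
    struct sigaction sa;

    assert(ALT_STACK_SIZE >= (size_t)MINSIGSTKSZ);

    alt_stack_base = malloc(ALT_STACK_SIZE);
    if (alt_stack_base == NULL)
        return -1;

    /* Register the alternate signal stack for this thread. */
    ss.ss_sp = alt_stack_base;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    /* SA_ONSTACK asks the kernel to deliver the signal on that stack. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* raise() returns only after the handler has run. */
    raise(SIGUSR1);
    return handler_on_alt_stack;
}
```

With shadow stack enabled, the kernel would also switch to a separate shadow stack at this point, which is where the unwinding problem described above starts.
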
That is correct.

> I think this has to be fixed entirely within the libgcc unwinder. Otherwise,
> any application which throws from a (synchronous) signal handler will have
> the same issue, and I think this is something we need to support.

There are 2 ways to unwind the shadow stack:

1. setjmp saves the shadow stack register and longjmp pops the shadow
stack until the shadow stack register matches the saved value. To support
longjmp from a signal handler, we make a syscall to restore the original
shadow stack.
2. Since the shadow stack is never saved and restored by the compiler, the
unwinder in libgcc counts how many stack frames it has to unwind and uses
INCSSP to pop the shadow stack. This can't unwind the original shadow stack
when the alternate shadow stack is used.

_URC_NO_REASON_CANCEL works only if longjmp will be used to finish
stack unwinding, which is the case for thread cancellation in glibc.

Here are patches for GCC:

https://github.com/hjl-tools/gcc/commit/e9ff815941406e38fa629947af4d809b9129e860

and glibc:

https://github.com/hjl-tools/glibc/commit/1aec81528ab26aa8a8a7965317b6e1a8ba4526aa

They fixed the issue.

> It may be possible to implement this without kernel changes: Patch the
> interrupted context to continue unwinding, and then call sigreturn to switch
> both stacks at the same time.
>

We passed almost all of the 5000+ tests in glibc with shadow stack and
indirect branch tracking enabled. The only remaining failures are thread
cancellation with an alternate signal stack and -fasynchronous-unwind-tables.
I couldn't find a way to unwind the shadow stack by counting stack frames
when an exception happens on the alternate signal stack.

--
H.J.
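
P.S. To make the difference between the two approaches concrete, here is a toy model in plain C. It is only a sketch under simplifying assumptions: each shadow stack is reduced to a depth counter, and the names shadow_state, shadow_mark, unwind_by_count and unwind_to_mark are hypothetical. It uses no real CET instructions or syscalls and is not the code in the patches.

```c
#include <assert.h>
#include <stddef.h>

enum { ORIGINAL = 0, ALTERNATE = 1 };

/* Toy model: each shadow stack is just a count of return addresses. */
struct shadow_state {
    int active;        /* which shadow stack the SSP points into */
    size_t depth[2];   /* entries currently on each shadow stack */
};

/* What a setjmp-like save would record in this model. */
struct shadow_mark {
    int stack;
    size_t depth;
};

/* Approach 2 (frame counting): pop `frames` entries from the active shadow
   stack only, the way repeated INCSSP would. Returns -1 for the modeled
   fault when the count runs past the bottom of the active stack. */
int unwind_by_count(struct shadow_state *s, size_t frames)
{
    assert(s->active == ORIGINAL || s->active == ALTERNATE);
    if (frames > s->depth[s->active])
        return -1;
    s->depth[s->active] -= frames;
    return 0;
}

/* Approach 1 (longjmp style): go back to a saved position. If the mark is
   on a different shadow stack, switch first; this stands in for the syscall
   that restores the original shadow stack. */
int unwind_to_mark(struct shadow_state *s, struct shadow_mark m)
{
    assert(s->active == ORIGINAL || s->active == ALTERNATE);
    if (s->active != m.stack) {
        s->depth[s->active] = 0;   /* the handler's shadow stack is abandoned */
        s->active = m.stack;
    }
    if (m.depth > s->depth[s->active])
        return -1;                 /* the mark is no longer on the stack */
    s->depth[s->active] = m.depth;
    return 0;
}
```

With 5 entries on the original shadow stack and 2 on the alternate one, unwinding 4 frames by count fails in this model, because only 2 entries exist on the active stack and nothing switches back. Unwinding to a mark saved on the original stack succeeds because the switch is explicit.
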