On 29.10.17 20:13, Konstantin Belousov wrote:
On Sun, Oct 29, 2017 at 06:23:51PM +0100, Tijl Coosemans wrote:
On Sat, 26 Aug 2017 21:40:34 +0300 Konstantin Belousov <kostik...@gmail.com> 
wrote:
On Sat, Aug 26, 2017 at 08:28:13PM +0200, Tijl Coosemans wrote:
I did consider using
a CFI directive (see patch below) and it works, but it's architecture
specific and it's inserted after the function prologue so there's still
a window of a few instructions where a stack unwinder will try to use
the return address.

Index: lib/libthr/thread/thr_create.c
===================================================================
--- lib/libthr/thread/thr_create.c      (revision 322802)
+++ lib/libthr/thread/thr_create.c      (working copy)
@@ -251,6 +251,7 @@ create_stack(struct pthread_attr *pattr)
  static void
  thread_start(struct pthread *curthread)
  {
+       __asm(".cfi_undefined %rip");
         sigset_t set;
if (curthread->attr.suspend == THR_CREATE_SUSPENDED)

I like this approach much more than the previous patch.  What can be
done is to provide asm trampoline which calls thread_start().  There you
can add the .cfi_undefined right at the entry.

It is somewhat more work than just setting the return address on the
kernel-constructed pseudo stack frame, but I believe this is ultimately
correct way.  You still can do it only on some arches, if you do not
have incentive to code asm for all of them.

Ok, but then there are two ways to implement the trampoline:

1)
        movq $0,(%rsp)
        jmp thread_start
2)
        subq $8,%rsp
        call thread_start
        /* NOTREACHED */

With 1) you're setting the return address to zero anyway, so you might
as well do that in the kernel like my first patch.  With 2) you're
setting up a new call frame, basically redoing what the kernel already
did and on i386 this also means copying the function argument.
I do not quite understand the second variant, because the stack is not
guaranteed to be zeroed, and it is often not if reused after the previously
exited thread.

The first variant is what I like, but perhaps we need to emulate the
frame as well, i.e. push two zero longs.

Currently kernel does not access the usermode stack for the new thread
unless dictated by ABI (i.e. it does not touch it for 64bit process
on amd64, but have to for 32bit).  I like this property.  Also, the
previous paragraph is indicative: we do not really know in kernel
what ABI the userspace follows.  It might want frame, may be it does
not need it.  It could use other register than %rbp as the frame base,
etc.


Do you have any preference (or better alternatives), because I think I
still prefer my first patch.  It's the caller's job to set up the call
frame, in this case the kernel.  And if the kernel handles it then it
also works with (hypothetical) implementations other than libthr.

Also crt1 probably should get the same treatment, despite we already set
%rbp to zero AFAIR.

I haven't checked but I imagine the return address of the process entry
point is always zero because the stack is all zeros.
Stack is not zero. The environment and argument strings and auxv are copied
at top, and at the bottom the ps_strings structure is located, so it
is not.

If you commit your existing patch as is, I will not resent.  But I do think
that stuff that can be done in usermode, should be done in usermode, esp.
when the amount of efforts is same.

Attached what I have for libgcc. It can be applied to gcc5-8, should give no issues. The mentioned tc from this thread and mine, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82635 do pass.

What do you think?

Andreas

Index: libgcc/config/i386/freebsd-unwind.h
===================================================================
--- libgcc/config/i386/freebsd-unwind.h (revision 254205)
+++ libgcc/config/i386/freebsd-unwind.h (working copy)
@@ -28,7 +28,10 @@
 
 #include <sys/types.h>
 #include <signal.h>
+#include <unistd.h>
+#include <sys/sysctl.h>
 #include <sys/ucontext.h>
+#include <sys/user.h>
 #include <machine/sigframe.h>
 
 #define REG_NAME(reg)  sf_uc.uc_mcontext.mc_## reg
@@ -36,38 +39,46 @@
 #ifdef __x86_64__
 #define MD_FALLBACK_FRAME_STATE_FOR x86_64_freebsd_fallback_frame_state
 
+static int
+x86_64_outside_sigtramp_range (unsigned char *pc)
+{
+  static int sigtramp_range_determined = 0;
+  static unsigned char *sigtramp_start, *sigtramp_end;
+
+  if (sigtramp_range_determined == 0)
+    {
+      struct kinfo_sigtramp kst = {0};
+      size_t len = sizeof (kst);
+      int mib[4] = { CTL_KERN, KERN_PROC, KERN_PROC_SIGTRAMP, getpid() };
+
+      sigtramp_range_determined = 1;
+      if (sysctl (mib, 4, &kst, &len, NULL, 0) == 0)
+      {
+        sigtramp_range_determined = 2;
+        sigtramp_start = kst.ksigtramp_start;
+        sigtramp_end   = kst.ksigtramp_end;
+      }
+    }
+  if (sigtramp_range_determined < 2)  /* sysctl failed if < 2 */
+    return 1;
+
+  return (pc < sigtramp_start || pc >= sigtramp_end);
+}
+
 static _Unwind_Reason_Code
 x86_64_freebsd_fallback_frame_state
 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
 {
   struct sigframe *sf;
-  long new_cfa;
+  _Unwind_Ptr new_cfa;
 
-  /* Prior to FreeBSD 9, the signal trampoline was located immediately
-     before the ps_strings.  To support non-executable stacks on AMD64,
-     the sigtramp was moved to a shared page for FreeBSD 9.  Unfortunately
-     this means looking frame patterns again (sys/amd64/amd64/sigtramp.S)
-     rather than using the robust and convenient KERN_PS_STRINGS trick.
-
-     <pc + 00>:  lea     0x10(%rsp),%rdi
-     <pc + 05>:  pushq   $0x0
-     <pc + 17>:  mov     $0x1a1,%rax
-     <pc + 14>:  syscall
-
-     If we can't find this pattern, we're at the end of the stack.
-  */
-
-  if (!(   *(unsigned int *)(context->ra)      == 0x247c8d48
-        && *(unsigned int *)(context->ra +  4) == 0x48006a10
-        && *(unsigned int *)(context->ra +  8) == 0x01a1c0c7
-        && *(unsigned int *)(context->ra + 12) == 0x050f0000 ))
+  if (x86_64_outside_sigtramp_range(context->ra))
     return _URC_END_OF_STACK;
 
   sf = (struct sigframe *) context->cfa;
   new_cfa = sf->REG_NAME(rsp);
   fs->regs.cfa_how = CFA_REG_OFFSET;
-  /* Register 7 is rsp  */
-  fs->regs.cfa_reg = 7;
+  fs->regs.cfa_reg =  __LIBGCC_STACK_POINTER_REGNUM__;
   fs->regs.cfa_offset = new_cfa - (long) context->cfa;
 
   /* The SVR4 register numbering macros aren't usable in libgcc.  */
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to