On 1 March 2016 at 09:38, Hunter Laux <hunterl...@gmail.com> wrote:
> I was having trouble running golang on linux-user with an aarch64 target.
>
> It turns out that snappy is written in Go. When I tried the xenial aarch64
> preinstall image in qemu, Snappy was broken.
>
> For some reason, it calls sigaction on all the signals.
>
> I noticed do_sigaction in linux-user/signal.c calls the host sigaction.
> Unfortunately, glibc blocks signal 33 and for "SIGSETXID", which I guess is
> just a user signal that pthread names, but thats as far as I got into it.
>
> See here:
> https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/sigaction.c
>
> Maybe there's some simple cleaner solutions, but here's what I did.
>
> I made a quick fix by calling the __libc_sigaction instead of the
> __sigaction to bypass the check for SIGSETXID. It seems to work, but I'm not
> sure if that's safe. My "one liner" is definitely a hack.

This has been a bug for ages and we probably should try to fix it.

The problem is that the Go runtime expects to be able to register a
handler for every signal, but since QEMU is written in C and uses libc
we can't register a handler for the signals that libc needs to use
itself. SIGSETXID is signal 33, which glibc uses as part of its
setuid/setgid handling. We can't call __libc_sigaction because then
the libc threading code would no longer be able to correctly handle
setuid/setgid -- I suspect that if your guest Go program makes a
setuid syscall with your patch applied and multiple threads present,
it will hang.

This was traditionally a problem only for guests which try to register
SIGRTMAX (64), because QEMU has a hack where it swaps the guest's
SIGRTMIN and SIGRTMAX. This allows guests using libc to think they
have a working SIGRTMIN -- SIGRTMIN is the other libc-internal signal
(used for thread cancellation) -- without interfering with the host
libc's use of it. This did cause Go to complain, but the Go runtime now
has a workaround whereby it ignores signal 64 failures:
https://go-review.googlesource.com/#/c/16853/3/src/runtime/os1_linux.go

SIGSETXID is, I think, newer, but similar issues apply. In fact the only
reason we haven't already noticed problems with the glibc runtime is
that it ignores the failure return when setting up the signal
handler(!):
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/nptl-init.c;h=bdbdfedcef956bca51b9473674381d36eac2c751;hb=HEAD#l411
[I think, but have not tested, that this will mean that attempts to
use setuid etc in a program with multiple threads will hang under
QEMU.]

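To make the current situation concrete, here is a minimal sketch in C of the swap described above. It is a toy model, not QEMU's actual code: the table and function names are invented for illustration, and the set of host-reserved signals (32 and 33) is hard-coded from the description in this mail rather than queried from libc.

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-level signal numbers on Linux, as used in this thread. */
#define KERNEL_SIGRTMIN 32   /* used by host glibc for thread cancellation */
#define KERNEL_SIGRTMAX 64
#define HOST_SIGSETXID  33   /* used by host glibc for setuid/setgid */
#define TABLE_SIZE      (KERNEL_SIGRTMAX + 1)

/* Hypothetical translation tables; index 0 is unused. */
static int host_to_guest[TABLE_SIZE];
static int guest_to_host[TABLE_SIZE];

/* Identity mapping, except that signals 32 and 64 are swapped. */
static void init_signal_tables(void)
{
    for (int sig = 0; sig < TABLE_SIZE; sig++) {
        host_to_guest[sig] = sig;
    }
    host_to_guest[KERNEL_SIGRTMIN] = KERNEL_SIGRTMAX;
    host_to_guest[KERNEL_SIGRTMAX] = KERNEL_SIGRTMIN;

    /* Build the inverse table from the forward one. */
    for (int sig = 0; sig < TABLE_SIZE; sig++) {
        guest_to_host[host_to_guest[sig]] = sig;
    }
}

static int guest_to_host_signal(int guest_sig)
{
    assert(guest_sig >= 1 && guest_sig < TABLE_SIZE);
    return guest_to_host[guest_sig];
}

/* Signals the host libc keeps for itself and refuses in sigaction(). */
static bool host_libc_reserves(int host_sig)
{
    return host_sig == KERNEL_SIGRTMIN || host_sig == HOST_SIGSETXID;
}

/* Would a guest sigaction() on this signal succeed under this model? */
static bool guest_can_register(int guest_sig)
{
    return !host_libc_reserves(guest_to_host_signal(guest_sig));
}
```

After calling init_signal_tables(), this model lets the guest register signal 32 (it lands on host 64), but refuses guest 64 (it lands on host 32) and guest 33 (it is not remapped at all). The refusal of 33 is the case the Go runtime trips over.
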
So we probably ought to:
 (1) redirect SIGSETXID from 33 up to SIGRTMAX-1, for the same reasons
     we redirect signal 32 up to SIGRTMAX
 (2a) consider returning success for attempts to register SIGRTMAX and
      SIGRTMAX-1 handlers in the guest, rather than failure
 (2b) alternatively, ask the Go runtime maintainers to extend their
      'ignore signal 64 registration failure' hack to cover 63

For 2a vs 2b, I checked what Valgrind does -- it also returns failure
(EINVAL) for its internal-use signal. (Valgrind only needs 64 for
internal use, because it doesn't use glibc.) So I think we should go
with (2b).

> After I did that, I kept getting an EXCP_YIELD. I'm not sure how to handle
> this, but ignoring it seems to work. Again, I'm not sure that's safe.

This is pretty much the right thing. EXCP_YIELD is raised by the
'yield' instruction in order to cause execution of guest code to
return to QEMU's main loop for system emulation so we can schedule a
different guest CPU in an SMP config. In userspace emulation it's not
so useful, but since the yield insn is runnable in usermode it can
happen, and just ignoring it and continuing to run code is the right
thing to do.

I'll write up a proper patch that includes a comment about what's
going on.

thanks
-- PMM
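
For illustration only, here is a rough sketch in C of what (1) combined with (2b) would look like from the guest's side. The function names are hypothetical and this is not the eventual patch; it simply hard-codes the two redirections and returns -EINVAL for any guest signal that would land on a host signal glibc reserves.

```c
#include <assert.h>
#include <errno.h>

/* Host signals that glibc keeps for internal use (kernel numbering). */
#define HOST_SIGCANCEL 32
#define HOST_SIGSETXID 33

/* Proposed mapping: 32 <-> 64 (existing swap) and 33 <-> 63 (new). */
static int proposed_guest_to_host(int guest_sig)
{
    assert(guest_sig >= 1 && guest_sig <= 64);
    switch (guest_sig) {
    case 32: return 64;   /* guest SIGRTMIN gets a usable host signal */
    case 64: return 32;   /* guest SIGRTMAX lands on a reserved signal */
    case 33: return 63;   /* guest 33 gets a usable host signal */
    case 63: return 33;   /* guest SIGRTMAX-1 lands on a reserved signal */
    default: return guest_sig;
    }
}

/* Option (2b): keep reporting failure for the displaced guest signals. */
static int proposed_guest_sigaction_result(int guest_sig)
{
    int host_sig = proposed_guest_to_host(guest_sig);

    if (host_sig == HOST_SIGCANCEL || host_sig == HOST_SIGSETXID) {
        return -EINVAL;
    }
    return 0;
}
```

Under this sketch a guest can register handlers for 32 and 33, while 63 and 64 still fail with EINVAL, which is why the Go runtime's existing tolerance for signal 64 would need to cover 63 as well.
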