Hi,

I have a Go executable that uses a shared C library which spawns its own threads. In case of a crash (in Go or C code), I want to dump the stack traces of all C threads and goroutines into a crash dump file. Go does not handle signals in non-Go threads executing non-Go code, so I have to install a custom C signal handler to handle those cases. And since Go does not invoke a previously installed C handler for crashes in Go code, the C handler has to be registered after the Go handler.
After some experiments, restricted to Linux amd64, I got it working somehow (https://gist.github.com/trxa/302c5dbe9055ef287da9139e68d0a93e). But it feels a bit hacky and has some drawbacks, and I wonder if somebody can propose a better solution or improvements.

How it basically works: The Go handlers are saved when the C handler gets installed. When invoked, for example by a SIGSEGV, the C handler opens a file and writes the stack trace of the current thread into it. Then it signals all other threads to dump their stacks into the file too. After all threads are dumped, the IP of the failing instruction is saved and the Go handler is invoked by calling it directly, so that it sees the ucontext of the crash. After the Go handler has returned, the C handler checks whether Go has changed the IP in uc_mcontext. If it has, the IP points to runtime.sigpanic, which triggers a panic and dumps the goroutine stacks to STDERR. If it has not, the crash was in non-Go code on a non-Go thread and Go does not handle it. In that case, the IP register in uc_mcontext is set to the address of an exported cgo function which calls panic() to dump the stacks to STDERR. Before returning from the C handler, the STDERR file descriptor is replaced by the crash dump file descriptor, so that Go panics into the file. (The Go handlers should probably be restored before returning, if Go still wants to backtrace the threads via SIGQUIT itself.) After the C handler has returned, runtime.sigpanic or the cgo function is executed and does not return.

Here are the disadvantages and things to watch out for, which make the solution a bit creepy:

1. signal.Notify has to be called for all signals you want to handle for C crashes, although they are not handled in Go. Otherwise the Go handler does not return in the "non-Go-code/thread" case, but creates a core dump.

2. Setting the IP to a cgo function to be executed when the handler returns makes the program panic synchronously, as with runtime.sigpanic, but is probably not async-signal-safe, for example if it has to request more stack. A workaround would be to panic in Go when the signal is read from the notify channel. In that case the C handler must not return, to avoid re-executing the faulting instruction. This can be done by putting the thread to sleep. Doing this is probably even more platform independent, but that way a synchronous signal from C is handled as an asynchronous one, and the information you get in Go gives you no chance to distinguish the two (in case you only want to dump and continue for asynchronous signals).

3. Cloning the STDERR file descriptor to point to a file also feels a bit fragile compared to writing to the file directly. Another thread might write to it. The fd cannot be closed (except maybe in a global destructor), and the OS would have to flush the buffers correctly (or I have to use synchronous write mode, which slows down writing the dump tremendously).

4. There are duplicate stack traces, and it is not always obvious how to match a thread stack trace to the running goroutine.

5. It would be desirable to have the stack trace of the failing instruction redundantly in the crash file and in the log file, but with this solution that is only possible for C frames and the first Go frame on top of the thread, at least if you use a common unwinder library.

There might be more.

From my point of view, a better solution would be for Go to have an option (maybe via the GOTRACEBACK env var) to trace C threads as well, for example by using the cgo traceback functions introduced in Go 1.7. Setting a file descriptor/handle as the target for a dump should also be allowed (maybe in addition to the dump on STDERR). In addition to the cgo traceback functions, there could be one or more functions for gathering additional information to be printed in the crash dump.
A use case for that would be a list of loaded modules/libraries or environment variables. I can imagine that it's easier said than done, but that's what I would prefer.

Thanks for your opinions!

Martin
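For readers who want to see the "did Go take over?" check from the description above in code, here is a minimal sketch under stated assumptions. It is not taken from the gist. The helpers get_ip, set_ip, redirect_if_unhandled and crash_fallback are hypothetical names. crash_fallback stands in for the exported cgo function that calls panic(). The aarch64 branch is only there so the sketch compiles on that platform too; the approach in this post was tried on Linux amd64 only.

```c
#define _GNU_SOURCE /* exposes REG_RIP in glibc's <ucontext.h> */
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#if defined(__x86_64__)
#define UC_IP(uc) ((uc)->uc_mcontext.gregs[REG_RIP])
#elif defined(__aarch64__)
#define UC_IP(uc) ((uc)->uc_mcontext.pc)
#else
#error "this sketch only knows the Linux x86-64 and aarch64 ucontext layouts"
#endif

/* Read the instruction pointer stored in the signal context. */
static uintptr_t get_ip(const ucontext_t *uc)
{
    return (uintptr_t)UC_IP(uc);
}

/* Overwrite the instruction pointer; execution resumes there after the
   signal handler returns. */
static void set_ip(ucontext_t *uc, uintptr_t ip)
{
    UC_IP(uc) = ip;
}

/* Hypothetical stand-in for the exported cgo function that calls panic().
   It must never return, because there is no valid return address. */
void crash_fallback(void)
{
    abort();
}

/* To be called after the chained (Go) handler has returned.
   ip_before is the IP saved before chaining.
   Returns 0 if the chained handler changed the IP (it took over the crash),
   or 1 if the IP was untouched and has now been pointed at fallback. */
int redirect_if_unhandled(ucontext_t *uc, uintptr_t ip_before,
                          void (*fallback)(void))
{
    if (get_ip(uc) != ip_before)
        return 0;
    set_ip(uc, (uintptr_t)fallback);
    return 1;
}
```

Inside the C handler the sequence would be: save get_ip(uc), call the saved Go handler with the original siginfo and ucontext, then call redirect_if_unhandled. As noted in point 2, resuming in a function this way is probably not async-signal-safe.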