On Jul  2 15:25, Ken Brown wrote:
> On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
> >On Jul  2 14:13, Corinna Vinschen wrote:
> >>On Jul  1 22:10, Ken Brown wrote:
> >>>I may have spoken too soon.  As I repeat the experiment on a different
> >>>computer, with a build from a slightly different snapshot of the emacs
> >>>trunk, emacs crashes when I type 'C-x d' with the following stack dump:
> >>>
> >>>Stack trace:
> >>>Frame        Function    Args
> >>>00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
> >>>00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
> >>>00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> >>>00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> >>>End of stack trace
> >>>
> >>>$ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
> >>>
> >>>$ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
> >>
> >>That points to a crash while setting up the alternate stack.  This is
> >>always a possibility because, in contrast to the kernel signal handler
> >>in a real POSIX system, the Cygwin exception handler is still running on
> >>the stack which triggered the crash up to the point where we call the
> >>signal handler function.  Depending on how the stack overflow occurred,
> >>this additional stack usage may be enough to kill the process for good.
> >>
> >>Out of curiosity, can you add this to the init_sigsegv() function:
> >>
> >>  #include <windows.h>
> >>  [...]
> >>  init_sigsegv (void)
> >>  {
> >>    [...]
> >>    SetThreadStackGuarantee (65536);
> >
> >Of course this only works "per thread", so if init_sigsegv is called
> >for the main thread, only the main thread gets this treatment.  For
> >testing this should be enough, though.
>
> That didn't make any difference.

It should have.  If you don't also tweak STACK_DANGER_ZONE accordingly,
handle_sigsegv should fail to call siglongjmp.

Either way, I tested it locally as well, and it doesn't work.

In the meantime I found that there's another problem.  Assuming you
longjmp out of handle_sigsegv, the stack will still be "broken".  It
doesn't have the usual guard pages anymore, and the next time you have a
stack overflow, NTDLL will simply terminate the process.

I created a wrapper function which resets the stack so that it has valid
guard pages again; with that, the stack overflow can be handled
repeatedly.

While I was at it, I found that the setup for pthread stacks is not
quite right, either, so right now I'm hacking on this stuff to make it
behave as expected in the usual cases.

> But I do have a little more information.
> I tried running emacs under gdb with a breakpoint at handle_sigsegv.  The
> breakpoint is hit when I deliberately trigger the stack overflow.  Then I
> continue, emacs says it has recovered from the stack overflow, and I type
> 'C-x d'.  At this point there's a second SIGSEGV and handle_sigsegv is
> called again.  But this time garbage collection is in progress, and
> handle_sigsegv just gives up.

Sounds right to me.

> I don't know what caused the second SIGSEGV but I'll try to figure that out
> when I next have a chance to look at this.  I also don't know why the stack
> dump pointed to a crash while setting up the alternate stack, since the
> fatal crash actually seems to have happened later.  But maybe the stack was
> just completely messed up after the second SIGSEGV and the stack dump can't
> be trusted.
>
> More later.

Thanks!
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
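
For reference, the recovery pattern discussed in this thread (a SIGSEGV
handler that runs on an alternate stack, escapes with siglongjmp, and has
to keep working when the stack overflows a second time) can be sketched in
plain POSIX C.  This is only a minimal sketch, not the actual
handle_sigsegv from emacs and not Cygwin code.  All names in it
(on_sigsegv, recurse, install_overflow_handler, overflow_recovery_demo)
are made up for illustration.  It assumes a GCC or Clang compatible
compiler and a bounded stack size limit, and jumping out of a SIGSEGV
handler is a common convention rather than something POSIX guarantees.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack, sigaction, sigsetjmp */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;              /* where the handler jumps back to */
static volatile sig_atomic_t overflow_count;  /* number of overflows survived */
static volatile int keep_recursing = 1;       /* never cleared; keeps recursion opaque */

/* Runs on the alternate stack, because the normal stack is exhausted. */
static void on_sigsegv(int sig)
{
    (void) sig;
    overflow_count++;
    siglongjmp(recover_point, 1);  /* unwind to the frame that called sigsetjmp */
}

/* Deliberately unbounded, non-tail recursion with a sizable frame. */
__attribute__((noinline))
static int recurse(int depth)
{
    volatile char pad[1024];
    pad[0] = (char) depth;
    if (!keep_recursing)
        return depth;
    return recurse(depth + 1) + pad[0];
}

/* Register the alternate stack and the SIGSEGV handler; 0 on success. */
static int install_overflow_handler(void)
{
    static char alt_stack_mem[64 * 1024];
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = alt_stack_mem;
    ss.ss_size = sizeof alt_stack_mem;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigsegv;
    sa.sa_flags = SA_ONSTACK;      /* deliver the signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Overflow the stack 'rounds' times; returns how many overflows were
   recovered from, or -1 if the handler could not be installed. */
int overflow_recovery_demo(int rounds)
{
    volatile int i;

    assert(rounds >= 0);
    if (install_overflow_handler() != 0)
        return -1;
    overflow_count = 0;
    for (i = 0; i < rounds; i++) {
        /* savemask = 1 restores the signal mask, so SIGSEGV is
           deliverable again after each jump out of the handler. */
        if (sigsetjmp(recover_point, 1) == 0)
            recurse(0);
    }
    return (int) overflow_count;
}
```

On a system where the guard region is still in place after the first
overflow, overflow_recovery_demo (2) should report two recoveries.  The
second round is the case described above as failing on Cygwin until the
guard pages are reset after the longjmp.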