On 7/3/2015 6:47 AM, Corinna Vinschen wrote:
On Jul 2 15:25, Ken Brown wrote:
On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
On Jul 2 14:13, Corinna Vinschen wrote:
On Jul 1 22:10, Ken Brown wrote:
I may have spoken too soon. As I repeat the experiment on a different
computer, with a build from a slightly different snapshot of the emacs
trunk, emacs crashes when I type 'C-x d' with the following stack dump:
Stack trace:
Frame Function Args
00100A3E240 00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
00030000002 001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
00000000000 00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
00000000000 21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
End of stack trace
$ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
$ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
That points to a crash while setting up the alternate stack. This is
always a possibility because, in contrast to the kernel signal handler
in a real POSIX system, the Cygwin exception handler is still running on
the stack which triggered the crash up to the point where we call the
signal handler function. Depending on how the stack overflow occurred,
this additional stack usage may be enough to kill the process for good.
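For readers following along, here is a minimal sketch of the POSIX-side mechanism being discussed: a SIGSEGV handler that runs on an alternate stack and leaves via siglongjmp. It is an illustration under stated assumptions, not the code used by emacs or Cygwin. The names on_segv, recurse and recover_from_overflow are made up for this example, and the 64 KiB alternate stack and 4 KiB frame size are arbitrary choices. It assumes a POSIX system where the kernel itself switches to the alternate stack before the handler runs.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack, sigaction, sigsetjmp */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
static sigjmp_buf recovery_point;
static volatile sig_atomic_t handler_hits;

/* Runs on the alternate stack (SA_ONSTACK), so it does not need any
   room on the stack that has just overflowed. */
static void on_segv(int signo)
{
    (void) signo;
    handler_hits++;
    siglongjmp(recovery_point, 1);
}

/* Deliberately deep recursion; each frame holds a 4 KiB buffer.
   Reading pad[0] after the call keeps this from becoming a tail call. */
static int recurse(long depth)
{
    volatile char pad[4096];
    pad[0] = (char) depth;
    if (depth == 0)
        return 0;
    return recurse(depth - 1) + pad[0];
}

/* Returns 1 if a stack overflow happened and was recovered from,
   0 if the recursion finished without overflowing, -1 on setup failure. */
int recover_from_overflow(void)
{
    static char alt_stack[65536];   /* assumed size, not tuned */
    stack_t ss;
    struct sigaction sa;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sa.sa_flags = SA_ONSTACK;       /* deliver the signal on alt_stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)   /* some systems report SIGBUS */
        return -1;

    /* savemask = 1 so the signal mask is restored by siglongjmp. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        recurse(100000000L);        /* about 400 GB of frames if it could finish */
        return 0;
    }
    return 1;                       /* we arrived here from on_segv */
}
```

Calling recover_from_overflow() from main should return 1 on a typical Linux system with a bounded stack limit. The sketch says nothing about the Windows side: it does not reserve extra stack for an exception handler that still runs on the faulting stack, and it does not restore any guard pages after the jump.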
Out of curiosity, can you add this to the init_sigsegv() function:
  #include <windows.h>
  [...]
  init_sigsegv (void)
  {
    [...]
    /* SetThreadStackGuarantee takes a pointer to the requested size. */
    ULONG guarantee = 65536;
    SetThreadStackGuarantee (&guarantee);
Of course this only works "per thread", so if init_sigsegv is called
for the main thread, only the main thread gets this treatment. For
testing this should be enough, though.
That didn't make any difference.
It should have. If you don't also tweak STACK_DANGER_ZONE accordingly,
handle_sigsegv should fail to call siglongjmp. Either way, I tested
it locally as well, and it doesn't work.
In the meantime I found that there's another problem. Assuming you
longjmp out of handle_sigsegv, the stack will still be "broken".
It doesn't have the usual guard pages anymore, and the next time
you have a stack overflow, NTDLL will simply terminate the process.
I created a wrapper function which resets the stack so it has valid guard
pages again and then the stack overflow can be handled repeatedly.
While I was at it, I found that the setup for pthread stacks is not
quite right, either, so right now I'm hacking on this stuff to make
it behave as expected in the usual cases.
But I do have a little more information.
I tried running emacs under gdb with a breakpoint at handle_sigsegv. The
breakpoint is hit when I deliberately trigger the stack overflow. Then I
continue, emacs says it has recovered from the stack overflow, and I type
'C-x d'. At this point there's a second SIGSEGV and handle_sigsegv is
called again. But this time garbage collection is in progress, and
handle_sigsegv just gives up.
Sounds right to me.
I don't know what caused the second SIGSEGV but I'll try to figure that out
when I next have a chance to look at this. I also don't know why the stack
dump pointed to a crash while setting up the alternate stack, since the
fatal crash actually seems to have happened later. But maybe the stack was
just completely messed up after the second SIGSEGV and the stack dump can't
be trusted.
I think I found the cause of that second SIGSEGV, and, if I'm right, it has
nothing to do with Cygwin. I think the problem was that in my testing, I forgot
to reset max-specpdl-size and max-lisp-eval-depth to reasonable values after the
recovery from stack overflow. If I do that, then I can no longer reproduce the
crash.
For the record, here's my complete elisp test case:
(setq max-specpdl-size 83200000
      max-lisp-eval-depth 640000)
(defun foo () (foo))
(foo)
;; The stack has now overflowed, and emacs has recovered.
(setq max-specpdl-size 1300
      max-lisp-eval-depth 800)
;; Can now continue working.
Ken