-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Tom G. Christensen on 8/30/2008 7:23 AM:
>> At any rate, I'm also interested in seeing how c-stack behaves if you
>> don't link with libsigsegv (in the earlier snapshot, your build showed
>> that c-stack still attempted stack overflow detection and crashed, causing
>> ./stackovf.test to Fail; my hope is that I've fixed things in this
>> snapshot so that c-stack returns ENOTSUP, letting ./stackovf.test Skip).
>>
> Yes, it seems to do the right thing now.
>
> c_stack_action: Invalid argument
> SKIP: test-c-stack.sh
> c_stack_action: Invalid argument
> SKIP: test-c-stack2.sh

Good.  We've now proven that stack overflow detection is skipped on
platforms where we don't have it ported, rather than causing failures.
It's now only a question of whether we can easily port things to Irix, and
it's less work if we can just focus on the libsigsegv side of things.

>> Can you include the config.log snippet for that portion of the configure
>> run, so we can see if it was a compile or run failure?
>>
> configure:6499: checking for working C stack overflow detection
> configure:6591: cc -o conftest -g -I/usr/tgcware/include -L/usr/tgcware/lib
> -Wl,-rpath,/usr/tgcware/lib conftest.c >&5
> cfe: Warning 728: conftest.c, line 62: Long double not supported; double
> assumed.
> long double ld;
> ---^

How annoying.  The compiler supports 'long double', per C89 (since long
double and double are allowed to be the same type), but warns you any time
you use it.  At any rate, that warning is ignorable (it appears in lots of
places in your logs).

> configure:6595: $? = 0
> configure:6601: ./conftest
> ./configure[6603]: 7578 Memory fault(coredump)
> configure:6605: $? = 139
> configure: program exited with status 139

Runtime failure.  Oh well - I'm not sure what libsigsegv is doing to avoid
this, but I'm glad that it's not a compile-time failure, and that without
libsigsegv, the stack overflow tests are skipped.

> I've been using dbx but I have a copy of gdb 4.17 as well.
> Not sure how that would compare to gdb 6.x though.
> Running dbx with the corefile gives this backtrace:

I'm not familiar with dbx, but hopefully can offer enough advice.  And I'm
not sure how much gdb has improved since 4.17.

> [EMAIL PROTECTED] src]$ dbx conftest
> dbx version 3.19 Nov 3 1994 19:59:46
> Core from signal SIGSEGV: Segmentation violation

No mention of the handler, unlike in your other backtrace.  It looks like
the non-libsigsegv approach dumps core even before a handler gets a chance
to run.

> (dbx) t
>> 0 recurse(p = (nil)) ["/usr/people/tgc/buildpkg/m4/src/conftest.c":96,
>> 0x400bec]
> 1 recurse(p = 0x7ff0038c = "\001")
> ["/usr/people/tgc/buildpkg/m4/src/conftest.c":99, 0x400c10]
> 2 recurse(p = 0x7ff005a4 = "\001")
> ["/usr/people/tgc/buildpkg/m4/src/conftest.c":99, 0x400c10]

Hmm, based on the pattern (each frame occupies 536 bytes), I would have
expected the debugger to report that p = 0x7ff00174 rather than (nil) in
the final frame before the stack overflow; but that may be a debugger
anomaly.

I just noticed that the c-stack.m4 file doesn't check the return status
from sigaction.  I suspect it worked for you, but just to be sure, could
you retry this with this patch to the program in config.log:

@@ -85,8 +85,7 @@ AC_DEFUN([AC_SYS_XSI_STACK_OVERFLOW_HEURISTIC],
     setrlimit (RLIMIT_STACK, &rl);
   #endif
- -  c_stack_action ();
- -  return recurse ("\1");
+  return c_stack_action () && recurse ("\1");
 }

> I cut out 8->1015, they're exactly the same, just the value for p changes.

Expected - we are inducing stack overflow by rapidly stepping through the
stack; the only interesting things are the newest one or two frames at the
point where stack overflow occurred.

>> Regardless of those test results, you
>> should now be able to run the just-built debugging version of
>> 'tests/test-c-stack'.
>> Run without arguments to trigger stack overflow,
>> and with any arguments (contents don't matter, just that argc>1) to
>> trigger an unrelated segv; it would be nice to step through both of those
>> cases in the debugger and see why they are dying abruptly rather than
>> detecting the problem and printing a nice status.
>>
> I tried to do this, but after poking at it for a while I realised I
> have no idea what I should be looking for, what I should be stepping
> through or even how to do it right :(
>
> What I found, if anything, is that running test-c-stack with no arguments
> results in the same backtrace as above.
> Running it with arguments results in:
> Executable
> /usr/people/tgc/buildpkg/m4/src/m4-1.4.11.42-864d/tests/test-c-stack
> (dbx) run blah
> Process 18820 (test-c-stack) started
> Process 18820 (test-c-stack) stopped on signal SIGSEGV: Segmentation
> violation (handler sigsegv_handler) [main:73 +0x18,0x400e78]

Notice how this instance referenced a handler.  I imagine that if you
continued single-stepping, you would then step through the statements in
sigsegv_handler.  At any rate, gdb does the same thing - it tells where
SIGSEGV occurs, and then proceeding to single step goes through the
handler.

Here's my sample session, on cygwin; I found that adding a breakpoint on
overflow_handler (the callback that got registered with
stackoverflow_install_handler) was important to see what I wanted:

$ ./test-c-stack
overflow_handler emergency=0 segv_handler_missing=0
./test-c-stack: stack overflow
$ ./test-c-stack 1
segv_handler serious=1
./test-c-stack: program error
Segmentation fault
$ gdb ./test-c-stack
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
...
(gdb) b overflow_handler
Breakpoint 1 at 0x40128a: file c-stack.c, line 184.
(gdb) r
Starting program: /home/eblake/m4-branch/tests/test-c-stack.exe
[New thread 4368.0x17a4]
[New thread 4368.0x1520]

Program received signal SIGSEGV, Segmentation fault.
recurse (p=0x4303c "\001") at test-c-stack.c:48
48 array[0] = 1;
(gdb) c
Continuing.

Breakpoint 1, overflow_handler (emergency=0, context=0x407d74)
at c-stack.c:184
184 sprintf (buf, "overflow_handler emergency=%d segv_handler_missing=%d\n",
(gdb) n
186 write (STDERR_FILENO, buf, strlen (buf));
(gdb) n
overflow_handler emergency=0 segv_handler_missing=0
190 die ((!emergency || segv_handler_missing) ? 0 : SIGSEGV);
(gdb) s
die (signo=0) at c-stack.c:108
108 segv_action (signo);
(gdb) n
109 message = signo ? program_error_message : stack_overflow_message;
(gdb) n
110 write (STDERR_FILENO, program_name, strlen (program_name));
(gdb) n
/home/eblake/m4-branch/tests/test-c-stack111 write (STDERR_FILENO, ": ", 2);
(gdb) n
: 112 write (STDERR_FILENO, message, strlen (message));
(gdb) n
stack overflow113 write (STDERR_FILENO, "\n", 1);
(gdb) n
114 if (! signo)
(gdb) n
115 _exit (exit_failure);
(gdb) n

Program exited with code 01.

What I suspect is that, since m4's use of libsigsegv passed but c-stack's
did not, something in c-stack's overflow_handler is triggering a secondary
segv.  Installing a breakpoint on overflow_handler will show whether we
actually get there on stack overflow (I hope so, since that part is all
libsigsegv, which is working on your platform), and if so, where it is
dying from the secondary segv.

- --
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAki79VYACgkQ84KuGfSFAYBOdACgu4/F+ge7SHn1SL0WMEFKJx2d
gdwAoKN2lnOVpMyhSJ7iFXtLHndEZFxo
=Fmdn
-----END PGP SIGNATURE-----