After some hard, clever work by my colleague, we've managed to narrow this one down a bit further.

First, we compiled racket with optimization disabled, and we still see an all-zeroed Scheme_Jumpup_Buf. Please see our gdb session at http://pastebin.com/aBx2FTcK

Second, we've managed to consistently reproduce the segfault with a single line of code (a core dump of that racket session looks a lot like the one from our own code). The offending line is

(let loop () (thread (const '())) (loop))

Obviously, we don't have that exact line in our production code :) but it produces the same error more quickly and more consistently. Interestingly,

(let loop () (thread (thunk '())) (loop))

does not produce a segfault.
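For anyone comparing the two forms, here is a small sketch of how they differ. It assumes both `const` and `thunk` come from racket/function. The helper `spawn-threads` and the count of 1000 are hypothetical, added only to give a bounded variant of the loops above; we have not checked that a bounded run is enough to trigger the crash.

```racket
#lang racket/base
(require racket/function)

;; `const` is an ordinary function: (const '()) returns a procedure that
;; accepts any number of arguments and always returns '().
(define from-const (const '()))

;; `thunk` is a macro: (thunk '()) expands to (lambda () '()),
;; a procedure that takes exactly zero arguments.
(define from-thunk (thunk '()))

;; Both can be called with no arguments, which is all `thread` requires.
(from-const)  ; => '()
(from-thunk)  ; => '()

;; The visible difference is the arity of the resulting procedure.
(procedure-arity from-const)
(procedure-arity from-thunk)

;; Hypothetical helper: a bounded version of the loops above. `make-proc`
;; builds a fresh procedure per iteration, as the original lines do.
(define (spawn-threads make-proc n)
  (for ([i (in-range n)])
    (thread (make-proc))))

(spawn-threads (lambda () (const '())) 1000)
(spawn-threads (lambda () (thunk '())) 1000)
```

The only difference this shows is the shape of the procedure handed to `thread`; it says nothing about why one form crashes and the other does not.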

We've caught this segv both on racket compiled on an AWS machine and on the Mac OS X binaries distributed by you guys.

On 2013-05-03 02:04, Matthew Flatt wrote:
A stack overflow in scheme_uncopy_stack() sounds like a thread that is
trying to jump to a continuation whose representation is corrupted. (An
all-zeroed Scheme_Jumpup_Buf could have that effect, but I don't
particularly trust gdb to tell us the actual content, unless you
disabled optimization when compiling `racket'.)

Assuming that the latest in the Racket git repo doesn't work any better
for you --- and I don't expect that it does in this case --- if you can
send me something to run that provokes the crash, I can investigate
more.

At Thu, 02 May 2013 18:42:56 +0100, Matthew Eric Bassett wrote:
Hi all,

It might be better to send this to dev@racket-lang.  Then again, it
might be completely useless to them.

So we have a job scheduler program written in racket that handles
various places and tcp clients.  This program sporadically and
inconsistently terminates with the following error message:

SIGSEGV MAPERR si_code 1 fault on addr 0x7fffb044ef48

We've caught the error at various points of execution, but
can't consistently reproduce it (yet). We do have a core dump of the
program running and terminating from the racket repl (loaded with
"enter!") v 5.3.3.  I've made it available via dropbox at
https://www.dropbox.com/s/rkd6pl511acll2r/core.12346.gz.

I don't have much experience reading core dumps, but it looks to me
like racket is hitting a stack overflow in scheme_uncopy_stack in
setjmpup.c.

In particular, at the first time scheme_uncopy_stack appears in the
stack with args scheme_uncopy_stack (ok=0, b=0x7f857b2b3510,
prev=0x7fffb044f5e0) we have:

>>(gdb) p prev
$1 = (intptr_t *) 0x7fffb044f5e0
>>(gdb) p *prev
$2 = 0
>>(gdb) p b
$3 = (Scheme_Jumpup_Buf *) 0x7f857b2b3510
>>(gdb) p *b
$4 = {stack_from = 0x0, stack_copy = 0x0, stack_size = 0,
stack_max_size = 0, cont = 0x0, buf = {jb = {{jb = {{
__jmpbuf = {0, 0, 0, 0, 0, 0, 0, 0}, __mask_was_saved = 0,
__saved_mask = {__val = {
0 <repeats 16 times>}}}}, stack_frame = 0}}, gcvs = 0,
gcvs_cnt = 0}, gc_var_stack = 0x0,
   external_stack = 0x0}

scheme_uncopy_stack remains in the stack for several thousand frames.

The racket interpreter was compiled from source (so I don't know if
others can even read that coredump!) on a linux kernel
3.4.37-40.44.amzn1.x86_64 #1 SMP Thu Mar 21 01:17:08 UTC 2013 x86_64
x86_64 x86_64 GNU/Linux with glibc/-devel
glibc-2.12-1.107.43.amzn1.x86_64.

I was able to capture the same error running from the Mac OS X binaries
5.3.3 from racket-lang.org.  That core dump is available at
https://www.dropbox.com/s/jfneqr4zlkmkjhh/core.41166.gz.


Is this an error in racket? If not, do you have any suggestions on how
I can proceed in debugging this (I'm at a loss), or even how to figure
out which bits of my racket code to look at?  (I've tried doing "info
locals" from gdb at various points in the stack, but I've not reached
enlightenment.  Again, I have little experience with reading core
dumps.)

I've not included any of our racket code, as we don't know which part
is causing the problem.

Thanks for reading,

--
Matthew Eric Bassett | http://mebassett.info
____________________
  Racket Users list:
  http://lists.racket-lang.org/users

--
Matthew Eric Bassett | http://mebassett.info