On 05/20/2011 10:00 AM, Neil Van Dyke wrote:
If someone came to you and said, "We're using PLT 4.2.5 with CGC and
JIT, and we are wondering whether reliability would be improved by
moving to Racket 5.x and/or moving to 3m and/or disabling 4.2.5's
JIT," what would you say?

Details... A big installation of PLT 4.2.5 (with CGC, and with JIT
enabled) has noticed a rare unexplained crash of the app. This is less
than 100.0000% reliability, which bothers us more than it would most
organizations. The app does still use old-style CGC C extensions to
call one C library. The C library itself is widely used in industry,
and is not suspect. It's possible that the C extensions are doing
something wrong, although they have seemed solid for high volume for
years, and (though I did not write them myself) they seem to me to be
doing the right things for GC safety. It's also possible that the
Scheme or C code of the app is not handling all the conditions of the
library properly, and on rare occasions will then use the library
in an invalid way, such as with a bad pointer or causing a vomit on
the heap or stack. This has occurred on multiple boring Linux servers,
so hardware is not suspect. We have not ruled out the possibility of a
freak bug in PLT.

We have set up core dumps and instrumented much of the code for
detailed logging, and are attempting to stimulate the rare crash in a test
environment. We have also started some new rigorous analysis of the
bits of C code. But we're also wondering whether there are known
instability problems with the older PLT stuff we're using, and if we'd
be better off, *stability-wise*, moving to Racket 5.x, moving to 3m
(which probably means using FFI for our library, or replacing it with
pure Scheme Racket code), or disabling the 4.2.5 JIT.

Can't address your direct question, but here are some questions I've had luck with.
- are there multiple threads?  multiple processes?
- possible overflow somewhere?
- holding pointers after free?
- have you tried running under a tool like Valgrind or TotalView's memory debugger?
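
For the overflow question, one cheap check is to wrap the allocations made by the C glue code with guard words and verify them before freeing. What follows is only a minimal sketch under stated assumptions: the names guarded_malloc, guarded_ok, guarded_free and GUARD_WORD are hypothetical, nothing here is part of PLT or of the library in question, and it only catches writes that land on the guard words next to a buffer. A memory debugger covers far more cases.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard value; any unlikely bit pattern would do. */
#define GUARD_WORD UINT64_C(0xDEADBEEFCAFEF00D)

/* Header stored in front of every payload. The alignment request keeps the
 * payload that follows it suitably aligned for any object type (C11). */
typedef struct {
    _Alignas(max_align_t) size_t size; /* payload size the caller asked for */
    uint64_t head;                     /* guard word in front of the payload */
} guard_header;

/* Allocate n payload bytes with a guard word before and after them. */
void *guarded_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(guard_header) - sizeof(uint64_t))
        return NULL;                   /* total size would overflow */
    guard_header *h = malloc(sizeof *h + n + sizeof(uint64_t));
    if (h == NULL)
        return NULL;
    h->size = n;
    h->head = GUARD_WORD;
    unsigned char *payload = (unsigned char *)(h + 1);
    uint64_t tail = GUARD_WORD;
    memcpy(payload + n, &tail, sizeof tail); /* tail may be unaligned */
    return payload;
}

/* Return 1 if both guard words are intact, 0 if either was overwritten. */
int guarded_ok(const void *p) {
    const guard_header *h = (const guard_header *)p - 1;
    if (h->head != GUARD_WORD)
        return 0;                      /* underrun: do not trust h->size */
    uint64_t tail;
    memcpy(&tail, (const unsigned char *)p + h->size, sizeof tail);
    return tail == GUARD_WORD;
}

/* Check the guards, release the block, and report what the check found. */
int guarded_free(void *p) {
    if (p == NULL)
        return 1;
    int ok = guarded_ok(p);
    free((guard_header *)p - 1);
    return ok;
}
```

Logging every guarded_free that returns 0, together with the call site, would at least tell you whether the extension code ever writes past the end of its own buffers. It does nothing for pointers held after free, which is where Valgrind-style tools earn their keep.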

Later,
Daniel

_________________________________________________
 For list-related administrative tasks:
 http://lists.racket-lang.org/listinfo/users
