On Friday, July 10, 2015 at 12:05:55 AM UTC-7, Matthew Flatt wrote:
> At Thu, 9 Jul 2015 10:08:53 -0700 (PDT), Scott Bell wrote:
> > On Thursday, July 9, 2015 at 12:27:26 AM UTC-7, Matthew Flatt wrote:
> > > At Wed, 8 Jul 2015 18:23:32 -0700 (PDT), Scott Bell wrote:
> > > > On Wednesday, July 8, 2015 at 4:05:20 PM UTC-7, Scott Bell wrote:
> > > > > On Wednesday, July 8, 2015 at 3:48:22 PM UTC-7, neil wrote:
> > > > > > > Does adding the executable pathname to the `gdb` command line
> > > > > > > (i.e., format `gdb EXECUTABLE-FILE CORE-FILE`) give you the symbols?
> > > > > 
> > > > > Ah, of course. Yes, now we're getting somewhere:
> > > > > 
> > > > > ...
> > > > >     #6  0x000000080153e004 in strlen () from /lib/libc.so.7
> > > > > >     #7  0x0000000800a72bf3 in scheme_make_byte_string_without_copying ()
> > > > >        from /usr/local/lib/libracket3m-6.1.1.so
> > > > >     #8  0x0000000800ade3f2 in c_to_scheme ()
> > > > >        from /usr/local/lib/libracket3m-6.1.1.so
> > > > >     #9  0x0000000800adee02 in ffi_do_call ()
> > > > >        from /usr/local/lib/libracket3m-6.1.1.so
> > > > > ...
> > > > > 
> > > > > That certainly points me in the direction of a bad FFI call.
> > > > 
> > > > When I say bad FFI call, I'm leaving open the possibility that
> > > > the issue may be in the FFI machinery as well as in our source,
> > > > but I should point out that the minimal amount of FFI code 
> > > > that we have is years old and has been running without issue.
> > > > We first observed this crash only a few months ago. 
> > > 
> > > Do you know what foreign function is being called?
> > > 
> > > If so, does it return a GC-managed string pointer, or is it one that's
> > > outside the GC's management?
> > 
> > I don't know which foreign function is being called, but 
> > we don't have a lot of them, and they're all to either
> > system or third-party libraries. Based on the stack trace 
> > of the crash, I'm assuming the relevant calls are the 
> > ones that return a byte string, so here they are:
> > 
> >   (_fun _string _string -> _bytes)
> > 
> >     The return value here is 'char *', and may be NULL on
> >     failure. Racket seems to handle this correctly based
> >     on manual testing and returns #f in the failure case.
> > 
> >   (_fun [EVP_MD : _fpointer = (evp-sha1)]
> >         [key : _bytes]
> >         [key_len : _int = (bytes-length key)]
> >         [data : _bytes]
> >         [data_len : _int = (bytes-length data)]
> >         [md : (_bytes o 20)]
> >         [md_len : (_ptr o _uint)]
> >         -> _bytes
> >         -> md)
> > 
> >     This is the function signature for HMAC from libcrypto,
> >     and in fact this is in Racket at:
> >         web-server-lib/web-server/stuffers/hmac-sha1.rkt
> >     The return value should be the same pointer passed in
> >     as `md` or NULL on failure.
> > 
> > So it looks like one call returns callee-allocated memory, and
> > the other returns memory allocated by Racket using the
> > `(_bytes o n)` custom function type. I'm not sure whether or
> > not the latter is GC-managed.
> 
> I see that the byte string produced by `(_bytes o n)` is not really
> compatible with `_bytes` as a return type. The problem is that `(_bytes
> o 20)` allocates only 20 bytes, instead of 21 bytes with the last byte
> as a terminator. It's possible that the allocated bytes are not zeroed,
> and attempting to get the length of the byte string as a result runs
> off the edge of an allocated page to unallocated space.
> 
> Does using this definition of `_bytes*` (which is exported from
> `ffi/unsafe` as `_bytes`) change anything for your program?
> 
> (define-fun-syntax _bytes*
>   (syntax-id-rules (o)
>     [(_ o n) (type: _gcpointer
>               pre:  (let ([bstr (make-sized-byte-string (malloc (add1 n)) n)])
>                       (ptr-set! bstr _byte n 0)
>                       bstr)
>               ;; post is needed when this is used as a function output type
>               post: (x => (make-sized-byte-string x n)))]
>     [(_ . xs) (_bytes . xs)]
>     [_ _bytes]))

I've imported this change and will deploy it shortly. This
seems like a probable fix. An alternative would be to
construct the returned byte string using `md_len` and
avoid the call to strlen entirely. I'm not familiar enough
with the FFI to tell how straightforward that would be.
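
For what it's worth, here is a rough sketch of that alternative, not
a definitive implementation. It assumes that declaring the C return
value as `_pointer` rather than `_bytes` stops the FFI from
converting it to a byte string, and therefore from calling strlen on
it. The names `evp-sha1` and `hmac-sha1` are placeholders of mine,
and I use `_pointer` for the EVP_MD argument on the assumption that
it is an ordinary data pointer.

```racket
#lang racket/base
;; Sketch only: HMAC-SHA1 through libcrypto, with the result built from
;; md_len instead of being converted from the returned C pointer.
(require ffi/unsafe
         openssl/libcrypto) ; provides `libcrypto`, or #f if loading failed

;; const EVP_MD *EVP_sha1(void);
(define evp-sha1
  (get-ffi-obj 'EVP_sha1 libcrypto (_fun -> _pointer)))

(define hmac-sha1
  (get-ffi-obj 'HMAC libcrypto
    (_fun [EVP_MD : _pointer = (evp-sha1)]
          [key : _bytes]
          [key_len : _int = (bytes-length key)]
          [data : _bytes]
          [data_len : _int = (bytes-length data)]
          [md : (_bytes o 20)]       ; SHA-1 digests are 20 bytes
          [md_len : (_ptr o _uint)]  ; filled in by HMAC
          ;; The result is declared as a plain pointer, so it is only
          ;; checked against NULL and never scanned for a terminator.
          -> [r : _pointer]
          ;; #f on failure; otherwise the first md_len bytes of md.
          -> (and r (subbytes md 0 md_len)))))
```

With this, `(hmac-sha1 key data)` should return either #f or a byte
string whose length comes from `md_len`, so the missing terminator
in the 20-byte buffer would no longer matter.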

Thanks Matthew, Neil, and Juan for the help! 
