Hi Neil!

Neil Jerram <n...@ossau.uklinux.net> writes:

> l...@gnu.org (Ludovic Courtès) writes:
>
>> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
>> branch [0, 1], which means that there's no cell->buffer indirection,
>> which greatly simplifies code (it also takes less room and may slightly
>> improve performance).
>>
>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>
> That seems a bit of a shame.  (i.e. that we can't advantageously keep
> the caller's string or vector data)

It’s not such a shame IMO because:

  * You have to allocate anyway, to store the (double) cell, and
    allocating the whole thing may be just as costly as allocating the
    cell, at least for small stringbufs/bytevectors.

  * For stringbufs, the user-provided buffer can be reused only if it’s
    either Latin-1 or UCS-4, anyway.

  * Removing the indirection and using only GC-managed memory is
    beneficial for Scheme code (which doesn’t use ‘scm_take’).

  * Reusing the malloc(3)-allocated buffer means that we have to
    register a finalizer to later free(3) that buffer (see, e.g., commit
    d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see,
    e.g.,
    http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).

That said...

> Did you consider the option of
>
> - always having an indirection from the stringbuf/bytevector object to
>   the underlying data

... this may be valuable (Andy pointed it out as well), at least for
bytevectors.  The indirection is a requirement for Andy’s
SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
be supported; it’s also required if we want to provide mmap(3) bindings,
for instance, that return a bytevector.

For stringbufs, though, I’m happy if we can leave the code as it is.

Thanks,
Ludo’.