I should have said disabled by default...
On Jan 21, 2007, at 5:21 PM, Ian Eslick wrote:
How often do users of Elephant rely on BTrees where the symbols are
ordered according to the symbol's name?
Would it be terribly inconvenient to you if you had to convert
symbol keys to strings to get an alphabetical ordering, but were
still assured of contiguity of identical symbol keys in secondary
btrees?
The argument is that we serialize symbols to strings all the time
(slot access, etc) and this engenders a great deal of overhead.
Most symbol serialization is highly redundant and can be factored
out by assigning a persistent ID to each symbol as we do with
persistent objects. This results in significantly less disk space
(a constant vs. N*char_width bits for every slot value), reduced IO
bandwidth (and increased locality), and less serialization/
deserialization time.
The downside is that the C function passed to BDB which is used to
compare two strings so that the BTree is ordered does not have
access to the persistent table. Thus we can't order strings
according to their characters, but only according to their ID which
means a random order. Of course symbols will be identical to
themselves and so will be grouped together in duplicate indices.
Sorting according to the characters of the symbol may be possible,
but there are a number of implications that require some thinking
about and I want to put this off for now.
As this is a user configurable option (my-config.sexp) and will be
enabled by default I don't think there is any harm in promoting this.
Comments?
Thank you,
Ian
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel