[elephant-devel] BTREE Sorting on Symbol Strings?

Ian Eslick Sun, 21 Jan 2007 14:22:00 -0800

How often do users of Elephant rely on BTrees where the symbols are ordered according to the symbol's name?

Would it be terribly inconvenient to you if you had to convert symbol keys to strings to get an alphabetical ordering, but were still assured of contiguity of identical symbol keys in secondary btrees?

The argument is that we serialize symbols to strings all the time (slot access, etc.), and this engenders a great deal of overhead. Most symbol serialization is highly redundant and can be factored out by assigning a persistent ID to each symbol, as we do with persistent objects. This results in significantly less disk space (a constant vs. N*char_width bits for every slot value), reduced IO bandwidth (and increased locality), and less serialization/deserialization time.
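To make the space argument concrete, here is a minimal sketch in Python (not Elephant's actual serializer). The SymbolTable class, the 4-byte length prefix and the 4-byte ID width are all assumptions chosen for illustration; the real on-disk format may differ.

```python
import struct


class SymbolTable:
    """Hypothetical in-memory stand-in for a persistent symbol-to-ID table."""

    def __init__(self):
        self._ids = {}     # symbol name -> integer ID
        self._names = []   # integer ID -> symbol name

    def intern(self, name):
        """Return the ID for name, assigning the next free ID on first use."""
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def name_of(self, symbol_id):
        """Recover the symbol name when deserializing."""
        return self._names[symbol_id]


def serialize_as_string(name):
    # Assumed layout: 4-byte big-endian length, then the UTF-8 characters.
    data = name.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def serialize_as_id(table, name):
    # Assumed layout: a fixed 4-byte big-endian ID, whatever the name length.
    return struct.pack(">I", table.intern(name))


table = SymbolTable()
slot_names = ["customer-name", "customer-address", "customer-name"]

string_bytes = sum(len(serialize_as_string(s)) for s in slot_names)
id_bytes = sum(len(serialize_as_id(table, s)) for s in slot_names)
print(string_bytes, id_bytes)
```

With these assumed widths, the string encoding grows with the length of every name written, while the ID encoding costs a constant four bytes per occurrence.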

The downside is that the C function passed to BDB, which compares two keys so that the BTree is ordered, does not have access to the persistent table. Thus we can't order symbols according to their characters, but only according to their ID, which means an essentially random order. Of course symbols will be identical to themselves and so will be grouped together in duplicate indices.
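The ordering consequence can be sketched the same way. This Python fragment only imitates what an ID-based comparison would do to key order; it is not the BDB comparison function, and the symbol_ids mapping is a made-up assignment in first-seen order.

```python
from itertools import groupby

# Hypothetical IDs, assigned in the order the symbols were first stored.
symbol_ids = {"zebra": 0, "apple": 1, "mango": 2}

# Symbol keys as they might appear in a secondary index with duplicates.
keys = ["apple", "zebra", "mango", "apple", "zebra"]

# What a comparator that sees only IDs would produce.
by_id = sorted(keys, key=lambda name: symbol_ids[name])

# What a comparator that sees the characters would produce.
by_name = sorted(keys)


def is_contiguous(seq):
    """True if every distinct key occupies a single unbroken run in seq."""
    runs = [key for key, _ in groupby(seq)]
    return len(runs) == len(set(runs))


print(by_id)
print(by_name)
print(is_contiguous(by_id))
```

Ordering by ID loses the alphabetical sequence, but identical keys still sit next to each other, which is the property duplicate indices depend on.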

Sorting according to the characters of the symbol may be possible, but there are a number of implications that require some thought, and I want to put this off for now.

As this is a user-configurable option (my-config.sexp) and will be enabled by default, I don't think there is any harm in promoting this.


Comments?

Thank you,
Ian
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
