[elephant-devel] BTREE Sorting on Symbol Strings?
Ian Eslick
eslick at csail.mit.edu
Mon Jan 22 01:55:42 UTC 2007
I should have said disabled by default...
On Jan 21, 2007, at 5:21 PM, Ian Eslick wrote:
> How often do users of Elephant rely on BTrees where the symbols are
> ordered according to the symbol's name?
>
> Would it be terribly inconvenient to you if you had to convert
> symbol keys to strings to get an alphabetical ordering, but were
> still assured of contiguity of identical symbol keys in secondary
> btrees?
>
> The argument is that we serialize symbols to strings all the time
> (slot access, etc) and this engenders a great deal of overhead.
> Most symbol serialization is highly redundant and can be factored
> out by assigning a persistent ID to each symbol as we do with
> persistent objects. This results in significantly less disk space
> (a constant vs. N*char_width bits for every slot value), reduced IO
> bandwidth (and increased locality), and less serialization/
> deserialization time.
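
A minimal sketch of the space argument above, in Python rather than Elephant's own Lisp serializer. Everything here is an assumption for illustration: `SymbolTable`, `serialize_as_string`, and `serialize_as_id` are hypothetical names, the table is an in-memory stand-in for a persistent one, and the 4-byte ID width is arbitrary.

```python
import struct


class SymbolTable:
    """Hypothetical stand-in for a persistent symbol <-> ID table."""

    def __init__(self):
        self._ids = {}     # symbol name -> ID
        self._names = []   # ID -> symbol name

    def intern(self, name):
        # Assign the next free ID the first time a symbol is seen.
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def name_of(self, sym_id):
        return self._names[sym_id]


def serialize_as_string(name):
    # Length-prefixed UTF-8: the cost grows with the symbol's name.
    data = name.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def serialize_as_id(table, name):
    # Fixed-width ID (assumed 4 bytes): constant cost per occurrence.
    return struct.pack(">I", table.intern(name))


table = SymbolTable()
slot = "customer-shipping-address"
by_string = serialize_as_string(slot)   # 4-byte length + 25 name bytes
by_id = serialize_as_id(table, slot)    # 4 bytes, whatever the name length
```

The name is written once, in the table, and every later occurrence of the symbol costs only the fixed-width ID.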
>
> The downside is that the C function passed to BDB, which compares
> two keys so that the BTree is ordered, does not have access to the
> persistent symbol table. Thus we can't order symbols according to
> their characters, but only according to their ID, which means an
> arbitrary order. Of course symbols will be identical to
> themselves and so will be grouped together in duplicate indices.
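
To illustrate the ordering consequence, here is a small Python sketch. It is not BDB's comparison function or Elephant code; the symbol names and the "IDs are handed out in order of first use" rule are assumptions made for the example.

```python
# Symbols in the order they are first stored; IDs follow that order.
symbols = ["pear", "apple", "fig", "apple", "pear"]

ids = {}
for name in symbols:
    ids.setdefault(name, len(ids))        # pear=0, apple=1, fig=2
names = {sym_id: name for name, sym_id in ids.items()}

# What a comparison function sees when keys are IDs: integers only.
keys_in_btree_order = sorted(ids[name] for name in symbols)
names_in_btree_order = [names[k] for k in keys_in_btree_order]

# What a user expecting alphabetical order would want instead.
names_alphabetical = sorted(symbols)
```

Here `names_in_btree_order` is `pear, pear, apple, apple, fig`: duplicates stay adjacent, but the sequence is not alphabetical. Getting `apple, apple, fig, pear, pear` requires storing the names as string keys.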
>
> Sorting according to the characters of the symbol may be possible,
> but it has a number of implications that need more thought, so I
> want to put this off for now.
>
> As this is a user configurable option (my-config.sexp) and will be
> enabled by default I don't think there is any harm in promoting this.
>
> Comments?
>
> Thank you,
> Ian