[elephant-devel] BTREE Sorting on Symbol Strings?

Ian Eslick eslick at csail.mit.edu
Mon Jan 22 01:55:42 UTC 2007


I should have said disabled by default...

On Jan 21, 2007, at 5:21 PM, Ian Eslick wrote:

> How often do users of Elephant rely on BTrees where the symbols are  
> ordered according to the symbol's name?
>
> Would it be terribly inconvenient to you if you had to convert  
> symbol keys to strings to get an alphabetical ordering, but were  
> still assured of contiguity of identical symbol keys in secondary  
> btrees?
>
> The argument is that we serialize symbols to strings all the time  
> (slot access, etc) and this engenders a great deal of overhead.   
> Most symbol serialization is highly redundant and can be factored  
> out by assigning a persistent ID to each symbol as we do with  
> persistent objects.  This results in significantly less disk space  
> (a constant vs. N*char_width bits for every slot value), reduced IO  
> bandwidth (and increased locality), and less serialization/ 
> deserialization time.
>
> The downside is that the C function passed to BDB which is used to  
> compare two strings so that the BTree is ordered does not have  
> access to the persistent table.  Thus we can't order strings  
> according to their characters, but only according to their ID which  
> means a random order.  Of course symbols will be identical to  
> themselves and so will be grouped together in duplicate indices.
>
> Sorting according to the characters of the symbol may be possible,  
> but there are a number of implications that require some thinking  
> about and I want to put this off for now.
>
> As this is a user configurable option (my-config.sexp) and will be  
> enabled by default I don't think there is any harm in promoting this.
>
> Comments?
>
> Thank you,
> Ian
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list