[elephant-devel] Re: traversing btree using multiple indices

Ian Eslick eslick at media.mit.edu
Fri Apr 4 13:45:20 UTC 2008


Hi there,

The string tuple-sorting hack makes me cringe.  I know that I could  
make BDB and the lisp side sort lists based on their constituents  
pretty easily if you felt we could make the same hack work in  
postmodern.

(cons "Fred" 23) < (cons "Fred" 25) < (cons "Sally" 10)

Maybe we could create a special tuple object for sorting aggregates so  
we didn't break the 'cannot sort on non-primitives' guarantee today.


As for Sean's request about doing an efficient intersection, as Alex  
was explaining there are only two ways to do this efficiently:

1) Select the smallest set of objects (date range or tag name) and  
iterate over it filtering by the value of the other

2) Select both sets, but only by OID and then merge the OIDs and  
iterate the OID list creating only the objects you need.  (elephant- 
unstable makes this easy and this should show up later).

(If you wanted the OIDs in #2 sorted by date, then you'd have to find  
a way to do an OID merge and maintain the date information so you  
didn't have to resort.  I have ideas on this but it isn't implemented.)


However, premature optimization often causes more trouble than it's  
worth.  The easiest thing to do to get going is to index both slots  
and say:

(intersection (get-instances-by-range 'event 'date <date1> <date2>)
               (get-instances-by-value 'event 'tag <tagname>))

You may be surprised how fast that is!  This will work for thousands  
of objects per set with BDB.

The new unstable branch removes the MOP overhead so if your set sizes  
are in the low 10's of thousands this should take less than a second.

Ian

On Apr 4, 2008, at 8:20 AM, Alex Mizrahi wrote:
> SR> What is the best/easiest/most-elephantish way to retrieve all  
> events
> SR> in the btree which have the 'lisp' tag and whose date falls on  
> today?
>
> so you have lots of possible tags and many tags per event?
> then you need a btree with entry for each event for each tags, that is
> (<tag, date> -> event_id) btree.
> just as with SQL you'd need (tag, date, event_id) table.
> as elephant cannot sort tuples, you need to convert them to strings  
> as i've
> noted before.
>
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list