[elephant-devel] Re: Derived Indicies

Ian Eslick eslick at media.mit.edu
Thu May 8 14:28:45 UTC 2008


On May 8, 2008, at 6:14 AM, Leslie P. Polzer wrote:

>
> Wouldn't it make sense to generalize the class index mechanism?

class indexing is basically a specialization ofbtrees.  It's pretty  
easy to build your own indexes for special purposes and to store them  
wherever you want.  Common idioms can be optimized via macros

> At the moment we have one big global name space for indexed objects.
> Storing everything there is hellishly expensive for many purposes.
>
> Example:
>
>  (intersection
>    (get-instances-by-value 'message 'owner "Charlie")
>    (get-instances-by-value 'message 'folder 'inbox)
>    ...)
>

> has unholy complexity and is therefore unsuited to numbers
> as low as 1k objects.

You can always try Alex's approach, create a derived index on 'message  
which
computes and sorts on: (format nil "~A::~A" owner folder).  Postmodern  
doesn't
sort on aggregates like BDB can or you could do (cons owner folder)  
instead.

> It would be nice to store Charlie's inbox right in Charlie's
> user object, making it a btree with indices defined by the
> user in DEFPCLASS.

You mean something like:

(defpclass person ()
   ((name ...)
    (inbox :accessor inbox :initform (make-indexed-btree message)
	:index-on (date sender))))

Which would create a indexed-btree that stored messages and auto- 
created indices on the date and sender slots?

That is a bit more OODB/Lispy than the association mechanism I've  
implemented.  However I don't think you need to overload defpclass to  
get this functionality.  In fact I think it starts to get too ugly.   
inbox should be an object (class instance or btree) that has its own  
api.

(defmethod initialize-instance :after ((p person) &rest args)
    (setf (inbox person) (make-inbox)))

(defun make-inbox ()
   (let ((btree (make-indexed-btree)))
     (add-index ...)))

(defun get-messages-by-date (user start end)
    ...)

(add-message (inbox charlie) message)

---------------

The new association mechanism is basically a simple btree version of  
this, but the btree is implicit.

(defpclass user ()
   ((name :accessor name)
    (index :accessor index :associate (message user))))

(setf charlie (make-instance 'user))

(defpclass message ()
   ((user :accessor user :associate user)))

(make-instance 'message charlie)
(make-instance 'message charlie)

(inbox charlie) => returns two messages with charlie in the user slot

However associations, like psets, are not sorted (dup-btree  
oid:instance-ref).  The value of (user message) is a persistent object  
that is added to a dup-btree maintained by the metaclass protocol.  It  
maps the oid of a user to the messages that store it.

I couldn't find a clean way of making the indexed-slot mechanism work  
with associations, which was the reason for moving to dup-btrees.

My thought was that the inbox function would be implemented either by  
a scan of the subset of messages indicated by the inbox association,  
or via a cheap join that intersects two arrays of oids, one from the  
association, one from the appropriate index so the result is a sorted  
list of oids that can be instantiated as needed.  It would be  
reasonably cheap to do query caching of these sorted OIDs so that  
subsequent OFFSET & LIMIT style accesses over the same query set would  
be fast, just instantiating those messages that are needed.

I haven't worked through the interface ideas, but this is the general  
idea that was, in part, driving the association mechanism.

The derived index hack is still more efficient for large sets.   
Without changes to the data stores to create an efficient way of  
sorting concatenated values, I don't see a way to improve on it easily.


> I wonder if it would be a lot of effort to generalize
> GET-INSTANCES-BY... so it takes a class root parameter, and
> to manage a multitude of indexed class btrees.

A more general query language is probably the right solution for this  
interface.  The query language would know about associations, derived  
indices, etc and perform query planning via introspection over the  
class objects.

>  Leslie
>
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list