[elephant-devel] Re: Derived Indices

Thu May 8 15:32:35 UTC 2008

> (defpclass person ()
>    ((name ...)
>     (inbox :accessor inbox :initform (make-indexed-btree message)
> 	:index-on (date sender))))
>
> Which would create an indexed-btree that stored messages and
> auto-created indices on the date and sender slots?
>
> That is a bit more OODB/Lispy than the association mechanism I've
> implemented.  However I don't think you need to overload defpclass to
> get this functionality.  In fact I think it starts to get too ugly.

I was thinking of this, but the other way round; MESSAGE should contain
the usual index declarations:

(defpclass message ()
  ((date :index t) ; create an index on DATE
   (sender :index t))) ; create an index on SENDER

These indices would be created in the correct "indexing namespace"
when the object is put into a container. This is easy to maintain
with, say, an INDEXED-SET class. Whatever.
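
To make the idea concrete, here is a rough in-memory sketch of what I
mean. It uses plain CLOS and hash tables instead of Elephant, so
nothing here is persistent, and INDEXED-SET, INDEXED-SLOTS, INSERT-ITEM
and LOOKUP are hypothetical names, not existing Elephant API. The
INDEXED-SLOTS method stands in for the :index t slot options above.

```lisp
;; In-memory stand-in only: plain CLOS and hash tables, no Elephant calls.
;; INDEXED-SET, INDEXED-SLOTS, INSERT-ITEM and LOOKUP are hypothetical names.

(defclass message ()
  ((date   :initarg :date   :reader date)
   (sender :initarg :sender :reader sender)))

;; Stand-in for the (:index t) slot options: the slots MESSAGE wants indexed.
(defgeneric indexed-slots (object)
  (:method ((object message)) '(date sender)))

(defclass indexed-set ()
  ((items   :initform '() :accessor items)
   ;; slot name -> hash table mapping slot value -> list of items
   (indices :initform (make-hash-table :test #'eq) :reader indices)))

(defun insert-item (item set)
  "Add ITEM to SET and update the indices that belong to SET alone."
  (push item (items set))
  (dolist (slot (indexed-slots item) item)
    (let ((index (or (gethash slot (indices set))
                     (setf (gethash slot (indices set))
                           (make-hash-table :test #'equal)))))
      (push item (gethash (slot-value item slot) index)))))

(defun lookup (set slot value)
  "Return the items in SET whose SLOT equals VALUE, using SET's own index."
  (let ((index (gethash slot (indices set))))
    (and index (values (gethash value index)))))
```

The point is that the class only says which slots are indexable; each
container builds and owns its indices, so two inboxes never share an
indexing namespace.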

This should probably also imply that objects of the MESSAGE class
are not automatically registered in the store controller's class
root (as it is now); this should only happen on

(defpclass message ()
  (...)
  (:index t))

> inbox should be an object (class instance or btree) that has its own
> api.

In fact that's my current solution but I don't like it much. It's not
good for quick development. You need to think too much about the
storage part.

> However associations, like psets, are not sorted (dup-btree
> oid:instance-ref).  The value of (user message) is a persistent object
> that is added to a dup-btree maintained by the metaclass protocol.  It
> maps the oid of a user to the messages that store it.

All too complicated. IMHO a great feature of Elephant is that it lets
you work with your objects without worrying much about the storage
backend. As of now, this feature still needs to get better.
Sure, it works (at least for me), but it's not as nice as it could
be. Elephant should take care of all the low-level sorting stuff
(probably creating indices wherever needed or even sorting without
indices for prototyping).

> It would be reasonably cheap to do query caching of these sorted
> OIDs so that subsequent OFFSET & LIMIT style accesses over the same
> query set would be fast, just instantiating those messages that are needed.

While I'm at it: OFFSET and LIMIT (a real limit which lets you specify
an arbitrary Lisp expression) are things we definitely want to aim
for in 1.0. They are not difficult to implement at all, but they don't
work with GET-INSTANCES-BY-* and, worse, MAP-BTREE. This means
everyone has to write their own versions of these functions that
take appropriate arguments and move the cursor around themselves
instead of relying on a simple high-level API.
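
Roughly the kind of interface I have in mind, sketched in plain Common
Lisp over an in-memory list of (key . value) pairs that is already
sorted by key. A real version would step a btree cursor instead of
walking a list. MAP-PAIRS and its :OFFSET and :LIMIT arguments are
hypothetical names, not part of Elephant.

```lisp
;; Plain Common Lisp, no Elephant calls. PAIRS is a list of (key . value)
;; conses assumed to be sorted by key already.
;; MAP-PAIRS and its keyword arguments are hypothetical names.

(defun map-pairs (fn pairs &key (offset 0) limit)
  "Call FN on the key and value of each pair in PAIRS, skipping the first
OFFSET pairs. LIMIT is NIL (no limit), an integer (maximum number of calls),
or a function of (count key value) that returns true when mapping must stop.
Returns the number of pairs FN was called on."
  (let ((count 0))
    (dolist (pair (nthcdr offset pairs) count)
      (destructuring-bind (key . value) pair
        ;; Check the limit before calling FN on this pair.
        (when (etypecase limit
                (null nil)
                (integer (>= count limit))
                (function (funcall limit count key value)))
          (return count))
        (funcall fn key value)
        (incf count)))))
```

With an integer, :LIMIT behaves like the SQL clause; with a function it
is the "real limit", e.g. (lambda (count key value) (> key 2)) stops at
the first key greater than 2.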

I'd have implemented these extensions myself, but I thought it better
to wait for the integration of the query language to add it.

> The derived index hack is still more efficient for large sets.
> Without changes to the data stores to create an efficient way of
> sorting concatenated values, I don't see a way to improve on it easily.

I'm not sure you actually need concatenated index values at all
if you manage your objects correctly, i.e. by putting them in appropriate
containers (the natural OODB way) as opposed to throwing them all
together in some indexing namespace and then tediously (for programmer
and machine) selecting the stuff you need.
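
A toy comparison of the two styles, again in plain Common Lisp with no
Elephant calls. Messages are plists here, and INBOX-VIA-GLOBAL-NAMESPACE
and ADD-TO-INBOX are hypothetical names I made up for illustration. It
says nothing about performance on large sets; it only shows where the
compound (owner, date) selection disappears.

```lisp
;; In-memory illustration only; all names are hypothetical.
;; A message is a plist such as (:owner "a" :date 3 :text "x").

;; Style 1: everything lives in one namespace. Getting one owner's
;; messages in date order means selecting on owner AND ordering on date,
;; which is what a concatenated (owner, date) index value would serve.
(defun inbox-via-global-namespace (all-messages owner)
  (sort (copy-list
         (remove-if-not (lambda (m) (equal (getf m :owner) owner))
                        all-messages))
        #'< :key (lambda (m) (getf m :date))))

;; Style 2: one container per owner, ordered by date alone.
;; INBOXES is a hash table (test EQUAL) mapping owner -> sorted list.
(defun add-to-inbox (inboxes message)
  "Insert MESSAGE into its owner's container, kept sorted by date."
  (let ((owner (getf message :owner)))
    (setf (gethash owner inboxes)
          (merge 'list (list message) (gethash owner inboxes)
                 #'< :key (lambda (m) (getf m :date))))
    message))
```

In the second style, reading an inbox is just (gethash owner inboxes):
the container already answers the "which owner" part, so the ordering
key is the date and nothing else.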

  Leslie