[elephant-devel] Add indexed persistent class slots to elephant?
Ian Eslick
eslick at csail.mit.edu
Tue Jan 24 04:09:47 UTC 2006
While diving into the elephant code to understand it better, I started
thinking about my normal usage model. One common pattern is to look up
objects by slot value or by a range of slot values. This seems like a
very common operation, and adding an initarg ':indexed' to the metaclass
would allow for some simple default functionality:
low-level interface:
- define cursors over persistent-class slots as well as btrees and
secondary indices
- make it easy to iterate over duplicate class+slot and
class+slot+value keys
- we get an index of every persistent-object of a given class if we
implement the right comparison operation.
mid-level interface:
- grab sets of objects based on slot-name and slot-value or range of
slot values
high-level interface:
- a simple constraint language with boolean combinators that selects
instances based on various combinations of slot ranges or values
- it becomes easier to compile constraints when the class directly
contains information that tells you what indexes exist, so you can
optimize the query ahead of time.
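
To make the three levels concrete, here is a rough, language-neutral
sketch in Python rather than Lisp. It is not elephant code: SlotIndex,
slot_between, slot_equals, all_of and any_of are hypothetical names, and
an in-memory sorted list stands in for a btree with duplicate keys. It
only illustrates value and range lookup on one slot, plus boolean
combinators that operate on sets of oids.

```python
import bisect


class SlotIndex:
    """Hypothetical stand-in for one indexed slot: sorted (value, oid) pairs."""

    def __init__(self):
        self._entries = []  # kept sorted; duplicate slot values are allowed

    def add(self, value, oid):
        bisect.insort(self._entries, (value, oid))

    def range(self, low, high):
        """Return the set of oids whose slot value v satisfies low <= v <= high."""
        # (low,) sorts before every (low, oid), so this finds the first match.
        start = bisect.bisect_left(self._entries, (low,))
        oids = set()
        for value, oid in self._entries[start:]:
            if value > high:
                break
            oids.add(oid)
        return oids


# Constraints are functions from {slot-name: SlotIndex} to a set of oids.
def slot_between(slot, low, high):
    return lambda indexes: indexes[slot].range(low, high)


def slot_equals(slot, value):
    return slot_between(slot, value, value)


def all_of(*constraints):  # boolean AND; needs at least one constraint
    return lambda indexes: set.intersection(*(c(indexes) for c in constraints))


def any_of(*constraints):  # boolean OR; needs at least one constraint
    return lambda indexes: set.union(*(c(indexes) for c in constraints))


# Made-up sample data: oid -> (name, age)
indexes = {"name": SlotIndex(), "age": SlotIndex()}
people = {1: ("alice", 31), 2: ("bob", 45), 3: ("carol", 38), 4: ("bob", 29)}
for oid, (name, age) in people.items():
    indexes["name"].add(name, oid)
    indexes["age"].add(age, oid)

query = all_of(slot_equals("name", "bob"), slot_between("age", 25, 40))
result = query(indexes)  # oids only; no objects are deserialized
```

Because every constraint names the slot it reads, a compiler for such a
language could check up front which of those slots actually have indexes.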
Supporting this requires adding an additional around method on (setf
slot-value-using-class) for persistent slots that specializes on indexed
slots and updates the slot index, and then potentially adding an
additional layer of cursor operators. This is optional functionality
that will only slow down write operations, not reads, and it will be
backwards compatible. It should be easy to add SQL support.
The benefit will be to add some nice default behavior that makes the
database aspect of the low-level interfaces much more directly
accessible to new users.
On my local copy I've implemented the metaclass support, overloading and
a good chunk of the constraint language and still pass all of the
current tests. I think I understand the problem well enough now to
query the user community for advice and buy-in. I have yet to support
all the unpleasant details related to changing classes, but the
implications of dropping or adding an indexed slot are rather
straightforward, so I think that finishing the implementation and writing
the appropriate tests isn't too much work.
The first question is whether the primary developers and users are open
to the addition of this feature.
If so, the big design question I'm facing at present is:
1) Reuse the current btree infrastructure to create a btree for each
class that maps oids to persistent-objects and instantiate a secondary
index for each indexed slot using the slot accessor functions. This is
the easiest to implement, but might provide somewhat poor performance
on creates & writes.
2) Create another underlying DB with string keys
"class-name+slot-name+value" => "oid"?
2a) - Is it better to point to oids or directly to serialized
persistent-objects? The nice thing about oids is that later I can
implement join-like operations in the query language using oids without
having to deserialize and cache persistent objects. Persistent-objects
are perhaps more convenient for direct use, however.
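
To illustrate option 2 with oids as values, here is a small Python sketch
under stated assumptions. It is not elephant code: OrderedStore and
make_key are hypothetical, a sorted list stands in for the underlying DB,
the separator is assumed never to occur in class or slot names, and
values are assumed to be small non-negative integers so that zero-padding
makes string order match numeric order.

```python
import bisect

SEP = "\x00"  # assumed separator; must not appear in class or slot names


def make_key(class_name, slot_name, value):
    # Zero-pad so string order equals numeric order
    # (assumes non-negative integers below 10**10).
    return SEP.join((class_name, slot_name, f"{value:010d}"))


class OrderedStore:
    """Hypothetical ordered key/value store with duplicate keys."""

    def __init__(self):
        self._rows = []  # sorted (key, oid) pairs

    def put(self, key, oid):
        bisect.insort(self._rows, (key, oid))

    def scan(self, low_key, high_key):
        """Yield oids for all keys k with low_key <= k <= high_key, in key order."""
        start = bisect.bisect_left(self._rows, (low_key,))
        for key, oid in self._rows[start:]:
            if key > high_key:
                break
            yield oid


store = OrderedStore()
# Made-up rows: (class, slot, value, oid)
for cls, slot, value, oid in [
    ("Person", "age", 31, 1), ("Person", "age", 45, 2), ("Person", "age", 38, 3),
    ("Person", "height", 165, 1), ("Person", "height", 180, 2),
    ("Person", "height", 175, 3), ("Employee", "age", 38, 9),
]:
    store.put(make_key(cls, slot, value), oid)

by_age = list(store.scan(make_key("Person", "age", 30),
                         make_key("Person", "age", 40)))
by_height = set(store.scan(make_key("Person", "height", 170),
                           make_key("Person", "height", 180)))
joined = set(by_age) & by_height  # join-like step on oids alone
```

The intersection at the end is the kind of join-like operation that
works on oids without deserializing any persistent object. Real slot
values would need an encoding that sorts correctly for every supported
type, which this sketch does not attempt.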
Comments would be greatly appreciated. I especially invite debate if
others feel this is the wrong level of abstraction to work at (i.e.
instead write a new def macro for indexed classes and a related protocol
that accomplishes the same result by reusing primary and secondary
btrees). The proposal above seems in good taste to me and I've already
invested some quality time in it, but since I'll be touching a fair bit
of the system to put this in I want to make sure there is support.
Ian