<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.3.2">
</HEAD>
<BODY>
Dear Ian,<BR>
Wow. It sounds like you are indeed deeply into it. I am the maintainer, and not a primary<BR>
author; in some ways I do not even consider myself qualified to answer. I will happily accept<BR>
your code, as long as the tests are green and you are expanding the possibilities for<BR>
the users, rather than constraining them. And documentation would be nice.<BR>
As to your technical questions about the approaches to implementing the indexing: in my<BR>
experience indexing is one of the most fluid aspects of an object space; it is a tunable issue<BR>
that tends to change as usage patterns change, even more than the underlying data structure does.<BR>
So from my point of view I would much prefer macros or simple functions that let you easily<BR>
add, and discard, indexes. Of course, your observation that one does typically index a slot<BR>
is quite correct, and it would be very convenient if such an index could be added and dropped<BR>
very easily; if one could also implement boolean operators based on those indexes, so much<BR>
the better. As you state, one should be able to somehow introspect the class and discover<BR>
the existing indexing structure; I'm not sure of the best way to do that.<BR>
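<BR>
As a rough illustration of that wish list (indexes that are cheap to add and drop, boolean<BR>
combination of indexed lookups, and a way to ask which indexes exist), here is a minimal<BR>
in-memory sketch. It is written in Python only for brevity and is not Elephant's API; the names<BR>
SlotIndexes, add_index, drop_index, indexed_slots and lookup_range are all hypothetical.<BR>
<PRE>
```python
import bisect


class SlotIndexes:
    """Toy in-memory stand-in for an object store with droppable slot indexes."""

    def __init__(self):
        self.objects = {}   # oid: object
        self.indexes = {}   # slot name: sorted list of (value, oid) pairs

    def add_object(self, oid, obj):
        self.objects[oid] = obj
        # Keep every existing index up to date for the new object.
        for slot, entries in self.indexes.items():
            bisect.insort(entries, (getattr(obj, slot), oid))

    def add_index(self, slot):
        # Build the index from the objects already stored.
        self.indexes[slot] = sorted(
            (getattr(obj, slot), oid) for oid, obj in self.objects.items()
        )

    def drop_index(self, slot):
        # Dropping an index never touches the stored objects.
        self.indexes.pop(slot, None)

    def indexed_slots(self):
        # Introspection: which slots currently have an index?
        return sorted(self.indexes)

    def lookup_range(self, slot, low, high):
        # Oids whose slot value lies in the closed interval [low, high].
        entries = self.indexes[slot]
        values = [value for value, _ in entries]
        start = bisect.bisect_left(values, low)
        stop = bisect.bisect_right(values, high)
        return {oid for _, oid in entries[start:stop]}


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


store = SlotIndexes()
store.add_object(1, Person("ann", 30))
store.add_object(2, Person("bob", 45))
store.add_object(3, Person("cat", 52))
store.add_index("age")
store.add_index("name")

# Boolean AND of two indexed lookups is a set intersection;
# OR would be a union of the same two sets.
both = store.lookup_range("age", 40, 60).intersection(
    store.lookup_range("name", "a", "bz")
)
```
</PRE>
Because each lookup returns a plain set of oids, the boolean operators fall out of ordinary set<BR>
operations, and dropping an index is a single call that leaves the objects untouched.<BR>
<BR>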
Although I am a "Smug Lisp Weanie" and fairly well educated, I have to admit that all<BR>
of this direct hacking of CLOS and MOP makes my head spin a little. That is the stuff that<BR>
creates the most problems in trying to work with multiple platforms. We seem now to<BR>
work with ACL and SBCL, and Andrew Blumberg is helping me solve an OpenMCL problem<BR>
right now. As a maintainer, I'm a little loath to recommend even more complexity in that part.<BR>
However, I always believe interfaces are more important than implementations. The real<BR>
question is: are you conveniently (elegantly?) expanding the API to make Elephant more useful?<BR>
Whether this is done with a simple function or metaclass keywords might actually be a side <BR>
issue. The code that will utilize the slot-based indexes is more important than the code that<BR>
will create them.<BR>
If I have a right to insist on anything, I will insist that the indexes be easily droppable.<BR>
<BR>
On Mon, 2006-01-23 at 23:09 -0500, Ian Eslick wrote:
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">While diving into the elephant code to understand it better I started to </FONT>
<FONT COLOR="#000000">think about my normal usage model and that one common model is to lookup </FONT>
<FONT COLOR="#000000">objects by slot value or a range of slot values. This seems like a very </FONT>
<FONT COLOR="#000000">common operation and that adding an initarg ':indexed' to the metaclass </FONT>
<FONT COLOR="#000000">would allow for some simple default functionality:</FONT>
<FONT COLOR="#000000">low-level interface:</FONT>
<FONT COLOR="#000000">- define cursors over persistent-class slots as well as btrees and </FONT>
<FONT COLOR="#000000">secondary indices</FONT>
<FONT COLOR="#000000">- make it easy to iterate over duplicate class+slot and class+slot+value </FONT>
<FONT COLOR="#000000">keys</FONT>
<FONT COLOR="#000000">- we get an index of every persistent-object of a given class if we </FONT>
<FONT COLOR="#000000">implement</FONT>
<FONT COLOR="#000000"> the right comparison operation.</FONT>
<FONT COLOR="#000000">mid-level interface:</FONT>
<FONT COLOR="#000000">- grab sets of objects based on slot-name and slot-value or range of </FONT>
<FONT COLOR="#000000">slot values</FONT>
<FONT COLOR="#000000">high-level interface:</FONT>
<FONT COLOR="#000000">- a simple constraint language with boolean combinators that selects </FONT>
<FONT COLOR="#000000">instances</FONT>
<FONT COLOR="#000000"> based on various combinations of slot ranges or values</FONT>
<FONT COLOR="#000000">- it becomes easier to compile constraints when the class contains </FONT>
<FONT COLOR="#000000">information</FONT>
<FONT COLOR="#000000"> directly that tells you what indexes exist so you can optimize the </FONT>
<FONT COLOR="#000000">query ahead</FONT>
<FONT COLOR="#000000"> of time.</FONT>
<FONT COLOR="#000000">Supporting this requires adding an additional around method to (setf </FONT>
<FONT COLOR="#000000">slot-value-using-class) on</FONT>
<FONT COLOR="#000000">persistent-slots to specialize on indexed slots and update the slot </FONT>
<FONT COLOR="#000000">index and then potentially</FONT>
<FONT COLOR="#000000">adding an additional layer of cursor operators. This is optional </FONT>
<FONT COLOR="#000000">functionality that will only slow down write, not read, operations and </FONT>
<FONT COLOR="#000000">will be backwards compatible. It should be easy to add SQL support. </FONT>
<FONT COLOR="#000000">The benefit will be to add some nice default behavior that makes the </FONT>
<FONT COLOR="#000000">database aspect of the low-level interfaces much more directly </FONT>
<FONT COLOR="#000000">accessible to new users. </FONT>
<FONT COLOR="#000000">On my local copy I've implemented the metaclass support, overloading and </FONT>
<FONT COLOR="#000000">a good chunk of the constraint language and still pass all of the </FONT>
<FONT COLOR="#000000">current tests. I think I understand the problem well enough now to </FONT>
<FONT COLOR="#000000">query the user community for advice and buy-in. I have yet to support </FONT>
<FONT COLOR="#000000">all the unpleasant details related to changing classes, but the </FONT>
<FONT COLOR="#000000">implications of dropping or adding an indexed slot are rather </FONT>
<FONT COLOR="#000000">straightforward so I think that finishing the implementation and writing </FONT>
<FONT COLOR="#000000">the appropriate tests isn't too much work.</FONT>
<FONT COLOR="#000000">The first question is whether the primary developers and users are open </FONT>
<FONT COLOR="#000000">to the addition of this feature.</FONT>
<FONT COLOR="#000000">If so, the big design question I'm facing at present is:</FONT>
<FONT COLOR="#000000">1) Reuse the current btree infrastructure to create a btree for each </FONT>
<FONT COLOR="#000000">class that maps oids to persistent-objects and instantiate a secondary </FONT>
<FONT COLOR="#000000">index for each indexed slot using the slot accessor functions. This is </FONT>
<FONT COLOR="#000000">the easiest to implement, but might provide somewhat poor performance </FONT>
<FONT COLOR="#000000">on creates and writes.</FONT>
<FONT COLOR="#000000">2) Create another underlying DB with string keys </FONT>
<FONT COLOR="#000000">"class-name+slot-name+value" => "oid"? </FONT>
<FONT COLOR="#000000">2a) - Is it better to point to oid's or directly to serialized </FONT>
<FONT COLOR="#000000">persistent-objects? The nice thing about oid's is that later I can </FONT>
<FONT COLOR="#000000">implement join-like operations in the query language using oids without </FONT>
<FONT COLOR="#000000">having to deserialize and cache persistent objects. Persistent-objects </FONT>
<FONT COLOR="#000000">are perhaps more convenient for direct use, however.</FONT>
<FONT COLOR="#000000">Comments would be greatly appreciated. I especially invite debate if </FONT>
<FONT COLOR="#000000">others feel this is the wrong level of abstraction to work at (i.e. </FONT>
<FONT COLOR="#000000">instead write a new def macro for indexed classes and a related protocol </FONT>
<FONT COLOR="#000000">that accomplishes the same result by reusing primary and secondary </FONT>
<FONT COLOR="#000000">btrees). The proposal above seems in good taste to me and I've already </FONT>
<FONT COLOR="#000000">invested some quality time in it, but since I'll be touching a fair bit </FONT>
<FONT COLOR="#000000">of the system to put this in I want to make sure there is support.</FONT>
<FONT COLOR="#000000">Ian</FONT>
</PRE>
</BLOCKQUOTE>
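To make the write-path hook in the quoted proposal concrete, here is a small sketch of the idea:<BR>
a write to an indexed slot also updates an index keyed on class name, slot name and value, while<BR>
reads are left alone. It is Python rather than CLOS, it keeps everything in memory, and the names<BR>
INDEX, Indexed, indexed_slots and Account are hypothetical; it assumes slot values are hashable.<BR>
<PRE>
```python
INDEX = {}  # (class name, slot name, value): set of oids


class Indexed:
    """Hypothetical base class: writes to indexed slots also update INDEX."""

    indexed_slots = ()  # subclasses list the slot names to index

    def __init__(self, oid):
        # Bypass the write hook while setting the identifier itself.
        object.__setattr__(self, "oid", oid)

    def __setattr__(self, slot, value):
        if slot in self.indexed_slots:
            cls = type(self).__name__
            missing = object()
            old = self.__dict__.get(slot, missing)
            if old is not missing:
                # Remove the stale entry before recording the new value.
                INDEX.get((cls, slot, old), set()).discard(self.oid)
            INDEX.setdefault((cls, slot, value), set()).add(self.oid)
        # Plain attribute write; reads never go through this hook.
        object.__setattr__(self, slot, value)


class Account(Indexed):
    indexed_slots = ("owner",)


a = Account(1)
a.owner = "ian"
b = Account(2)
b.owner = "ian"
a.owner = "robert"   # moves oid 1 from the "ian" entry to the "robert" entry
```
</PRE>
The index maps to oids rather than to the objects themselves, which matches the point made in 2a:<BR>
sets of oids can be combined without loading the objects they refer to.<BR>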
<TABLE CELLSPACING="0" CELLPADDING="0" WIDTH="100%">
<TR>
<TD>
----<BR>
Robert L. Read, PhD read &T robertlread.net<BR>
Consider visiting Progressive Engineering: http://robertlread.net/pe<BR>
In Austin: 912-8593 "Think globally, Act locally." -- RBF<BR>
<BR>
<BR>
</TD>
</TR>
</TABLE>
</BODY>
</HTML>