[elephant-devel] Newbie question: Index on collections

Ian Eslick eslick at media.mit.edu
Thu Mar 20 01:11:25 UTC 2008


Great!  Actually in writing the following e-mail, I realized that this  
had forced me to think through a bunch of things and so I wrote a  
version of the functionality in associations.lisp and just checked it  
in.

Start playing with, reading and commenting on what's wrong with it.   
For example, I'd like a better API for adding/removing items from  
slots than overloading (setf (accessor instance) related-instance).  I  
left a commented API proposal in the source code for you to look at.

There are still bugs that I haven't thought through that will crop up  
when you change classes, redefine classes, etc.  Let me know what you  
think.

========================

Download the elephant-unstable branch:

darcs get http://www.common-lisp.net/project/elephant/darcs/elephant-unstable

It's probably best to spend a little time reading and experimenting,  
then write up a proposal so we can discuss it a bit.  The support for  
defining association slots (but not implementing them) is already  
implemented in metaclasses.lisp.

The idea of an association is that instances of class A can find  
instances of class B that are 'associated' with it.  Association  
relations can be undirected or directed, typically called many-to-many  
or 1-to-many. 1-to-many is a kind of inverse indexing and many-to-many  
requires a separate btree/class/table to maintain the undirected  
relation.  Hopefully the following description will help clarify what  
I mean.


One-to-Many associations:

If class B has a slot that stores a single instance of A, then class A  
would like to have a method that returns the set of instances of B  
that refer to the instance of A.

If A is person and B is job, then job->owner is a reference to A and A- 
 >jobs is a method (no slot storage) that refers to an index on job- 
 >owner.  The index keys are A oids.

If you look in indexed-slots.lisp, you'll get a sense of how indices  
work.  set-valued-slots.lisp is also useful as it puts a btree in each  
instance's slot to store sets of objects using the pset abstraction in  
pset.lisp.  Reading these three files should help alot to see how we  
use btrees and dup-btrees to maintain sets of objects and relations.


One challenge is how to specify an association implicitly in slot  
definitions.  Ideally you want it to be on the object that you are  
most often going to use to add and remove relations:

(defpclass person ()
   ((jobs :accessor jobs :association (job owner))))

(defpclass job ()
   ((title :accessor title ...)
    (company :accessor company ...)))

But this means that person has to change the definition of job and  
that can be problematic.  So you could require agreement...

(defpclass person ()
   ((jobs :accessor jobs :association (job owner)))

(defpclass job ()
   ((title ...)
    (company ...)
    (owner :accessor owner :association person)))

This means that job has a slot owner who's values are instances of the  
class person.  This slot is also effectively indexed.   A person has a  
method jobs which looks up the index in (job owner) and returns the  
list of instances of job that have the instance of A in the owner slot.


Many-to-Many:

To add many-to-many relations, any A->B relation also requires that we  
can find the relation from the other direction B->A.  Typically this  
requires a special table to keep track of pairs of related objects.   
You can then query on either column to get the set of related objects.

many-to-many association: A<->B
(A1 B1)
(A1 B2)
(A2 B1)
(A2 B3)

(get-related assoc A1)->B1,B2
(get-related assoc B1)->A1,A2

We can simply stick with the index model and just ensure that there  
are indexes on both classes instead of one.  When we add an element to  
one side, we also add it to the other.  (This is what is implemented  
now)


Optimizations/Alternatives:

One optimization for one-to-many is to return a standard object  
implementing the pset protocol that just keeps a reference to the  
index and supports map operations over the appropriate subset of the  
objects in the index.

Regards,
Ian

On Mar 19, 2008, at 2:42 AM, Marc wrote:
> Hi Ian,
>>
>> My best guess is weeks, not months.  We're hoping for April  
>> sometime, but it depends on a number of factors.  Elephant is  
>> getting quite complex to test so the last few bugs can take some  
>> time.
>>
>> You can speed this process up if you want to help develop and test  
>> the association facility (most of the hooks are there so it  
>> shouldn't be too hard, even if you don't fully understand the  
>> elephant internals).  If not, we have plenty of new features that  
>> need tests written against them!
> Since we need this feature anyway for our application, I'm quite  
> prepared to join in the work on the association facility. Any  
> pointer on what would be the best entry point into that task?
>
> Best regards,
>
> Marc
>>
>> On Mar 18, 2008, at 12:50 AM, Marc wrote:
>>
>>> Ian Eslick wrote:
>>>> The pset is probably your best bet.  Other collection objects  
>>>> have more performance impact.  With a pset you can ask:
>>>>
>>>> is X a friend of Y?  (find-item X (friends Y)) => X | nil
>>>>
>>>> all people that X is friends of:
>>>>
>>>> (remove-nulls
>>>> (map-class 'people (lambda (y)
>>>>                       (when (find-item X (friends Y))
>>>>                          Y))
>>>>            :collect t))
>>>>
>>>> This requires walking all person objects.  For larger DB's, you  
>>>> can build your own 'many-to-many' relation table or waiting for  
>>>> the next release which should do this for you.
>>> Is there a time line (however preliminary) when this release might  
>>> be
>>> available?
>>>>
>>>> (defpclass friend-relation ()
>>>>  ((first :accessor first :initarg :first :index t)
>>>>   (second :accessor second :initarg :second :index t)))
>>>>
>>>> (defmethod add-friend-relation (x y)
>>>>  (make-instance 'friend-relation :first x :second y))
>>>>
>>>> (defmethod friends-of (x)
>>>>  (union (mapcar #'second (get-instances-by-slot 'friend-relation  
>>>> 'first x))
>>>>         (mapcar #'first (get-instances-by-slot 'friend-relation  
>>>> 'second x))))
>>>>
>>>> Of course there are ways to do this more efficiently, but I think  
>>>> this is the idea.
>>>>
>>> Many thanks for this guidance. That does, indeed, give me an idea  
>>> of the
>>> best way forward in our application.
>>>
>>>
>>> Best regards,
>>>
>>> Marc
>>>
>>>>
>>>>
>>>> On Mar 16, 2008, at 11:06 PM, Marc wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> I fear that this may be a rather stupid newbie question, but  
>>>>> neither from the documentation nor from the mailing list posts I  
>>>>> got an idea as to the best practice for the following type of  
>>>>> searches. Let's use an example akin to the one in the tutorial:
>>>>>
>>>>> (defpclass person ()
>>>>> ((name :accessor name :index t)
>>>>> (friends :accessor friends)) ;collection with names (say as  
>>>>> strings) of the friends of this person
>>>>> )
>>>>>
>>>>> Given a similar structure, what is the recommended way to handle  
>>>>> queries of the type "find all persons that X is a friend  
>>>>> of?" (i.e. queries on individual entries in slots that contain  
>>>>> collections)? What would be the best data structure for the  
>>>>> friends slot, assuming that the collection remains fairly stable  
>>>>> over time (psets, lists or other collections)?
>>>>>
>>>>> Thanks in advance for any hint!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Marc
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> elephant-devel site list
>>>>> elephant-devel at common-lisp.net
>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>
>>>> _______________________________________________
>>>> elephant-devel site list
>>>> elephant-devel at common-lisp.net
>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel at common-lisp.net
>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel at common-lisp.net
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>>
>>
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list