[elephant-devel] Newbie question: Index on collections
Marc
klists at saphor.de
Sat May 3 07:56:15 UTC 2008
Marc wrote:
> Ian Eslick wrote:
>> Great! Actually in writing the following e-mail, I realized that
>> this had forced me to think through a bunch of things and so I wrote
>> a version of the functionality in associations.lisp and just checked
>> it in.
>>
>> Start playing with, reading and commenting on what's wrong with it.
>> For example, I'd like a better API for adding/removing items from
>> slots than overloading (setf (accessor instance) related-instance).
>> I left a commented API proposal in the source code for you to look at.
>>
>> There are still bugs that I haven't thought through that will crop up
>> when you change classes, redefine classes, etc. Let me know what you
>> think.
> Thanks for the file and the documentation in this mail. Starting from
> this I'll now begin to dig into it and try to grope my way around.
>
> As to timing: this week is somewhat busy due to a conference, but I
> still hope for some time in between sessions.
I fear that the delay has been much longer than anticipated --- it has
been very busy ever after the conference ---, but no w I hope finally to
find some time --- provided that the issue is not long obsolete.
Best regards,
Marc
P. S.: I've noticed that there have been few changes in the unstable
branch lately, but many more in the stable one. Which one is the one to
work on for this right now?
>
>>
>> ========================
>>
>> Download the elephant-unstable branch:
>>
>> darcs get
>> http://www.common-lisp.net/project/elephant/darcs/elephant-unstable
>>
>> It's probably best to spend a little time reading and experimenting,
>> then write up a proposal so we can discuss it a bit. The support for
>> defining association slots (but not implementing them) is already
>> implemented in metaclasses.lisp.
>>
>> The idea of an association is that instances of class A can find
>> instances of class B that are 'associated' with it. Association
>> relations can be undirected or directed, typically called
>> many-to-many or 1-to-many. 1-to-many is a kind of inverse indexing
>> and many-to-many requires a separate btree/class/table to maintain
>> the undirected relation. Hopefully the following description will
>> help clarify what I mean.
>>
>>
>> One-to-Many associations:
>>
>> If class B has a slot that stores a single instance of A, then class
>> A would like to have a method that returns the set of instances of B
>> that refer to the instance of A.
>>
>> If A is person and B is job, then job->owner is a reference to A and
>> A->jobs is a method (no slot storage) that refers to an index on
>> job->owner. The index keys are A oids.
>>
>> If you look in indexed-slots.lisp, you'll get a sense of how indices
>> work. set-valued-slots.lisp is also useful as it puts a btree in
>> each instance's slot to store sets of objects using the pset
>> abstraction in pset.lisp. Reading these three files should help alot
>> to see how we use btrees and dup-btrees to maintain sets of objects
>> and relations.
>>
>>
>> One challenge is how to specify an association implicitly in slot
>> definitions. Ideally you want it to be on the object that you are
>> most often going to use to add and remove relations:
>>
>> (defpclass person ()
>> ((jobs :accessor jobs :association (job owner))))
>>
>> (defpclass job ()
>> ((title :accessor title ...)
>> (company :accessor company ...)))
>>
>> But this means that person has to change the definition of job and
>> that can be problematic. So you could require agreement...
>>
>> (defpclass person ()
>> ((jobs :accessor jobs :association (job owner)))
>>
>> (defpclass job ()
>> ((title ...)
>> (company ...)
>> (owner :accessor owner :association person)))
>>
>> This means that job has a slot owner who's values are instances of
>> the class person. This slot is also effectively indexed. A person
>> has a method jobs which looks up the index in (job owner) and returns
>> the list of instances of job that have the instance of A in the owner
>> slot.
>>
>>
>> Many-to-Many:
>>
>> To add many-to-many relations, any A->B relation also requires that
>> we can find the relation from the other direction B->A. Typically
>> this requires a special table to keep track of pairs of related
>> objects. You can then query on either column to get the set of
>> related objects.
>>
>> many-to-many association: A<->B
>> (A1 B1)
>> (A1 B2)
>> (A2 B1)
>> (A2 B3)
>>
>> (get-related assoc A1)->B1,B2
>> (get-related assoc B1)->A1,A2
>>
>> We can simply stick with the index model and just ensure that there
>> are indexes on both classes instead of one. When we add an element
>> to one side, we also add it to the other. (This is what is
>> implemented now)
>>
>>
>> Optimizations/Alternatives:
>>
>> One optimization for one-to-many is to return a standard object
>> implementing the pset protocol that just keeps a reference to the
>> index and supports map operations over the appropriate subset of the
>> objects in the index.
>>
>> Regards,
>> Ian
>>
>> On Mar 19, 2008, at 2:42 AM, Marc wrote:
>>> Hi Ian,
>>>>
>>>> My best guess is weeks, not months. We're hoping for April
>>>> sometime, but it depends on a number of factors. Elephant is
>>>> getting quite complex to test so the last few bugs can take some time.
>>>>
>>>> You can speed this process up if you want to help develop and test
>>>> the association facility (most of the hooks are there so it
>>>> shouldn't be too hard, even if you don't fully understand the
>>>> elephant internals). If not, we have plenty of new features that
>>>> need tests written against them!
>>> Since we need this feature anyway for our application, I'm quite
>>> prepared to join in the work on the association facility. Any
>>> pointer on what would be the best entry point into that task?
>>>
>>> Best regards,
>>>
>>> Marc
>>>>
>>>> On Mar 18, 2008, at 12:50 AM, Marc wrote:
>>>>
>>>>> Ian Eslick wrote:
>>>>>> The pset is probably your best bet. Other collection objects
>>>>>> have more performance impact. With a pset you can ask:
>>>>>>
>>>>>> is X a friend of Y? (find-item X (friends Y)) => X | nil
>>>>>>
>>>>>> all people that X is friends of:
>>>>>>
>>>>>> (remove-nulls
>>>>>> (map-class 'people (lambda (y)
>>>>>> (when (find-item X (friends Y))
>>>>>> Y))
>>>>>> :collect t))
>>>>>>
>>>>>> This requires walking all person objects. For larger DB's, you
>>>>>> can build your own 'many-to-many' relation table or waiting for
>>>>>> the next release which should do this for you.
>>>>> Is there a time line (however preliminary) when this release might be
>>>>> available?
>>>>>>
>>>>>> (defpclass friend-relation ()
>>>>>> ((first :accessor first :initarg :first :index t)
>>>>>> (second :accessor second :initarg :second :index t)))
>>>>>>
>>>>>> (defmethod add-friend-relation (x y)
>>>>>> (make-instance 'friend-relation :first x :second y))
>>>>>>
>>>>>> (defmethod friends-of (x)
>>>>>> (union (mapcar #'second (get-instances-by-slot 'friend-relation
>>>>>> 'first x))
>>>>>> (mapcar #'first (get-instances-by-slot 'friend-relation
>>>>>> 'second x))))
>>>>>>
>>>>>> Of course there are ways to do this more efficiently, but I think
>>>>>> this is the idea.
>>>>>>
>>>>> Many thanks for this guidance. That does, indeed, give me an idea
>>>>> of the
>>>>> best way forward in our application.
>>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Marc
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 16, 2008, at 11:06 PM, Marc wrote:
>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> I fear that this may be a rather stupid newbie question, but
>>>>>>> neither from the documentation nor from the mailing list posts I
>>>>>>> got an idea as to the best practice for the following type of
>>>>>>> searches. Let's use an example akin to the one in the tutorial:
>>>>>>>
>>>>>>> (defpclass person ()
>>>>>>> ((name :accessor name :index t)
>>>>>>> (friends :accessor friends)) ;collection with names (say as
>>>>>>> strings) of the friends of this person
>>>>>>> )
>>>>>>>
>>>>>>> Given a similar structure, what is the recommended way to handle
>>>>>>> queries of the type "find all persons that X is a friend of?"
>>>>>>> (i.e. queries on individual entries in slots that contain
>>>>>>> collections)? What would be the best data structure for the
>>>>>>> friends slot, assuming that the collection remains fairly stable
>>>>>>> over time (psets, lists or other collections)?
>>>>>>>
>>>>>>> Thanks in advance for any hint!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Marc
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> elephant-devel site list
>>>>>>> elephant-devel at common-lisp.net
>>>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>>>
>>>>>> _______________________________________________
>>>>>> elephant-devel site list
>>>>>> elephant-devel at common-lisp.net
>>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> elephant-devel site list
>>>>> elephant-devel at common-lisp.net
>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>
>>>> _______________________________________________
>>>> elephant-devel site list
>>>> elephant-devel at common-lisp.net
>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>>
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel at common-lisp.net
>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel at common-lisp.net
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>>
>>
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel
>
>
>
More information about the elephant-devel
mailing list