[elephant-devel] Representational Question
Ian Eslick
eslick at media.mit.edu
Sat Mar 1 00:03:10 UTC 2008
On Feb 29, 2008, at 6:50 PM, lists at infoway.net wrote:
> Hi Ian,
>
> Thanks for the prompt response. I know the querying facility is not
> necessarily a priority at this time, but will someday become a
> reality :)
>
> To tell you the truth, we haven't really had any direct experience
> with Elephant in production or larger-scale type projects. However,
> we do feel that the whole concept of object prevalence given the
> complexity of the overall data model would make Elephant a more
> appropriate framework than continuing the relational path (maybe
> we're just wrong and Elephant is not best suited for this at all).
> As it is, we currently need to do a lot of work to maintain all the
> data relations and integrity in the current system and hopefully
> working only with the object models would make things easier and
> more "maintainable". Granted, I agree that at this moment, it's a
> lot easier to formulate those queries in SQL, but I'd like to at
> least be able to set up a parallel model and migrate data over so we
> could compare performance (we're not even going to talk about the
> complexity/difficulty of querying in Elephant, since we know that at
> this stage, it is much more complex than SQL queries).
>
> I hope I'm not wrong, but definitely your opinion is worth more
> since you (et al) know a lot more about this than us.
Well, the first thing that occurs to me is search. I don't know what
the performance would be, but you might just try a search-style
algorithm.
Use map-index, and for each element just chase the dependency chain:
(accessor (accessor (accessor object))). If you have a multi-valued
slot, you'll have to expand the search for all children. There may be
issues with this, not least of which is performance, but it's a
good place to start. It would be interesting to compare the
performance, look at different approaches and compare that to a pure
SQL solution.
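
A rough sketch of what such a chain-chasing search might look like. It uses plain CLOS classes and an in-memory list as stand-ins for persistent classes and a class index, so the outer loop here is an ordinary list filter rather than an Elephant mapping call. All class, slot, and function names are hypothetical, and only part of the example query is covered.

```lisp
;; Plain CLOS stand-ins for persistent classes; every name is illustrative.
(defclass specialty ()
  ((name :initarg :name :reader specialty-name)))

(defclass doctor ()
  ((gender :initarg :gender :reader doctor-gender)
   (specialties :initarg :specialties :initform nil
                :reader doctor-specialties)))

(defclass medical-office ()
  ((name :initarg :name :reader office-name)
   (doctors :initarg :doctors :initform nil :reader office-doctors)))

(defclass person ()
  ((name :initarg :name :reader person-name)
   (gender :initarg :gender :reader person-gender)
   (offices :initarg :offices :initform nil :reader person-offices)))

(defun male-cardiologist-p (doctor)
  "True if DOCTOR is male and lists cardiology among the specialties."
  (and (eq (doctor-gender doctor) :male)
       (some (lambda (s) (string= (specialty-name s) "cardiology"))
             (doctor-specialties doctor))))

(defun office-matches-p (office)
  "True if the office name contains HEAL and a male cardiologist works there."
  (and (search "HEAL" (office-name office))
       (some #'male-cardiologist-p (office-doctors office))))

(defun person-matches-p (person)
  "Chase person -> offices -> doctors -> specialties, expanding each
multi-valued slot with SOME."
  (and (eq (person-gender person) :male)
       (some #'office-matches-p (person-offices person))))

(defun find-matching-people (people)
  "Return the elements of PEOPLE that satisfy the whole chain of tests."
  (remove-if-not #'person-matches-p people))
```

Each multi-valued slot becomes one nested SOME, so the cost grows with the fan-out at every level. That is the performance concern mentioned above.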
> As for the second question, the answer is no. The objects would not
> be stored in bulk. The idea is to keep an audit log of user-
> initiated changes on individual entities (e.g. changing a Person's
> address, or correcting a name, or assigning a health insurance plan,
> etc).
Hmmm...One way to do this is to use :after methods on the slot
accessors you want to log. Those :after methods will write a log
entry into the database and the log and the change will all be
committed at the same time. You can arrange this so that all writes
within a transaction get written at once.
You could also use :after methods on initialize-instance to catch
object creation, etc.
I haven't thought this through fully, but that's one way to do it.
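
A minimal sketch of the :after idea, using a plain CLOS class and an in-memory list in place of a persistent class and a persistent log. The names logged-person, *audit-log* and log-change are made up for illustration. With persistent classes, the log write and the slot write would need to happen inside the same transaction to be committed together.

```lisp
;; In-memory stand-in for a persistent log collection (hypothetical name).
(defvar *audit-log* '())

(defclass logged-person ()
  ((name :initarg :name :accessor logged-person-name)
   (address :initarg :address :accessor logged-person-address)))

(defun log-change (object slot new-value)
  "Record one audit entry; newest entries end up first in *AUDIT-LOG*."
  (push (list :object object :slot slot :value new-value
              :time (get-universal-time))
        *audit-log*))

;; Runs after every (setf (logged-person-address p) ...) call.
(defmethod (setf logged-person-address) :after (new-value (p logged-person))
  (log-change p 'address new-value))

;; Catches object creation. Initargs fill slots directly, so the
;; accessor method above does not fire while the object is being made.
(defmethod initialize-instance :after ((p logged-person) &key)
  (log-change p :created nil))
```

An :after method only sees the new value. Recording the old value as well would take an :around method that reads the slot before calling the next method.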
> Thanks,
> Waldo
>
> On Feb 29, 2008, at 3:36 PM, Ian Eslick wrote:
>
>> Hi Waldo,
>>
>> Why do you want to migrate to Elephant for production and not stick
>> with something like CL-SQL or cl-perec on top of a relational
>> database so you get all the facilities that you're familiar with?
>>
>> Also, please don't expect a query system anytime soon. Finishing
>> it is not in my critical path right now and no one else has stepped
>> up and volunteered to lead or help with it.
>>
>> As for your query problem, I think the SQL solution for queries
>> like that is likely to be faster in the end than putting this into
>> Elephant. Elephant is not intended or designed to support
>> efficient relational operations. That's what relational DBs are
>> for! :)
>>
>> Wait until the next big update to elephant before you go too far
>> down this road, I'm hoping that some new features I'm planning at
>> least make this a little bit easier.
>>
>> For your second question, if you are going to save/store the
>> objects in bulk, you can just use standard classes. Then you can
>> have a transaction to fetch/diff/write the composite object to
>> ensure atomicity of updates. This diff would also produce your
>> log. However that means that you lose the indexing capability of
>> persistent objects.
>>
>> Ian
>>
>> On Feb 29, 2008, at 2:47 PM, lists at infoway.net wrote:
>>
>>> Hi all,
>>>
>>> As I'm further exploring more and more things to do in Elephant
>>> and Lisp, I think we're ready to start migrating some of our RoR
>>> apps over, if not just as an exercise, we'll someday migrate them
>>> to production.
>>>
>>> Since we all have a very strong and hard-headed background on
>>> MySQL and relational models, it's been extremely difficult for us
>>> to migrate away from that mentality and think of objects and some
>>> of Elephant's terminology such as class indexes, which kind of
>>> confuse us into thinking that a class index allows us to look at a
>>> set of objects in a similar way as a MySQL table.
>>>
>>> I've read and seen in the src the early efforts at building a
>>> query system into Elephant. That would be great and as our efforts
>>> approach that phase, we hope to contribute to it.
>>>
>>> So, in this email, first I will ask for advice as to how to best
>>> represent the structure of our objects/classes and indices in
>>> Elephant in order to ultimately be able to query the data. Again,
>>> I'm not going to ask for the querying strategy (just yet) but
>>> ultimately, we will need to be able to answer queries like this.
>>> Obviously I don't expect anyone to give me the full representation
>>> of this, but any advice/hints as to how best to represent them will help
>>> greatly.
>>>
>>> We have a database with many related tables. For simplicity
>>> purposes, we'll describe a simplified scenario. We have a table
>>> with people information (e.g. first,last names, date of birth, and
>>> gender). We have a linked table with each person's addresses
>>> (multiple addresses in case they moved. Each address is
>>> timestamped so the most recent address is the current address).
>>> Then, each person may be subscribed to one or more health
>>> insurance plans, and so there is a table linking each person to
>>> one or more health insurance plans (and a table that defines the
>>> health insurance plans)
>>>
>>> Now, each person may select up to N preferred medical offices
>>> where they would like to receive treatment. Again, there is a
>>> table that links the person with one or more medical offices.
>>> Needless to say, there is a table of medical offices. Each medical
>>> office is also linked to a timestamped address table, where the
>>> most recent address is the current one (in the event the office
>>> moves). To further expand on the issue, each office has one or
>>> more doctors rendering services, so there is a table that links
>>> the offices to the doctors, and of course, there is a table of
>>> doctors that contains basic information, such as fname, lname, and
>>> gender. Last, but not least, a doctor may be specialized in
>>> multiple areas, so there is a table that links doctors to all the
>>> specialties they have been certified on, and thus there is yet
>>> another table that lists all possible specialties.
>>>
>>> Now, assuming I was able to explain the scenario correctly, we
>>> then have users asking the system for information such as:
>>>
>>> "List all people (subscribers), who are male and live in zip code
>>> 33012 who are contracted under Health Insurance Plan A that have
>>> selected (as their preferred medical office) medical offices with
>>> male cardiologists that work within 10 miles of 33012 zip code or
>>> in MIAMI-DADE county and whose office names contain the sequence
>>> of letters 'HEAL'"
>>>
>>> The way we see it, the concept of tables disappears and so do the
>>> tables that provide many-to-many joins. So, we end up with some
>>> classes such as "Person" which contains a reference to a list of
>>> "Address" objects, and a list of preferred "Medical-Office"
>>> objects, where each Medical-Office object has a list of Doctor
>>> objects and each Doctor has a list of Specialty objects, etc, etc.
>>>
>>> Now, we assume that each of these classes will need to maintain
>>> multiple indices, such as the Person class being indexed on first
>>> name, last name, dob, gender, among others. The Address class
>>> indexed on zip code, county name, among others, and so on and so
>>> forth.
>>>
>>> The querying is one problem. The data representation is another.
>>> We think it's clear that we should have, as an example, a Person
>>> class. However, the representation of the links between a Person
>>> and its Addresses or Medical-Offices is not 100% clear. If we
>>> represent them as a slot in the Person class, where this slot
>>> would be a List or a set of references to the Address class, then
>>> querying on those means that we always need to
>>> fetch all objects in those slots in order to apply any search
>>> criteria, which seems like a bottleneck. If that was the solution,
>>> I assume we could implement logic such that Addresses are pushed
>>> into the list, so that the most recent address is in the CAR, so
>>> we wouldn't necessarily need to read the entire list of Addresses
>>> for each member, but just fetch the CAR of the slot.
>>>
>>> Now, onto the second question. One of the other requirements we
>>> have is that we need to keep an audit log of data changes. The way
>>> we do it in RoR is relatively simple. We fetch an object from the
>>> DB and present it on the browser. When the user submits, we fetch
>>> another fresh copy from the DB and if the timestamps are the same
>>> (meaning no one else changed the record) we compare changes to the
>>> object's attributes (slots). If there are any differences, we save
>>> the changes (we're trying to avoid unnecessary trips to the DB)
>>> and if the changes are saved successfully, we write a log of ONLY
>>> the attributes that were changed (which is pretty trivial in Ruby).
>>>
>>> From what we've read in Elephant's manual, this seems harder
>>> because we don't want to work directly off the Elephant object but
>>> a memory copy while the user takes his/her time in the browser and
>>> after submitting, we would take the changes and commit them to the
>>> Elephant object. Makes me think that we would need two classes for
>>> each object (one with and one without the persistent metaclass).
>>> The other problem would be how to "easily" have two objects
>>> introspect themselves and spit out the slots that changed between
>>> the two.
>>>
>>> Are we looking at this incorrectly? Any advice would be greatly
>>> appreciated.
>>>
>>> Thanks,
>>> Waldo
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel at common-lisp.net
>>> http://common-lisp.net/mailman/listinfo/elephant-devel