[elephant-devel] Schema evolution
Robert L. Read
read at robertlread.net
Wed Oct 24 15:16:46 UTC 2007
This is a very complex subject.
In the greatest generality, one needs a function to go from one schema
to the next; for example, if you change the type or encoding of a slot,
one must provide a translation function for the slot.
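
To make the translation-function idea concrete, here is a minimal sketch in
plain CLOS. It makes no Elephant calls. The class READING, the slots TEMP-F
and TEMP-C, and the helper F->C are all hypothetical names chosen for
illustration. It assumes the re-encoded slot is also renamed, so that CLOS
reports it as one discarded slot plus one added slot.

```lisp
;; Old schema: the temperature is stored in Fahrenheit.
(defclass reading ()
  ((temp-f :initarg :temp-f)))

(defparameter *r* (make-instance 'reading :temp-f 212))

;; Hypothetical translation function from the old encoding to the new one.
(defun f->c (f)
  (* (- f 32) 5/9))

;; Runs after the standard update whenever an obsolete READING is touched.
;; PROPERTY-LIST holds the values of discarded slots that were bound.
(defmethod update-instance-for-redefined-class :after
    ((obj reading) added-slots discarded-slots property-list &rest initargs)
  (declare (ignore initargs))
  (when (and (member 'temp-c added-slots)
             (member 'temp-f discarded-slots))
    (let ((old (getf property-list 'temp-f)))
      ;; OLD is NIL when temp-f was unbound; leave temp-c unbound then.
      (when old
        (setf (slot-value obj 'temp-c) (f->c old))))))

;; New schema: the same information, stored in Celsius.
(defclass reading ()
  ((temp-c :initarg :temp-c)))

;; Touching the instance triggers the translation; this prints 100.
(print (slot-value *r* 'temp-c))
```

The translation lives in an :after method so that the standard method still
initializes any other added slots from their initforms.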
I personally, in the style in which I am working, would be most
comfortable with a function that I could invoke manually to walk the
entire DB, updating where necessary.
The other solutions, although potentially more elegant, seem like a lot
more work.
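
A rough sketch of the manual walk, again in plain CLOS only. The list *STORE*
stands in for the database, and ACCOUNT, TOUCH-ALL and *UPDATED* are
hypothetical names; a real version would have to iterate over the store
through whatever mapping functions Elephant provides, which are not shown
here.

```lisp
(defclass account ()
  ((owner :initarg :owner :accessor owner)
   (legacy-code :initarg :legacy-code)))

;; Stand-in for the DB: every instance we would have to visit.
(defvar *store*
  (list (make-instance 'account :owner "ann" :legacy-code 1)
        (make-instance 'account :owner "bob" :legacy-code 2)))

;; Counts how many instances have actually been migrated.
(defvar *updated* 0)

(defmethod update-instance-for-redefined-class :after
    ((obj account) added-slots discarded-slots property-list &rest initargs)
  (declare (ignore added-slots discarded-slots property-list initargs))
  (incf *updated*))

;; Schema change: drop legacy-code, add balance with an initform.
(defclass account ()
  ((owner :initarg :owner :accessor owner)
   (balance :initform 0 :accessor balance)))

(defun touch-all (instances slot-name)
  "Access SLOT-NAME on every instance so that pending class updates run now.
Returns the number of instances in which the slot is bound."
  (count-if (lambda (obj) (slot-boundp obj slot-name)) instances))

;; Invoked by hand after the redefinition; nothing is updated before this.
(touch-all *store* 'owner)
(print *updated*)
```

The point of the design is that the cost is paid only when the user asks for
it, and that once the walk finishes no instance is left in the old layout.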
On Mon, 2007-10-22 at 15:48 -0400, Ian Eslick wrote:
> Another detail to iron down is the implications of change-class and
> redefining a class via defclass.
>
> change-class is pretty easy as it is an explicit call by the user to
> change a given instance. I added a warning mechanism that signals if
> you are going to delete data from a store by dropping a persistent
> slot from the instance. This is immediate.
>
> Redefining a class via defclass, thus initiating calls to
> update-instance-for-redefined-class, is harder because it is lazy in
> some (or all) lisps. When a defclass causes a change in a standard
> class schema, the instances of that class are updated at the latest
> when an object slot is next accessed.
> update-instance-for-redefined-class can be overloaded by the user for
> any given class.
>
> In standard lisp, there is a problem that if you redefine the class
> twice and haven't touched the object in the meantime, you will have a
> different transformed state for each object and only some of them
> will have had update-instance-for-redefined-class called on them. At
> least this is empirically true under Allegro.
>
> However, if you do this sort of thing with persistent slots, then
> you have storage leaks in your DB due to slot values not being
> reclaimed on an intermediary change.
>
> For example:
>
> (defclass test () (slot1 slot2))
> (make-instance 'test :slot1 1 :slot2 2)
> (defclass test () (slot1 (slot3 :initform 10)))
> (defclass test () (slot1 slot4))
>
> An instance of this class with values in slot1 and slot2 that is
> loaded after the second redefinition will cause the value of slot2 to
> be lost. Slot3 will never have been written and slot4 will be empty.
>
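
A runnable plain-CLOS version of the example above, as a sketch. The
:initarg options and the *CALLS* log are additions for illustration and are
not in the original snippet. How many times the update hook runs, and with
which slot lists, varies between implementations, so only the final state of
the instance is checked here.

```lisp
(defclass test ()
  ((slot1 :initarg :slot1)
   (slot2 :initarg :slot2)))

(defvar *obj* (make-instance 'test :slot1 1 :slot2 2))

;; Log of every update call, for inspection only.
(defvar *calls* '())

(defmethod update-instance-for-redefined-class :before
    ((obj test) added-slots discarded-slots property-list &rest initargs)
  (declare (ignore property-list initargs))
  (push (list :added added-slots :discarded discarded-slots) *calls*))

;; First redefinition: slot2 is dropped, slot3 appears with an initform.
(defclass test ()
  ((slot1 :initarg :slot1)
   (slot3 :initform 10)))

;; Second redefinition, made before *obj* has been touched.
(defclass test ()
  ((slot1 :initarg :slot1)
   slot4))

;; First touch after both redefinitions.
(print (slot-value *obj* 'slot1))      ; slot1 survives
(print (slot-boundp *obj* 'slot4))     ; slot4 has no initform, so it is empty
(print (slot-exists-p *obj* 'slot2))   ; slot2 and its value are gone
(print *calls*)                        ; implementation-dependent
```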
>
> It gets worse. If you disconnect from your db without touching all
> the objects in it, then when you restart, the system won't remember
> to change any instances of the redefined class when they are loaded,
> so you'll have objects with the old definition; any initforms for new
> class slots won't have been called (those slots will be unbound) and
> the storage associated with any dropped slots will be retained but
> inaccessible.
>
> So we can do a couple of things about this:
> 1) The "lisp way" here is to allow the users to shoot themselves in
> the foot by giving them the power to control this process via
> explicit touching of objects to properly update after a class change
>
> 2) Automatically walk INDEXED classes only, updating instances by
> pulling them into memory
>
> 3) Provide a function they can call,
> make-persistent-instances-obsolete, which invokes the update behavior
> on INDEXED classes only.
>
> 4) Do a deep walk of the entire DB to update classes (either
> automatically or via a function)
>
> Automatic behaviors can be put into defpclass or made available as
> functions. Walking the entire DB can be VERY expensive, but I think
> it could be done in an online fashion as any instances read by other
> threads will automatically be updated in parallel. We would have to
> catch any new changes to the class and inhibit them until the prior
> update was complete. A similar strategy would work for indexed
> classes, but be much more efficient since all instances would be
> directly accessible via the class index.
>
>
> A persistently lazy method would be messier, but perhaps a better
> all-around solution. In this case, for any persistent classes that
> are redefined, causing slots to be added or deleted, we store a
> schema-change record in the DB and maintain a schema ID for each
> instance. Then, when we pull a persistent instance out of the db, we
> can walk the list of prior changes between its version and the most
> current version and properly update it.
>
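
One way the persistently lazy scheme could look, as a sketch only. Stored
instances are modeled as plists carrying a :SCHEMA-ID, and
*SCHEMA-CHANGES*, REMOVE-KEY and UPGRADE-RECORD are hypothetical names, not
Elephant functions. The two change records mirror the two redefinitions of
the TEST class in the earlier example.

```lisp
(defun remove-key (key plist)
  "Return a fresh copy of PLIST without KEY and its value."
  (loop for (k v) on plist by #'cddr
        unless (eq k key)
          collect k and collect v))

;; Schema-change records, oldest first: (target-version . migration).
;; Each migration maps the slot plist of the previous version to the next.
(defparameter *schema-changes*
  (list (cons 2 (lambda (slots)          ; drop slot2, add slot3 = 10
                  (list* :slot3 10 (remove-key :slot2 slots))))
        (cons 3 (lambda (slots)          ; drop slot3, add slot4 (unbound)
                  (remove-key :slot3 slots)))))

(defun upgrade-record (record)
  "Apply every recorded change newer than RECORD's schema ID, in order."
  (let ((version (getf record :schema-id))
        (slots (getf record :slots)))
    (dolist (change *schema-changes*)
      (destructuring-bind (to-version . migrate) change
        (when (> to-version version)
          (setf slots (funcall migrate slots)
                version to-version))))
    (list :schema-id version :slots slots)))

;; An instance written under version 1 and first loaded under version 3.
(print (upgrade-record '(:schema-id 1 :slots (:slot1 1 :slot2 2))))
```

Because each record names its target version, an instance that was last
written under version 2 only has the final change applied to it.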
> There are still some problems with this. If we update a class and
> are not connected to a DB, then the schema change will not be
> recorded. Multiple stores containing instances of the same class
> will not necessarily be synchronized.
>
>
> I don't see a good way other than #1 above. We inform the
> users, provide some utility functions and illustrate best practices
> (one data store per class, always update manually after class redef)
> to avoid getting shot in the foot. However, I wanted to throw this
> out in case people had a better policy idea.
>
> Regards,
> Ian