[elephant-devel] Schema evolution

Robert L. Read read at robertlread.net
Wed Oct 24 15:16:46 UTC 2007


This is a very complex subject.

In full generality, one needs a function to go from one schema to the
next; for example, if one changes the type or encoding of a slot, one
must provide a translation function for that slot.
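
As a minimal sketch of such a translation function in plain CLOS (no
Elephant involved): the standard generic function
update-instance-for-redefined-class receives the values of discarded
slots as a property list, so a method on it can carry an old value over
into a new encoding. The class name account and the slot names dollars
and cents are hypothetical, chosen only for illustration.

```lisp
;; Hypothetical class: the balance is first stored in whole dollars.
(defclass account ()
  ((dollars :initarg :dollars :accessor dollars)))

(defvar *acct* (make-instance 'account :dollars 12))

;; The translation function for the slot.  It is defined before the
;; class is redefined so that it is in place whenever the (lazy)
;; update of an existing instance actually happens.
(defmethod update-instance-for-redefined-class :after
    ((obj account) added-slots discarded-slots property-list
     &rest initargs)
  (declare (ignore initargs))
  (when (and (member 'cents added-slots)
             (member 'dollars discarded-slots))
    ;; PROPERTY-LIST holds the values of discarded slots that were bound.
    (let ((old (getf property-list 'dollars)))
      (when old
        (setf (slot-value obj 'cents) (* 100 old))))))

;; New schema: the same information, re-encoded in cents.
(defclass account ()
  ((cents :initarg :cents :accessor cents)))

;; The first slot access after the redefinition triggers the update.
(cents *acct*)
```

This only covers instances that are live in the Lisp image; it says
nothing about values that exist only in the store.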

I personally, in the style in which I am working, would be most
comfortable with a function that I could invoke manually to walk the
entire DB, updating where necessary.

The other solutions, although potentially more elegant, seem like a lot
more work.
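
A rough sketch of what I mean by a manually invoked walk, using a hash
table as a stand-in for the store rather than Elephant's actual API.
The names *store* and walk-and-update, and the needs-update-p and
translate arguments, are hypothetical.

```lisp
;; Stand-in for a persistent store: object id -> property list of slots.
(defvar *store* (make-hash-table))

(defun walk-and-update (store needs-update-p translate)
  "Visit every record in STORE and replace those for which
NEEDS-UPDATE-P returns true with the result of TRANSLATE.
Returns the number of records updated."
  (let ((updated 0))
    (maphash (lambda (oid record)
               (when (funcall needs-update-p record)
                 ;; Replacing the value of the entry currently being
                 ;; visited is permitted during MAPHASH.
                 (setf (gethash oid store) (funcall translate record))
                 (incf updated)))
             store)
    updated))

;; One record written under the old schema, one under the new one.
(setf (gethash 1 *store*) (list :slot1 1 :slot2 2)
      (gethash 2 *store*) (list :slot1 5 :slot3 10))

;; Drop slot2 and add slot3 with a default of 10 where it is missing.
(walk-and-update *store*
                 (lambda (record)
                   (eq (getf record :slot3 :absent) :absent))
                 (lambda (record)
                   (list :slot1 (getf record :slot1) :slot3 10)))
```

Only the first record is rewritten here; the second already has the new
shape and is left alone.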



On Mon, 2007-10-22 at 15:48 -0400, Ian Eslick wrote:
> Another detail to iron down is the implications of change-class and  
> redefining a class via defclass.
> 
> change-class is pretty easy as it is an explicit call by the user to  
> change a given instance.  I added a warning mechanism that signals if  
> you are going to delete data from a store by dropping a persistent slot
> from the instance.  This is immediate.
> 
> Redefining a class via defclass, thus initiating calls to
> update-instance-for-redefined-class, is harder because it is lazy in
> some (or all) lisps.  When a defclass causes a change in a standard
> class schema, the instances of that class are updated at latest when
> an object slot is next accessed.  update-instance-for-redefined-class
> can be specialized by the user for any given class.
> 
> In standard lisp, there is a problem that if you redefine the class
> twice and haven't touched the object in the meantime, you will have a
> different transformed state for each object and only some of them
> will have had update-instance-for-redefined-class called on them.  At
> least this is empirically true under Allegro.
> 
> However, if you do this sort of thing with persistent slots, then
> you have storage leaks in your DB due to slot values not being  
> reclaimed on an intermediary change.
> 
> i.e.
> 
> (defclass test () ((slot1 :initarg :slot1) (slot2 :initarg :slot2)))
> (make-instance 'test :slot1 1 :slot2 2)
> (defclass test () (slot1 (slot3 :initform 10)))  ; first redefinition
> (defclass test () (slot1 slot4))                 ; second redefinition
> 
> An instance of this class with values in slot1 and slot2 that is
> loaded only after the second redefinition will lose the value of
> slot2.  Slot3 will never have been written and slot4 will be unbound.
> 
> 
> It gets worse.  If you disconnect from your db without touching all
> the objects in it, then when you restart, the system won't remember
> to update any instances of the redefined class when they are loaded.
> You'll have objects with the old definition: initforms for new slots
> won't have been evaluated (the slots will be unbound), and the storage
> associated with any dropped slots will be retained but inaccessible.
> 
> So we can do a couple of things about this:
> 1) The "lisp way" here is to allow the users to shoot themselves in  
> the foot by giving them the power to control this process via  
> explicit touching of objects to properly update after a class change
> 
> 2) Automatically walk INDEXED classes only, updating instances by  
> pulling them into memory
> 
> 3) Provide a function they can call, make-persistent-instances-obsolete,
> which invokes the update behavior on INDEXED classes only.
> 
> 4) Do a deep walk of the entire DB to update classes (either
> automatically or via a function)
> 
> Automatic behaviors can be put into defpclass or made available as  
> functions.  Walking the entire DB can be VERY expensive, but I think  
> it could be done in an online fashion as any instances read by other  
> threads will automatically be updated in parallel.  We would have to  
> catch any new changes to the class and inhibit them until the prior  
> update was complete.  A similar strategy would work for indexed  
> classes, but be much more efficient since all instances would be  
> directly accessible via the class index.
> 
> 
> A persistently lazy method would be messier, but perhaps a better
> all-around solution.  In this case, for any persistent objects that
> are redefined causing slots to be added or deleted, we store a
> schema-change record in the DB and maintain a schema ID for each
> instance.  Then, when we pull a persistent instance out of the db, we
> can walk the list of prior changes between its version and the most
> current version and properly update it.
> 
> There are still some problems with this.  If we update a class and  
> are not connected to a DB, then the schema change will not be  
> recorded.  Multiple stores containing instances of the same class  
> will not necessarily be synchronized.
> 
> 
> I don't see a good way other than #1 above.  We inform the
> users, provide some utility functions and illustrate best practices  
> (one data store per class, always update manually after class redef)  
> to avoid getting shot in the foot.  However, I wanted to throw this  
> out in case people had a better policy idea.
> 
> Regards,
> Ian



