[elephant-devel] Schema evolution
Ian Eslick
eslick at csail.mit.edu
Mon Oct 22 19:48:05 UTC 2007
Another detail to nail down is the implications of change-class and
of redefining a class via defclass.
change-class is pretty easy as it is an explicit call by the user to
change a given instance. I added a warning mechanism that signals if
you are going to delete data from a store by dropping a persistent
slot from the instance. This is immediate.
Redefining a class via defclass, which leads to calls to
update-instance-for-redefined-class, is harder because the update is
lazy in some (or all) lisps. When a defclass changes the schema of a
standard class, each instance of that class is updated at the latest
when one of its slots is next accessed. Users can specialize
update-instance-for-redefined-class for any given class.
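As a rough illustration of that hook in plain CLOS (no persistence
involved), here is a minimal sketch. The class POINT, its slots and the
variable *P* are made up for the example; only the generic function and
its argument list come from the standard.

```lisp
;; Hypothetical class used only for illustration.
(defclass point ()
  ((x :initarg :x :accessor point-x)
   (y :initarg :y :accessor point-y)))

(defvar *p* (make-instance 'point :x 3 :y 4))

;; Install the method before redefining the class so it is in place
;; when the instance is eventually updated.
(defmethod update-instance-for-redefined-class :after
    ((instance point) added-slots discarded-slots property-list
     &rest initargs)
  (declare (ignore initargs))
  ;; PROPERTY-LIST carries the values of discarded slots that were bound.
  (when (and (member 'rho added-slots) (member 'x discarded-slots))
    (let ((x (getf property-list 'x))
          (y (getf property-list 'y)))
      (setf (slot-value instance 'rho) (sqrt (+ (* x x) (* y y)))))))

;; Redefine the class: X and Y are dropped, RHO is added.
(defclass point ()
  ((rho :accessor point-rho)))

;; The instance is updated no later than this slot access.
(point-rho *p*)
```

The method only sees the old values through PROPERTY-LIST, which is why
a dropped persistent slot is the last chance to copy or reclaim its
data.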
In standard lisp, there is a problem: if you redefine the class twice
and haven't touched the object in the meantime, you can end up with a
different transformed state for each object, and only some of them
will have had update-instance-for-redefined-class called on them. At
least this is empirically true under Allegro.
However, if you do this sort of thing with persistent slots, then
you have storage leaks in your DB due to slot values not being
reclaimed on an intermediate change.
i.e.
(defclass test () ((slot1 :initarg :slot1) (slot2 :initarg :slot2)))
(make-instance 'test :slot1 1 :slot2 2)
(defclass test () (slot1 (slot3 :initform 10)))
(defclass test () (slot1 slot4))
An instance of this class with values in slot1 and slot2 that is
first loaded after the second redefinition will lose the value of
slot2. slot3 will never have been written and slot4 will be empty.
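To see what a given lisp does here with ordinary in-memory instances,
something like the following sketch can be used. DEMO, *LOG* and *OBJ*
are names invented for the experiment. How many entries end up in *LOG*,
and whether slot3 ever shows up in them, is implementation-dependent, so
no output is shown.

```lisp
;; DEMO, *LOG* and *OBJ* are hypothetical names for this experiment.
(defvar *log* '())

(defclass demo ()
  ((slot1 :initarg :slot1)
   (slot2 :initarg :slot2)))

;; Record every update the implementation actually performs.
(defmethod update-instance-for-redefined-class :before
    ((instance demo) added-slots discarded-slots property-list
     &rest initargs)
  (declare (ignore property-list initargs))
  (push (list :added added-slots :discarded discarded-slots) *log*))

(defvar *obj* (make-instance 'demo :slot1 1 :slot2 2))

;; Two redefinitions with no slot access in between.
(defclass demo () (slot1 (slot3 :initform 10)))
(defclass demo () (slot1 slot4))

;; Touch the instance; the pending update(s) run no later than here.
(slot-value *obj* 'slot1)

;; Inspect which transitions were seen, oldest first.
(reverse *log*)
```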
It gets worse. If you disconnect from your db without touching all
the objects in it, then when you restart, the system won't remember
to change any instances of the redefined class when they are loaded.
You'll have objects with the old definition: any initforms for new
class slots won't have been evaluated (the slots will be unbound) and
the storage associated with any dropped slots will be retained but
inaccessible.
So we can do a couple of things about this:
1) The "lisp way" here is to allow the users to shoot themselves in
the foot by giving them the power to control this process via
explicit touching of objects to properly update after a class change
2) Automatically walk INDEXED classes only, updating instances by
pulling them into memory
3) Provide a function they can call,
make-persistent-instances-obsolete, which invokes the update behavior
on INDEXED classes only.
4) Do a deep walk of the entire DB to update classes (either
automatically or via a function)
Automatic behaviors can be put into defpclass or made available as
functions. Walking the entire DB can be VERY expensive, but I think
it could be done in an online fashion as any instances read by other
threads will automatically be updated in parallel. We would have to
catch any new changes to the class and inhibit them until the prior
update was complete. A similar strategy would work for indexed
classes, but be much more efficient since all instances would be
directly accessible via the class index.
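A minimal in-memory sketch of the indexed-class walk, to make the idea
concrete. Nothing here is Elephant API: *CLASS-INDEX*, INDEXED-OBJECT,
TOUCH-INDEXED-INSTANCES and ACCOUNT are hypothetical stand-ins, with a
hash table playing the role of the class index. It also assumes that
slot-boundp counts as a slot access that triggers the pending update,
which holds in the implementations I know of.

```lisp
;; Hypothetical stand-in for a class index: class name -> instances.
(defvar *class-index* (make-hash-table :test #'eq))
(defvar *updates* 0)

;; Hypothetical mixin that registers every new instance in the index.
(defclass indexed-object () ())

(defmethod initialize-instance :after ((obj indexed-object) &key)
  (push obj (gethash (class-name (class-of obj)) *class-index*)))

;; Count updates so the effect of the walk is observable.
(defmethod update-instance-for-redefined-class :after
    ((obj indexed-object) added discarded plist &rest initargs)
  (declare (ignore added discarded plist initargs))
  (incf *updates*))

(defun touch-indexed-instances (class-name slot-name)
  "Touch SLOT-NAME on every indexed instance of CLASS-NAME so that any
pending class update runs now. Returns the number of instances visited."
  (let ((instances (gethash class-name *class-index*)))
    (dolist (obj instances (length instances))
      (slot-boundp obj slot-name))))

(defclass account (indexed-object)
  ((owner :initarg :owner)))

(defvar *a* (make-instance 'account :owner "ann"))
(defvar *b* (make-instance 'account :owner "bob"))

;; Redefine, then walk the index right away instead of waiting.
(defclass account (indexed-object)
  ((owner :initarg :owner)
   (balance :initform 0)))

(touch-indexed-instances 'account 'owner)
```

A real version would pull instances from the store through the class
index rather than from a list in memory, and would have to block further
redefinitions of the class until the walk had finished.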
A persistently lazy method would be messier, but perhaps a better all
around solution. In this case, for any persistent objects that are
redefined causing slots to be added or deleted, we store a schema-
change record in the DB and maintain a schema ID for each instance.
Then, when we pull a persistent instance out of the db, we can walk
the list of prior changes between its version and the most current
version and properly update it.
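Roughly, the bookkeeping could look like the sketch below. Everything in
it is an assumption made for illustration: stored instances are modeled
as a struct holding a version number and an alist of slot values, and
SCHEMA-CHANGE, RECORD-SCHEMA-CHANGE and UPGRADE-INSTANCE are invented
names, not existing Elephant functions.

```lisp
;; Hypothetical change log: entry I upgrades schema version I to I+1.
(defvar *schema-changes* (make-array 0 :adjustable t :fill-pointer t))

;; ADDED is an alist of (slot-name . initial-value) for new slots that
;; have an initform; DROPPED is a list of slot names.
(defstruct schema-change added dropped)

;; Hypothetical model of a stored instance: schema ID plus slot alist.
(defstruct stored-instance (version 0) (slots '()))

(defun record-schema-change (&key added dropped)
  "Append a change record and return the new current schema version."
  (vector-push-extend (make-schema-change :added added :dropped dropped)
                      *schema-changes*)
  (length *schema-changes*))

(defun upgrade-instance (inst)
  "Replay every change between INST's version and the current one."
  (loop for v from (stored-instance-version inst)
          below (length *schema-changes*)
        for change = (aref *schema-changes* v)
        do ;; Drop slots first so their storage is reclaimed.
           (setf (stored-instance-slots inst)
                 (remove-if (lambda (cell)
                              (member (car cell)
                                      (schema-change-dropped change)))
                            (stored-instance-slots inst)))
           ;; Then add new slots that carry an initial value.
           (loop for (name . value) in (schema-change-added change)
                 do (push (cons name value)
                          (stored-instance-slots inst))))
  (setf (stored-instance-version inst) (length *schema-changes*))
  inst)

;; The two redefinitions of TEST from the example earlier in this mail.
(record-schema-change :added '((slot3 . 10)) :dropped '(slot2))
(record-schema-change :dropped '(slot3))   ; slot4 has no initform

(defvar *old* (make-stored-instance
               :version 0
               :slots (list (cons 'slot1 1) (cons 'slot2 2))))

(upgrade-instance *old*)
```

Because each intermediate change is replayed in order, slot2 is dropped
at the first step and slot3 is written and then dropped at the second,
so nothing is left behind in the store.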
There are still some problems with this. If we update a class and
are not connected to a DB, then the schema change will not be
recorded. Multiple stores containing instances of the same class
will not necessarily be synchronized.
I don't see a good way other than #1 above: we inform the users,
provide some utility functions and illustrate best practices (one
data store per class, always update manually after class redef) to
avoid getting shot in the foot. However, I wanted to throw this out
in case people had a better policy idea.
Regards,
Ian