[elephant-devel] Class definition vs. store schema conflicts

Wed Mar 5 03:19:06 UTC 2008

On Tue, 2008-03-04 at 17:18 -0500, Ian Eslick wrote:
> My fellow Elephants,
> 
> I've recently added support for a full schema evolution infrastructure  
> on my local development branch.  Every persistent class now has a  
> schema object associated with it and each store that has one or more  
> instances of that class has a corresponding database-specific schema  
> that includes, among other things, a unique schema id for that class  
> version and that store.  The schema in the database is sufficient to  
> reproduce a basic defclass form (slot names and types, not accessors,  
> initargs, initforms).

Congratulations!  Well done!
> 
> The prior version of class indexing had a sophisticated mechanism for  
> synchronizing between the in-memory class definition and the in-store  
> index list.  This doesn't seem to be used, and is too complicated to  
> be useful so I have nixed it.  For the new schema-based notion of  
> synchronization, I have made some simplifying assumptions:
> 
> MASTER SCHEMA:
>     The in-memory class definition is always the master schema for all  
> open stores that contain instances
>     of that class.
> 
> A class redefinition or connecting to a store with a stale schema may  
> mean that the master and the store schemas are now different.  This  
> means we need to upgrade the store schema and potentially store  
> instances to the new master schema.
> 
> IN-PLACE EVOLUTION:
>     If only the indexed slots differ, then we simply add/delete  
> indices to accommodate.
>     This means that we have to keep track of the hierarchy so we can  
> remove subclasses,
>     or merge a set of subclass schemas if we move the index to a base  
> class.  In-place
>     is fine because no data becomes irrecoverable and all changes are  
> at the class level.
> 
> FULL SCHEMA EVOLUTION:
>     We have added/deleted slots or changed the type of a slot  
> (persistent->transient)
>     - We compute a diff function that adds/deletes the slot storage  
> from the store
>     - Any in-memory instances are upgraded
>     - Store instances are upgraded:
>       1) A scan function is provided to upgrade all instances in the  
> store and delete the old schema versions
>       2) Instances of prior schema versions still in the store can be  
> lazily upgraded on load
> 
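
To make the "diff function" step concrete, here is a minimal sketch in
plain Common Lisp.  It assumes a schema can be reduced to an alist of
(slot-name . kind); that layout and the names SCHEMA-DIFF and
LOSSY-DIFF-P are hypothetical, not Elephant's actual representation.

```lisp
;; Hypothetical representation: a schema is an alist of (slot-name . kind),
;; where kind is e.g. :persistent, :transient or :indexed.
;; Neither this layout nor these functions are part of Elephant.
(defun schema-diff (old new)
  "Return three values: slots added in NEW, slots dropped from OLD,
and slots present in both whose kind changed."
  (let ((added   (remove-if (lambda (entry) (assoc (car entry) old)) new))
        (dropped (remove-if (lambda (entry) (assoc (car entry) new)) old))
        (changed (remove-if-not
                  (lambda (entry)
                    (let ((match (assoc (car entry) old)))
                      (and match (not (eq (cdr match) (cdr entry))))))
                  new)))
    (values (mapcar #'car added)
            (mapcar #'car dropped)
            (mapcar #'car changed))))

(defun lossy-diff-p (old new)
  "True when moving from OLD to NEW would discard or reinterpret stored data."
  (multiple-value-bind (added dropped changed) (schema-diff old new)
    (declare (ignore added))
    (and (or dropped changed) t)))
```

A diff with only additions is the safe case; anything reported as
dropped or changed is where data could be lost.
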
> SOME DESIGN CHOICES:
> 
> - Do we delete data (dropped slots, indices) by default during a  
> schema evolution, or do we keep it around just in case?

Personally, I think a new slot can be added safely, but dropping one
requires a restart.  That is, when reattaching to the store or
evaluating a defclass invalidates a slot, the user is presented with a
choice of aborting or dropping the slot.  I think type changes must be
handled in the same way, although possibly a type change could be done
by unbinding each slot value and invoking the initform, if there is one;
but this should again be a conscious user decision.
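
Roughly what I have in mind, as a sketch using only the standard
condition system.  The condition SLOT-DROP-REQUESTED, the restarts
DROP-SLOT and ABORT-EVOLUTION, and the DROP-FN callback are names I
made up for illustration; nothing here is existing Elephant code.

```lisp
;; Hypothetical condition signalled when a schema change would lose data.
(define-condition slot-drop-requested (error)
  ((owner :initarg :owner :reader drop-owner)
   (slot  :initarg :slot  :reader drop-slot-name))
  (:report (lambda (c stream)
             (format stream "Schema change would drop slot ~S of class ~S."
                     (drop-slot-name c) (drop-owner c)))))

(defun evolve-dropping-slot (owner slot drop-fn)
  "Signal SLOT-DROP-REQUESTED.  DROP-FN is called only if the DROP-SLOT
restart is chosen; ABORT-EVOLUTION leaves the stored values alone."
  (restart-case
      (error 'slot-drop-requested :owner owner :slot slot)
    (drop-slot ()
      :report "Drop the slot and delete its stored values."
      (funcall drop-fn owner slot)
      :dropped)
    (abort-evolution ()
      :report "Abort the schema change and keep the stored values."
      :kept)))
```

Interactively the user picks a restart in the debugger.  A program that
has "explicitly configured that away" would wrap the call in
HANDLER-BIND and call INVOKE-RESTART with its chosen policy.
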

> 
> - What kind of warning/error conditions do we want to provide and when?
>    - When we load a new class def, connect to a store, and need to  
> make schema changes on the store?
>    - Class redefinition with one or more open stores
>    - etc...

Adding is silent, because no information is lost.  If there is a danger
of losing information, the user must either be offered a restart or have
explicitly configured that away somehow.

> 
> - How do we enable users to specify an upgrade function to move from  
> schema to schema; do we provide a way to specify a schema version and  
> an upgrade function and allow non-specified versions to just upgrade  
> automatically?

The ultimate case is chaining.  The user provides an upgrade function
from version N to version N+1.  These must all exist from N (where we
are now) to K (where we wish to be) in order to start upgrading.
However, rules for defaulting can mean that one rarely has to specify
all of this.
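
A minimal sketch of the chaining idea, assuming integer schema versions
and using a plist as a stand-in for a stored instance.  The registry
*UPGRADERS* and the functions REGISTER-UPGRADER, UPGRADE-CHAIN and
UPGRADE-INSTANCE are hypothetical names, not an existing Elephant API.

```lisp
(defvar *upgraders* (make-hash-table)
  "Hypothetical registry: maps version N to a function upgrading N -> N+1.")

(defun register-upgrader (from-version fn)
  "Register FN as the upgrade step from FROM-VERSION to the next version."
  (setf (gethash from-version *upgraders*) fn))

(defun upgrade-chain (start goal &optional default)
  "Return the list of step functions for versions START up to GOAL.
A missing step falls back to DEFAULT; with no DEFAULT an error is
signalled here, before any step has been applied."
  (loop for version from start below goal
        collect (or (gethash version *upgraders*)
                    default
                    (error "No upgrade function from version ~D to ~D."
                           version (1+ version)))))

(defun upgrade-instance (data start goal &optional default)
  "Apply each step of the chain, in order, to DATA."
  (reduce (lambda (acc step) (funcall step acc))
          (upgrade-chain start goal default)
          :initial-value data))
```

The whole chain is collected before anything is touched, so a gap is
detected up front.  Passing #'IDENTITY as DEFAULT is one possible
defaulting rule for versions that need no custom conversion.
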

This may seem overblown, and can be considered an enhancement; but it
makes sense if one imagines schema evolution as software development
released to a third party.  For example, I charge you $10K and
give you a schema that backs a Content Management System.  While you
fill up your schema with content, I create 30 schema improvements.
Finally, my new system passes QA, and I offer you the upgrade of your
Content Management System -- with the function that evolves you through
30 evolutions.
> 
> - What happens when two lisp images are attached to the same store and  
> one updates its class definition?  (Maybe you just get what you  
> deserve?)
You get what you deserve.  In Elephant, the in-memory store is the
system of record.
> 
> - What happens if a store has a schema for a class for which there is  
> no in-memory class object?
You are alerted that the class is going to be deleted unless you specify
otherwise.
> 
> - Do we want the database schema to store initforms, initargs, and  
> accessor/reader/writer names?

Whether or not we do so is a matter of efficiency; the important
thing is that the data is consistent after you detach from a store and
reattach (or, alternatively, after a database failure of some kind).

> 
> - Given class schemas we could probably add persistent class slots...
> 
> Thanks,
> Ian