[elephant-devel] Object caching

Thu Jun 12 03:32:57 UTC 2008

Obviously, I am biased, not by self-conceit but by familiarity with DCM.

I think something like DCM, in which you use a write-through in-memory
cache, is very workable.  To my mind, an ideal system would allow cache
policy to be set separately from the concurrency control and elephant
usage.  To me, a simple hash-table is the starting point for a cache,
but you almost always want to be able to control the size of the cache,
and that decision will typically vary with the particular persistent
class in question.
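
A minimal sketch of the idea above, written in Python only to keep it
neutral and self-contained. It is not DCM's or Elephant's API; the names
WriteThroughCache and LRUPolicy are hypothetical, and a plain dict
stands in for the database. The cache is a simple hash table, the size
limit lives in a separate policy object, and every write goes to the
backing store before it is cached. Concurrency control is deliberately
left out, so this is not thread safe as written.

```python
from collections import OrderedDict


class LRUPolicy:
    """Hypothetical eviction policy: keep at most max_size keys."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._order = OrderedDict()  # keys ordered from oldest to newest use

    def touch(self, key):
        # Record that key was just used, making it the newest entry.
        self._order[key] = True
        self._order.move_to_end(key)

    def evict(self):
        # Return the least recently used keys beyond the size limit.
        victims = []
        while len(self._order) > self.max_size:
            key, _ = self._order.popitem(last=False)
            victims.append(key)
        return victims


class WriteThroughCache:
    """Hash-table cache in front of a dict-like store (assumed interface)."""

    def __init__(self, store, policy):
        self.store = store    # stand-in for the persistent store
        self.policy = policy  # cache policy is set separately from the cache
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            value = self._cache[key]
        else:
            value = self.store[key]  # miss: read from the backing store
            self._cache[key] = value
        self._note_use(key)
        return value

    def put(self, key, value):
        self.store[key] = value  # write-through: the store is updated first
        self._cache[key] = value
        self._note_use(key)

    def _note_use(self, key):
        self.policy.touch(key)
        for victim in self.policy.evict():
            # Dropping an entry loses nothing: the store already has it.
            self._cache.pop(victim, None)

    def cached_keys(self):
        return set(self._cache)


# One cache per persistent class, each with its own size limit.
store = {}
caches = {
    "question": WriteThroughCache(store, LRUPolicy(max_size=2)),
    "session": WriteThroughCache(store, LRUPolicy(max_size=100)),
}
```

Because the policy is a separate object, a different strategy (for
example a generational one) could be swapped in per class without
touching the write-through logic.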

DCM provides an understandable example, and even provides generational
strategies.

However, DCM uses btrees directly; I think you would want something that
would work with the persistent class machinery that you wrote.

Whether these things would be extensions of persistent-class or more
loosely coupled outside of them, I'm not sure.

For example, if we build a reusable caching system that really is
independent of Elephant, it would have value in its own right, just as
a serializer for Lisp objects is valuable independent of Elephant
(though of course Elephant must use one).

On Wed, 2008-06-11 at 12:35 -0400, Ian Eslick wrote:
> I'm dissatisfied with the current approach to object caching in  
> elephant-unstable.
> 
> Issues:
> - Not thread safe
> - Breaks most transactional guarantees
> - Adds per-instance slot overhead to maintain the cache mode
> - Incompatible with indexing API
> - No standard usage model with well understood implications
> 
> This leads me to some strong statements about the features:
> 
> - The :cache option simply reserves space during instance allocation  
> to cache values.
> - Cached objects can never be considered thread safe or transactional  
> (isolated/atomic)
> - Directly setting the cache mode of an object is an advanced feature  
> intended to be used by a higher level API
> 
> I think that something lightweight along the lines of Robert's DCM or  
> my snapshot-set model might provide guidance on a higher level model  
> that exploits caching.  (i.e. caching is used in some larger context  
> like a 'with-cached-objects' macro or a check-in/check-out protocol  
> with some guard object as in DCM).
> 
> I can think of a couple of primary usage models:
> 
> 1) Check-in/check-out a set of objects on which a thread will perform  
> repeated operations.
> 
>     This checkout should be guarded and implicitly turns on a caching  
> policy.  (save on check-in, write-through for state durability, etc.)
> 
> 2) A read-only pool of objects shared by many threads (in-memory  
> objects w/ on-disk indices)
> 
> 3) There is a variation on #2 which allows for updates to the pool via  
> a single-writer.  The user is responsible for using the single-writer  
> API to avoid conflicts.  This requires that updates to cached objects  
> not be atomic; or that caching is turned off during updates so that it  
> is (there may still be race conditions that violate isolation/ 
> atomicity here).
> 
> To support either case we need something beyond what is already there;  
> for example a model that provides cheap mutexes via the DB instead of  
> just in-memory?
> 
> 
> For example, a web session is rendering a set of objects to a client  
> and updating the client on each request.  Rather than hitting the DB  
> on every web transaction, we want to allow the session to run for  
> a while, keeping that state in memory, then commit the changes when a  
> 'commit' button is clicked, or perhaps we want a write-through policy  
> that keeps track of changes by only hitting the DB when the user has  
> changed something.
> 
> In my application a user might be editing a questionnaire and the UI  
> provides an explicit indication that the questionnaire is being  
> edited.  A session 'checks out' the questionnaire.  The questionnaire  
> has an association with its root questions, which have an association  
> with their sub-questions, etc.  Ideally this whole tree would be  
> checked out; either based on a user-provided function or some  
> declaration that defines the checkout set.  I'd use a write-through  
> policy so work was never lost and have the application layer implement  
> any needed undo/reset functionality.
> 
> Any other use cases?
> 
> In short, there are a lot of ways to get into trouble with this  
> mechanism, so I think it behooves us to spec out an API for using this  
> facility that is reasonably robust and can give people canned ways to  
> gain the performance benefits without putting too many holes in their  
> feet.
> 
> Thanks,
> Ian