[elephant-devel] Cached objects

Ian Eslick eslick at media.mit.edu
Sat May 17 15:54:39 UTC 2008


As a quick study, I'm prototyping the following model for cached  
slots.  Turns out to be fairly easy to implement.

Objects that have cached slots have an extra boolean that indicates
whether the cached slots are caching or not.  When you turn on
caching, all reads and writes go strictly to memory; when you turn
caching off, the slots behave like normal slots.  You can save the
contents to disk without turning off caching.

;; Can you think of the right names for the refresh & save commands?
(refresh instance) - refresh all cached slots on the instance from the  
store
(enable-caching instance) - tell the object to perform cached reads/ 
writes
(save instance) - push all slots to disk
(disable-caching instance) - tell the object to behave like a standard  
instance.  Flushes to disk.
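
To make the four operations concrete, here is a minimal toy model in
plain CLOS.  It is only a sketch and not the prototype itself: the hash
table *store* stands in for the on-disk store, and toy-cached,
read-slot, write-slot and store-key are hypothetical names.  It also
assumes that enable-caching does a refresh first, which the description
above does not settle.

```lisp
(defpackage :cache-sketch (:use :cl))
(in-package :cache-sketch)

;; Stand-in for the on-disk store: maps (oid slot-name) -> value.
(defvar *store* (make-hash-table :test #'equal))

(defclass toy-cached ()
  ((oid        :initarg :oid :reader oid)
   (slot-names :initarg :slot-names :reader slot-names)
   (caching-p  :initform nil :accessor caching-p)   ; the extra boolean
   (cache      :initform (make-hash-table :test #'eq) :reader cache)))

(defun store-key (instance slot)
  (list (oid instance) slot))

(defun refresh (instance)
  "Reload every cached slot of INSTANCE from the toy store."
  (dolist (slot (slot-names instance) instance)
    (setf (gethash slot (cache instance))
          (gethash (store-key instance slot) *store*))))

(defun save (instance)
  "Push the in-memory values to the toy store; caching stays on."
  (when (caching-p instance)
    (dolist (slot (slot-names instance))
      (setf (gethash (store-key instance slot) *store*)
            (gethash slot (cache instance)))))
  instance)

(defun enable-caching (instance)
  ;; Assumption: start from the store's current values.
  (refresh instance)
  (setf (caching-p instance) t)
  instance)

(defun disable-caching (instance)
  "Flush to the toy store, then behave like a standard instance."
  (save instance)
  (setf (caching-p instance) nil)
  instance)

(defun read-slot (instance slot)
  (values (if (caching-p instance)
              (gethash slot (cache instance))
              (gethash (store-key instance slot) *store*))))

(defun write-slot (instance slot value)
  (if (caching-p instance)
      (setf (gethash slot (cache instance)) value)
      (setf (gethash (store-key instance slot) *store*) value)))

(let ((obj (make-instance 'toy-cached :oid 1 :slot-names '(name))))
  (write-slot obj 'name "a")      ; caching off: written straight to the store
  (enable-caching obj)
  (write-slot obj 'name "b")      ; caching on: memory only
  (let ((before (gethash (store-key obj 'name) *store*)))
    (save obj)                    ; push to the store, keep caching on
    (list before (gethash (store-key obj 'name) *store*))))
;; => ("a" "b")
```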

*cached-instance-default-mode*
(sets the caching mode for new instances automatically; does not
affect existing instances)
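
One way such a default could be wired up, sketched under the assumption
that the mode is a simple boolean: the slot's initform reads the special
variable, so the value is picked up when an instance is created and
later changes leave existing instances alone.  toy-cached and caching-p
are hypothetical names.

```lisp
(defpackage :default-mode-sketch (:use :cl))
(in-package :default-mode-sketch)

(defvar *cached-instance-default-mode* nil
  "Caching mode given to newly created instances.")

(defclass toy-cached ()
  ;; The initform is evaluated at each MAKE-INSTANCE call, so it sees
  ;; the value of the special variable at creation time.
  ((caching-p :initform *cached-instance-default-mode*
              :accessor caching-p)))

(let* ((old (make-instance 'toy-cached))
       (new (let ((*cached-instance-default-mode* t))
              (make-instance 'toy-cached))))
  (list (caching-p old) (caching-p new)))
;; => (NIL T)
```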

;; Convenience macro
(with-caching (:type instance list)
   ;; all cached slots of the instances are cached, followed by a save
   ;; on exit
   )
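
A rough sketch of how the macro might expand.  The argument syntax here
(a single form that evaluates to a list of instances) is a guess, and
enable-caching and save are stubbed to record calls so the example runs
on its own.  The save happens only on normal exit, on the assumption
that an aborted body should not push half-finished values to disk.

```lisp
(defpackage :with-caching-sketch (:use :cl))
(in-package :with-caching-sketch)

;; Stand-ins for the proposed API; they only record what was called.
(defvar *log* '())
(defun enable-caching (instance) (push (list :enable instance) *log*))
(defun save (instance) (push (list :save instance) *log*))

(defmacro with-caching ((instances) &body body)
  "Enable caching on INSTANCES, run BODY, then save each instance."
  (let ((var (gensym "INSTANCES")))
    `(let ((,var ,instances))
       (mapc #'enable-caching ,var)
       ;; Return BODY's values; save only after a normal exit.
       (multiple-value-prog1 (progn ,@body)
         (mapc #'save ,var)))))

(with-caching ((list :a :b))
  (+ 1 2))
;; => 3, and (reverse *log*) is
;; ((:ENABLE :A) (:ENABLE :B) (:SAVE :A) (:SAVE :B))
```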

Transaction Behavior:
---------------------

If you follow the discipline of entering with-caching before the first
read of a cached instance's slot inside with-transaction, then you get
the following cool behavior in a multi-threaded scenario:
- txn one reads the cached values (setting read locks)
- txn one does various ops in memory
- txn one then commits the instance values (grabbing write locks)

- txn two is running in parallel and does a refresh, grabbing read locks
- txn two commits after txn one and is aborted and restarted.

- txn two is guaranteed to get the latest data by virtue of the refresh.

In this mode of operation, I believe that we can guarantee that the
ACID properties are maintained at transaction boundaries.  There are
caveats, however:

Complications with Indexing:
----------------------------

- Slot indexing is a problem.  Completely cached slots will be out of
sync with their index, because you can't keep the index in sync with
a slot that is only written in memory.  That breaks the API (add
instance, do query, new instance not in query result!).  This can be
fixed with a write-through mode on the indexed slots.
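
A toy illustration of the write-through fix, with hash tables standing
in for the store and its index.  All names here (toy-cached,
cached-write, store-write, query, *indexed-slots*) are made up for the
sketch and say nothing about how Elephant maintains its indices.

```lisp
(defpackage :write-through-sketch (:use :cl))
(in-package :write-through-sketch)

(defvar *store* (make-hash-table :test #'equal))  ; (oid slot) -> value
(defvar *index* (make-hash-table :test #'equal))  ; (slot value) -> oids
(defvar *indexed-slots* '(email))

(defclass toy-cached ()
  ((oid   :initarg :oid :reader oid)
   (cache :initform (make-hash-table :test #'eq) :reader cache)))

(defun store-write (oid slot value)
  "Write to the toy store and keep the toy index in step."
  (multiple-value-bind (old found) (gethash (list oid slot) *store*)
    (when found
      ;; Drop the entry for the previous value.
      (setf (gethash (list slot old) *index*)
            (remove oid (gethash (list slot old) *index*)))))
  (setf (gethash (list oid slot) *store*) value)
  (pushnew oid (gethash (list slot value) *index*)))

(defun cached-write (instance slot value)
  "Always write to memory; write through only for indexed slots."
  (setf (gethash slot (cache instance)) value)
  (when (member slot *indexed-slots*)
    (store-write (oid instance) slot value))
  value)

(defun query (slot value)
  "Return the oids the toy index holds for SLOT = VALUE."
  (values (gethash (list slot value) *index*)))

(let ((user (make-instance 'toy-cached :oid 7)))
  (cached-write user 'email "a@example.org")
  (cached-write user 'nickname "al")
  (list (query 'email "a@example.org")   ; indexed: visible at once
        (query 'nickname "al")))         ; not indexed: memory only so far
;; => ((7) NIL)
```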

Tradeoff Requests for Comment:
------------------------------

- Reads and writes could set a dirty bit on the instance.  I'm  
debating this one.  The nice thing is if all you do is read an object  
in a transaction, then you don't cancel other parallel transactions  
that are only reading.  However setting dirty on each slot access  
isn't terribly cheap.  Easy enough to profile this when we have some  
tests.  For now, my guess is that the probability of two threads  
operating on the same objects is low given the web-oriented usage of
Elephant.
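
One possible reading of the dirty-bit idea, as a sketch: only writes
mark the instance dirty, and save skips the store entirely for a clean
instance, so a read-only transaction never needs write locks.  This is
an assumption about the intent, not the prototype; the counter
*store-writes* stands in for real disk writes and all names are
hypothetical.

```lisp
(defpackage :dirty-bit-sketch (:use :cl))
(in-package :dirty-bit-sketch)

(defvar *store-writes* 0
  "Counts how often SAVE would actually have touched the store.")

(defclass toy-cached ()
  ((cache   :initform (make-hash-table :test #'eq) :reader cache)
   (dirty-p :initform nil :accessor dirty-p)))

(defun cached-read (instance slot)
  ;; Reads leave the dirty bit alone in this reading of the idea.
  (values (gethash slot (cache instance))))

(defun cached-write (instance slot value)
  (setf (dirty-p instance) t)
  (setf (gethash slot (cache instance)) value))

(defun save (instance)
  "Touch the store only if something was written since the last save."
  (when (dirty-p instance)
    (incf *store-writes*)
    (setf (dirty-p instance) nil))
  instance)

(let ((obj (make-instance 'toy-cached)))
  (cached-read obj 'name)
  (save obj)                       ; clean: no store write
  (cached-write obj 'name "a")
  (save obj)                       ; dirty: one store write
  (save obj)                       ; clean again: no store write
  *store-writes*)
;; => 1
```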

- Proliferation of modes.  I can now see 4 modes of operation: no
caching, all-in-memory, write-through caching of everything, and
write-through caching of indexed/cached slots.  Is this too
complicated?  Perhaps we have a standard mode (all or nothing, no
indexing allowed) and provide an advanced section for people with more
complex requirements.  There are also performance issues (lots of
power eats lots of performance as Alex has been reminding us :)

- Would anyone like to volunteer to review the API and to test this?   
It's fairly orthogonal to everything else in the system, so shouldn't  
interfere with other testing/use that is happening.  Leslie?

Thank you,
Ian




