[elephant-devel] Cached objects
Ian Eslick
eslick at media.mit.edu
Sat May 17 15:54:39 UTC 2008
As a quick study, I'm prototyping the following model for cached
slots. It turns out to be fairly easy to implement.
Objects that have cached slots have an extra boolean that indicates
whether the cached slots are caching or not. When you turn on
caching, all reads and writes go strictly to memory; when you turn
caching off, they behave like normal slots. You can save the contents
to disk without turning off caching.
;; Can you think of the right names for the refresh & save commands?
(refresh instance) - refresh all cached slots on the instance from the
store
(enable-caching instance) - tell the object to perform cached
reads/writes
(save instance) - push all slots to disk
(disable-caching instance) - tell the object to behave like a standard
instance. Flushes to disk.
*cached-instance-default-mode*
(sets the caching mode for new instances automatically; does not
affect existing instances)
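
To make the semantics above concrete, here is a minimal toy sketch of the four operations. It is not Elephant code: *store* is an in-memory hash table standing in for the on-disk store, and cached-thing, slot-ref and store-key are hypothetical names invented for the illustration. It also assumes that enable-caching only flips the flag and that the caller does the refresh.

```lisp
;; Toy model only. *store* stands in for the persistent store; the class and
;; helper names are hypothetical and not part of Elephant.
(defvar *store* (make-hash-table :test #'equal))   ; (oid slot) -> value

(defclass cached-thing ()
  ((oid       :initarg :oid :reader oid)
   (caching-p :initform nil :accessor caching-p)   ; the extra boolean
   (cache     :initform (make-hash-table :test #'eq) :reader cache)))

(defun store-key (obj slot)
  (list (oid obj) slot))

(defun slot-ref (obj slot)
  "Read from memory when caching, otherwise from the store."
  (if (caching-p obj)
      (values (gethash slot (cache obj)))
      (values (gethash (store-key obj slot) *store*))))

(defun (setf slot-ref) (value obj slot)
  "Write to memory when caching, otherwise straight to the store."
  (if (caching-p obj)
      (setf (gethash slot (cache obj)) value)
      (setf (gethash (store-key obj slot) *store*) value)))

(defun refresh (obj)
  "Copy every stored slot of OBJ into its in-memory cache."
  (maphash (lambda (key value)
             (when (equal (first key) (oid obj))
               (setf (gethash (second key) (cache obj)) value)))
           *store*)
  obj)

(defun save (obj)
  "Push the cached slot values to the store; caching stays as it was."
  (maphash (lambda (slot value)
             (setf (gethash (store-key obj slot) *store*) value))
           (cache obj))
  obj)

(defun enable-caching (obj)
  (setf (caching-p obj) t)
  obj)

(defun disable-caching (obj)
  "Flush to the store, then behave like a standard instance again."
  (save obj)
  (setf (caching-p obj) nil)
  obj)
```

With this model, a write made while caching is on stays invisible in *store* until save or disable-caching is called, which is the behavior described above.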
;; Convenience macro
(with-caching (:type instance list)
  ;; all cached slots of the instances are cached, followed by a save on
  ;; exit
  )
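
One possible shape for the macro, as a sketch only. The refresh, enable-caching and save functions here are stubs that just record calls so the macro can be tried in isolation. The argument list is simplified to a plain list of instance forms (the :type option above is left out, since its meaning isn't pinned down yet). Refreshing on entry and saving only on normal exit are both assumptions.

```lisp
;; Stand-in stubs that only record calls; replace with the real operations.
(defvar *calls* '())
(defun refresh (obj)        (push (list :refresh obj) *calls*))
(defun enable-caching (obj) (push (list :enable obj) *calls*))
(defun save (obj)           (push (list :save obj) *calls*))

(defmacro with-caching ((&rest instances) &body body)
  "Refresh and enable caching on INSTANCES, run BODY, then save them.
The save runs only when BODY returns normally (an assumption)."
  (let ((objs (gensym "OBJS"))
        (obj  (gensym "OBJ")))
    `(let ((,objs (list ,@instances)))
       (dolist (,obj ,objs)
         (refresh ,obj)
         (enable-caching ,obj))
       ;; multiple-value-prog1 returns BODY's values after the saves run.
       (multiple-value-prog1 (progn ,@body)
         (dolist (,obj ,objs)
           (save ,obj))))))
```

Using multiple-value-prog1 rather than unwind-protect means a non-local exit (for example an aborted transaction) skips the save, so half-updated values are not pushed to disk. Whether caching should be switched off again on exit is left open here.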
Transaction Behavior:
---------------------
If you use the discipline of using with-caching before the first read
of a cached instance's slot inside with-transaction, then you get the
following cool behavior in a multi-threaded scenario:
- txn one reads the cached values (setting read locks)
- txn one does various ops in memory
- txn one then commits the instance values (grabbing write locks)
- txn two is running in parallel and does a refresh, grabbing read locks
- txn two commits after txn one, so it is aborted and restarted.
- txn two is guaranteed to get the latest data by virtue of the refresh.
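
The abort-and-restart sequence can be illustrated with a single-threaded toy. Note the swap: instead of real read/write locks, this sketch uses a version counter to detect that another transaction committed first. All names here (*value*, *version*, try-commit, run-txn) are hypothetical, and the "parallel" transaction is simulated by a hook that fires just before the first commit attempt.

```lisp
;; Toy stand-in for the store: one value plus a version counter.
;; Real conflict detection would come from the store's locking, not from this.
(defvar *value* 10)
(defvar *version* 0)

(defun refresh ()
  "Read the latest value and the version it was read at."
  (values *value* *version*))

(defun try-commit (new-value seen-version)
  "Commit only if nobody else committed since SEEN-VERSION was read."
  (when (= seen-version *version*)
    (setf *value* new-value)
    (incf *version*)
    t))

(defun run-txn (fn &key (before-commit (constantly nil)))
  "Refresh, compute in memory with FN, commit; restart on conflict.
Returns the committed value and the number of attempts."
  (loop for attempt from 1
        do (multiple-value-bind (value version) (refresh)
             (let ((new (funcall fn value)))
               (funcall before-commit attempt)
               (when (try-commit new version)
                 (return (values new attempt)))))))
```

For example, if txn two runs (lambda (v) (+ v 1)) and txn one commits 15 during txn two's first attempt, txn two's first commit fails, it restarts, refreshes to 15, and commits 16 on the second attempt.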
In this mode of operation, I believe that we can guarantee that the
ACID properties are maintained at transaction boundaries. There are
caveats, however:
Complications with Indexing:
----------------------------
- Slot indexing is a problem. Completely cached slots will be out of
sync with their index, because the index is only updated when the slot
is written to the store. That breaks the API (add instance, do query,
new instance not in query result!). This can be fixed with a
write-through mode on the indexed slots.
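
A toy sketch of the write-through idea, to show why it keeps queries correct. Everything here is hypothetical: *cache*, *store* and *index* are plain hash tables standing in for the in-memory cache, the on-disk store and a slot index, and *indexed-slots* lists which slots are indexed.

```lisp
;; Hypothetical stand-ins, not Elephant data structures.
(defvar *cache* (make-hash-table :test #'equal))   ; (oid slot) -> value
(defvar *store* (make-hash-table :test #'equal))   ; (oid slot) -> value
(defvar *index* (make-hash-table :test #'equal))   ; (slot value) -> oids
(defvar *indexed-slots* '(email))

(defun write-slot (oid slot value)
  "Always update the cache; indexed slots also go through to store and index."
  (let ((key (list oid slot)))
    (setf (gethash key *cache*) value)
    (when (member slot *indexed-slots*)
      ;; Drop the index entry for the previously stored value, if any.
      (multiple-value-bind (old found) (gethash key *store*)
        (when found
          (setf (gethash (list slot old) *index*)
                (remove oid (gethash (list slot old) *index*)))))
      (setf (gethash key *store*) value)
      (pushnew oid (gethash (list slot value) *index*)))
    value))

(defun lookup (slot value)
  "Query the index for the oids whose SLOT equals VALUE."
  (values (gethash (list slot value) *index*)))
```

Non-indexed slots stay purely in memory until a save, while a write to an indexed slot is immediately visible to lookup, so "add instance, do query" returns the new instance.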
Tradeoff Requests for Comment:
------------------------------
- Reads and writes could set a dirty bit on the instance. I'm
debating this one. The nice thing is if all you do is read an object
in a transaction, then you don't cancel other parallel transactions
that are only reading. However setting dirty on each slot access
isn't terribly cheap. Easy enough to profile this when we have some
tests. For now, my guess is that the probability of two threads
operating on the same objects is low given the web-oriented usage of
elephant.
- Proliferation of modes. I can now see 4 modes of operation - is
this too complicated? No caching, all-in-memory, write-through
caching of everything, write-through caching of indexed/cached slots.
Perhaps we have a standard mode (all or nothing, no indexing allowed)
and provide an advanced section for people with more complex
requirements. There are also performance issues: lots of power eats
lots of performance, as Alex has been reminding us :)
- Would anyone like to volunteer to review the API and to test this?
It's fairly orthogonal to everything else in the system, so shouldn't
interfere with other testing/use that is happening. Leslie?
Thank you,
Ian