[elephant-devel] Cached objects
Ian Eslick
eslick at media.mit.edu
Sun May 18 13:33:48 UTC 2008
On May 18, 2008, at 6:05 AM, Alex Mizrahi wrote:
> IE> If you use the discipline of using with-caching before the first
> IE> read of a cached instance's slot inside with-transaction, then
> IE> you get the following cool behavior in a multi-threaded scenario:
> IE> - txn one reads the cached values (setting read locks)
> IE> - txn one does various ops in memory
> IE> - txn one then commits the instance values (grabbing write locks)
>
> IE> - txn two is running in parallel and does a refresh, grabbing
> IE> read locks
> IE> - txn two commits after txn one and is aborted and restarted.
>
> IE> In this mode of operation, I believe that we can guarantee that
> IE> the ACID properties are maintained at transaction boundaries.
>
> how is that? both txn one and two grab read locks, so they don't
> block each other, right?
> then txn one modifies slots, and txn two immediately sees the
> changes -- doesn't that break isolation?
> the only way to avoid collisions is to make even read locks
> exclusive, but that would totally kill parallelism -- even read-only
> transactions working with the same objects would not run
> concurrently.
You are quite right! It's been a long week, so please bear with me on
this. :) I was thinking about my original write-through mode, which
would catch these cases because the writer would block (all read
locks having been grabbed at the beginning of the session). You
wouldn't want to use caching mode in the presence of a high degree of
sharing anyway -- so if we grab write locks (i.e. exclusive read
locks) just for the cached objects, the concurrency cost wouldn't be
too high, and we'd make sure these semi-independent operations were
isolated. Any shared objects, especially those rarely written, should
be handled in a write-through mode.
I'm sure there are still holes. Would you like to help think through
a policy that we could throw into a macro wrapper to make this
easier? I'd like to have one or two out-of-the-box use cases for this
and leave the rest to the advanced user.
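
Something like the following rough sketch is what I'm imagining. Note
that with-transaction is the only real Elephant operator here;
with-cached-txn, refresh-instance and save-instance are names I've
just made up:

  (defmacro with-cached-txn ((&rest instances) &body body)
    "Refresh each cached INSTANCE before BODY first reads it, then
  write the cached values back at the end, all inside one transaction
  so the read/write locks bracket the in-memory work."
    `(with-transaction ()
       ,@(loop for i in instances              ; read slots up front,
               collect `(refresh-instance ,i)) ; taking read locks
       (prog1 (progn ,@body)                   ; in-memory operations
         ,@(loop for i in instances              ; write back at commit,
                 collect `(save-instance ,i))))) ; taking write locks

  ;; e.g. (with-cached-txn (account)
  ;;        (incf (balance account) 10))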
Hmmm... you could also engineer around some of the above tradeoffs by
implementing a policy where a designated slot is used as a lock: any
thread that intends to side-effect a set of fully cached objects
first has to grab an exclusive lock on that slot (essentially using
the DB to implement a semaphore, but ostensibly at a lower cost in
time and complexity) before performing the side-effecting operations
in memory.
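
Concretely, I'm picturing something like this; guarded-root,
guard-lock and with-side-effect-lock are hypothetical names, and only
with-transaction and persistent-metaclass are real Elephant API:

  (defclass guarded-root ()
    ((lock :accessor guard-lock :initform 0))
    (:metaclass persistent-metaclass))

  (defmacro with-side-effect-lock ((root) &body body)
    "Write ROOT's LOCK slot first: the backend grants this transaction
  an exclusive write lock on that slot, so any other thread entering
  the same guard blocks until we commit or abort."
    `(with-transaction ()
       (incf (guard-lock ,root)) ; this write grabs the exclusive lock
       ,@body))                  ; side-effecting in-memory operations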
> i believe that "the right way" to do such stuff is a thread/
> transaction-local cache -- i.e. each transaction binds *slot-
> cache* (e.g. a hash-table), and all slot reads/writes work with it.
> slot read/modification performance will be somewhat inferior, but
> it won't break concurrency.
I agree with you here. However, while they are constant time, most
hash ops are rather expensive (hashing was the biggest bottleneck in
the serializer), and clearing or allocating hash-tables on each
transaction is really expensive in my experience. If you only had a
dozen or so slots, even an alist would be faster... I think the usual
rule of thumb is that a 20-deep alist search costs about the same as
a single hash fetch.
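
To make the comparison concrete, the transaction-local cache could be
as simple as an alist bound per transaction. This is just a sketch --
*slot-cache* is your name, cached-slot-value is mine:

  (defvar *slot-cache* nil
    "Alist of ((instance . slot-name) . value), rebound per transaction.")

  (defun cached-slot-value (instance slot-name)
    "Look the slot up in the transaction-local cache, filling it
  from the store on a miss."
    (let* ((key (cons instance slot-name))
           (entry (assoc key *slot-cache* :test #'equal)))
      (if entry
          (cdr entry)
          (let ((value (slot-value instance slot-name)))
            (push (cons key value) *slot-cache*)
            value))))

  ;; Rebinding the cache is free, where clearing a hash-table isn't:
  ;; (let ((*slot-cache* nil))
  ;;   (with-transaction ()
  ;;     ... (cached-slot-value obj 'name) ...))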
> IE> - Slot indexing is a problem. Completely cached slots will be
> IE> out of sync with their index because you can't keep the index
> IE> in sync with the cached slot, so the API is broken (add instance,
> IE> do query, new instance not in query result!). This can be fixed
> IE> with a write-through mode on the indexed slots.
>
> partially this can be fixed by doing caching on the btree level
> rather than on the slot level.
Please explain further -- I'm not sure I follow you here.
Thank you for the input!
Ian