[elephant-devel] Re: Lisp backend

Sun Mar 2 09:47:51 UTC 2008

> The quad-tree is definitely not a B+Tree, nor one that solves any of
> the issues of writing to disk.

Well, so much for superficial research. Sorry.

> It's easy enough to find a reference
> implementation of a B+Tree and re-implement it; as long as we've
> thought through all the tradeoffs (I think Henrik made this point).

Speaking of which, I think the Right Thing is developing the B+Tree
library separately from Elephant, so others will be able to use it
and to achieve weak coupling.

> Part of the challenge is being able to read/write and cache the
> underlying disk storage.  Do we write directly into binary pages in
> memory, and read/write them, or do we manage all this in lisp and
> serialize lisp BTree nodes?  We can't serialize the entire BTree
> whenever it changes, so we need to keep track of nodes and serialize
> them whenever they change.  However, nodes in lisp memory are going to
> have variable size and we don't know how big they will be until they
> are serialized.  The tradeoffs here can get very interesting.

Yes, let's gather questions like these as proposed on a central site.
TC and BDB support variable-length values, so we can just look at their
solution for this problem.

> One of the biggest challenges is efficiently managing concurrency.  If
> you want multithreaded or multi-process operation, and you want
> multiple transactions to proceed concurrently then you need some way
> for all the threads of control to coordinate who is writing what data
> so that transaction A doesn't commit data that transaction B depends
> on, when transaction B is committing after A.

The PostgreSQL manual has some interesting information on this topic.

> Further, if you want to support multiple machines, then this pool of
> locks needs to be distributed or exported via a server protocol.

Let's stop here. We can think about this later.

> One problem is that lisp doesn't provide native access to any locking
> mechanism.  I think it's OK to use uffi/cffi to talk to a standard C
> library that does have these low level calls though.

Yes.

> The other approach to fine-grained locking is to use speculative
> concurrency (roughly what Rucksack calls versioning).  This means that
> we assume there are no conflicts, don't bother to lock anything, copy-
> on-writes and invalidate transactions at commit time that do have
> conflicts (i.e. that read and operated on anything that was written).

I believe the canonical term for this is "Optimistic Concurrency Control".

  Leslie