[elephant-devel] Lisp Backend Architecture

Leslie P. Polzer sky at viridian-project.de
Tue Oct 28 18:40:07 UTC 2008


> I am hoping to clarify some of the prior discussions[1] about the native
> Lisp backend for Elephant and propose a basic architecture.  Hopefully we
> can modularize development so if somebody wants to hack for a few days on
> the backend he can avoid being overwhelmed.

Glenn Tarcea and me are currently drafting a Skiplist-based
key-value storage in Lisp. This will be generic enough on
the top-level to plug in a btree later if needed.

You're welcome to join the effort if you wish.


> Multiprocess support: What features do the major lisps have for locking and
> concurrency?  What features are we missing from C/linux--what does BDB do in
> this regard that is hard to do in Lisp?  I do not know too much about
> implementing efficient multi-threaded lisp programs where there is a lot of
> contention.

My guess is that the most important thing is optimizing the data structure
for partial locking.

There are also a lot of scientific papers on database concurrency,
none of which I have read so far.

I'm busy enough getting a single-thread solution up first. ;)


> Distribution: Are others interested in making the backend distributed?  I
> prefer the design to allow adding an efficient distributed architecture at a
> later date.

Definitely. Considerations? Where do we need to take care?


> Custom indexing: A lot of discussion has gone into the best implementation
> practice for BTrees.  But what about other indexing structures?

See above, Skiplist.


> Multidimensional indexing

You're losing me right here, so it would be good if you joined
our skiplist effort by drafting an API that will support both
traditional and multidimensional indexing approaches.


> There are other indexing needs as well.  Document-level search is another
> common case.  I imagine an efficient search library is beyond our capacity,
> but how could we make the database suited to implement search on top of the
> thing?

You mean full-text indexing like offered by Montezuma?


> Logging package with generic undo/redo logging support.  It would be
> customizable but it would not depend on other code from the Lisp backend.  A
> database could plug into the log by implementing undo, redo, and
> serialization functions.  By modularizing logging, it should make it easier
> to hack on it without being familiar with the rest of the backend.  It will
> also be reusable, for what it's worth.

Ideally I'd like the major parts to be generic enough to make a library
of them.


> A persistent heap.  Beneath indices and data storage would be a heap-on-disk
> layer with transaction support.  The heap would have methods for reading,
> writing, and locking sequences of bytes.  It would also handle   The
> persistent heap alone would qualify as a database and might be useful as a
> basis for other DB projects or experiments.  Cachine would probably not be
> implemented at this level, though that would be easiest.  It may be possible
> to implement the transactional heavy lifting for the persistent heap and
> make it extensible enough to be used throughout.

So you also want to collect the garbage at this level?


> B-Trees.  A MOP-based implementation of B-Trees may be customizable enough
> to plug into the the rest of the system with a few additional method
> declarations for the persistent, transactional version.  Specialized methods
> would access the persistent heap for reads and writes.  Locking of the sort
> done by BDB (level 2 etc.) could probably be accomplished in these methods
> specialized for our persistent B-Tree.
>
> Bkd-Trees and other indexing structures could be implemented similarly to
> B-Trees.

Yes, but let's strive for a common API to abstract key-value storage.


> How does this sound for a basic architecture?

It's fine, but I'd like to emphasize development of a rudimentary
on-disk data structure first. After that we can add generic transactions,
establish an API for the underlying binary heap and so on.

What do you think?

  Leslie





More information about the elephant-devel mailing list