[elephant-devel] Backend musings
Ian Eslick
eslick at media.mit.edu
Fri Feb 29 13:48:30 UTC 2008
It strikes me that Tokyo Cabinet's only major benefit is have a better
licensing model. Getting it to the same state as BDB would be a
significant project (and it doesn't support Windows, I believe).
Mirroring Robert's sentiments, I think we'd be far better off spending
limited developer cycles on a lisp-only solution running that solves
both licensing, integration, performance, usability and other
headaches experienced with our ffi or socket based solutions.
An elephant data store has to:
- Support persistent slot writes
- Expose a persistent key-value store, indexed key-value store and
iterator/cursor interfaces over them
- Support a transaction model that guarantees the ACID properties
- Work correctly in a threading model
- Ideally would also work in a multi-process, shared-memory scenario
I think the big design decisions for a lisp data store include:
- Backing-store model (prevalence-style logging, binary paging,
Rucksack style maps, etc)
- Multi-threaded transaction support (page/object locking vs. per-txn
replication)
- Multi-process support?
It would also be nice, eventually, to have a lisp server model where
we can launch a lisp instance on the same or a remote machine to serve
access to the underlying store.
My latest thinking is that we can spend disk space to save time/
complexity and assume that we have lots of main memory available.
This can allow us to avoid messing with fixed-sized pages and locking.
So an initial approach might be the following:
- BTree-style nodes and slot value writes are duplicated in a per-
transaction memory log.
(btree ops are compressed into edit operations, but the altered
page is also duplicated for later reads)
- Any reads to these subsystems first look up data in this transaction
cache
- We serialize commit ops and cancel active transactions that have
read or written data
written by the commiting transaction.
- Thus, on commit, we just write to the log file, update the btree
pages, and write to the slot store
We can make serialization of btree nodes easier by just serializing
the lisp structure and having bins of differently sized disk regions
to write them to rather than maintaining fixed sized pages. Later, we
could choose to enhance the system by converting to a low-level read/
write into the C byte stream that we fetch from files.
Just a few thoughts...
Ian
PS - I don't see B+Trees in CL-CONTAINERS. Is it in there with
another name? Also, we need a B+Tree that actually is mapped to a
disk file rather than into memory as seems to be the case with most of
cl-containers
On Feb 29, 2008, at 5:45 AM, Leslie P. Polzer wrote:
>
> I have been thinking about new backend possibilities, and
> would like to know your opinion.
>
> Tokyo Cabinet would require us to write a FFI for it.
> I don't know what it does to prevent deadlocks; probably
> nothing, as I have looked at its B tree code and have
> found just mutexes without any special handling
> whatsoever.
>
> iI'm going to ask the author for more clarification),
> a slight performance hit and possibly a less convoluted
> glueing code than the FFI solution. Plus, one could
> offload the storage to another machine, of course.
>
> Then there would be the possibility of using the
> B+Trees provided by CL-CONTAINERS. This would
> be a Lisp base upon which one could build upon,
> layering transactions and file storage above.
>
> What do you think?
>
> Leslie
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel
More information about the elephant-devel
mailing list