[elephant-devel] QDBM Support

Ian Eslick eslick at media.mit.edu
Wed Feb 13 21:06:07 UTC 2008


Re: Tokyo Cabinet

The API is nearly identical to BDB's.  I think a TC version of the  
datastore would be pretty easy to do.  The only way it makes sense to  
me to do this is to deprecate the BDB data store as of the next major  
release.   Any thoughts on this?

Deprecation:

Speaking of which, what is the thinking on the CL-SQL store?  With  
postmodern, there is a SQL interface - do we want to maintain CL-SQL  
long term?  It does cover more SQL backends, including the easy-to-use  
SQLite, but it's another fork to maintain.  I'm agnostic, but I  
thought I'd toss that out for Robert to comment on.

Re: DirectoryStorage

That's essentially the idea behind Rucksack, if I recall correctly.   
Finding a way to leverage the existing filesystem is a good idea.   
That may be a way to bypass the low-level locking, Btree design,  
paging and caching issues in the near term and perhaps the long term.   
Does anyone have a sense of the performance implications of this vs.  
BDB?

Some potential issues:
- Lots of file handles being created and destroyed
   (i.e. when walking an index, to get a primary value, you have to
    open and close each object's file)
- Lots of open file handles!
- Efficient index implementation
- Secondary indices
- We have to update whole objects on each commit, not just slot values  
as today.  This may or may not matter given that an object usually  
lives on one BDB page and locking in BDB is done at the page level...
- I think we still need a C interface to a POSIX function to lock a  
file explicitly; does Windows have a similar interface?
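
To make a few of these issues concrete, here is a minimal sketch of a
one-file-per-object store.  Everything in it is an assumption made for
illustration: the DirectoryStore class, the write_slot/read names and
the JSON encoding are hypothetical and are not Elephant's, Rucksack's
or DirectoryStorage's actual design.  It uses Python's fcntl.flock,
which is POSIX-only advisory locking; on Windows the Win32 API offers
LockFileEx for a comparable purpose, though the semantics differ.

```python
import fcntl
import json
import os
import tempfile


class DirectoryStore:
    """Hypothetical store: one file per object, guarded by flock()."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, oid):
        return os.path.join(self.root, f"{oid}.json")

    def read(self, oid):
        # Every lookup opens and closes one file handle.
        with open(self._path(oid), "r") as f:
            fcntl.flock(f, fcntl.LOCK_SH)  # shared lock for readers
            try:
                return json.load(f)
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)

    def write_slot(self, oid, slot, value):
        # Changing one slot still rewrites the whole object file.
        fd = os.open(self._path(oid), os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+") as f:
            fcntl.flock(f, fcntl.LOCK_EX)  # exclusive lock for writers
            try:
                text = f.read()
                obj = json.loads(text) if text else {}
                obj[slot] = value
                f.seek(0)
                f.truncate()
                json.dump(obj, f)
                f.flush()
                os.fsync(f.fileno())  # push the rewrite to disk
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    store = DirectoryStore(tempfile.mkdtemp())
    store.write_slot("obj1", "name", "elephant")
    store.write_slot("obj1", "count", 2)
    print(store.read("obj1"))
```

The sketch shows the handle churn (one open/close per object touched)
and the whole-object rewrite per slot update.  It does not attempt
indices, transactions or crash safety; a truncate-and-rewrite like this
can leave a torn file if the process dies mid-write.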

Ian


On Feb 13, 2008, at 1:33 PM, Ben wrote:

> one (perhaps insane) idea to make an all-lisp backend easier to
> implement was to leverage the underlying file system ala ZODB
> directory storage, since the file system is probably using B-trees
> anyways.  there are fairly good architecture docs on
>
> http://dirstorage.sourceforge.net/
>
> tokyo cabinet looks good too.
>
> b
>
> On Feb 13, 2008 7:41 AM, Ian Eslick <eslick at media.mit.edu> wrote:
>> In general, I'm with Henrik on this.  I'd rather see us get Elephant
>> to a reasonable degree of feature completeness before we start to add
>> more non-lisp datastore functionality.  You can use postmodern for
>> licensing purposes and BDB for performance.
>>
>> The answer to all of this, I think, is having a native lisp version
>> that has BDB's performance and no licensing restrictions.  Then
>> supporting the other two becomes: Postmodern for a higher degree of
>> reliability as well as for distributed systems and BDB for legacy
>> reasons.
>>
>> I have a pretty good idea in my head of what an all-lisp backend
>> requires and having one would lay to rest all of these discussions of
>> bringing up "yet another backend".  Edi Weitz and I discussed
>> collaborating on this, but unfortunately he had some other projects
>> that took priority.
>>
>> Is there a small critical mass of people out there that care enough
>> about this that they'd be willing to contribute to such a project?  I
>> don't have the time to do it on my own, but if we broke it up into
>> small projects over the next handful of months, I don't think it's a
>> ton of work.  I can put in a solid chunk of integration work in mid
>> to late April.
>>
>> So what is involved?
>>
>> The tricky problems I've discovered so far are:
>> - An efficient model of BTree-like storage for Elephant
>>   1) BDB-like paged data + explicit page cache + operations over
>>      fields
>>   2) Something more customized?
>> - Efficient pointer-based indexing (BTree plus ptrs to data in main
>>   BTree)
>> - Performing sorting and searching on serialized data rather than
>>   having to deserialize to sort as in the clsql backend (required to
>>   do BTree insertions)
>> - Transaction/logging architecture; how to store transaction data,
>>   track conflicts, etc.
>>   (at lisp layer, in page cache ala BDB?)
>>   multi-thread and multi-process safe?
>> - locking to enable transactions on all 3 platforms; multi-process
>>   safe?
>>   (Is there a free C library that does this already?  I think
>>    having a simple library that compensates for some of the missing
>>    features in lisps is fine)
>>
>> Some additional considerations:
>> - Do we add support for persistent heap garbage collection?
>> - Do we want to add support for large persistent sets?
>> - Do we want a server mode for N:1 distributed transactions?
>>
>> This is by no means a trivial design, but I think if we sketched out
>> the architecture there are a set of subsystems that could be made
>> somewhat independent:
>> - BTrees and disk storage
>> - Database maintenance ops (reconstruct DB from log files, dump db,
>>   optimize, etc.)
>> - transaction support and logging
>> - low-level locking library
>> - online garbage collection
>>
>> Cheers,
>> Ian
>>
>>
>> On Feb 13, 2008, at 10:11 AM, Henrik Hjelte wrote:
>>
>>> I had never heard of this project, but it seems that Tokyo Cabinet
>>> describes itself as fast, has transactions and can handle multiple
>>> clients, which is good. And it has a tcp/ip interface and protocol so
>>> you wouldn't even need uffi/cffi to interface it from Lisp. Tokyo
>>> Cabinet seems to map well to the bdb model, so it should probably be
>>> easier to do an interface than the sql interfaces. One observation
>>> though: do we need yet another backend at this time? There are other
>>> things to fix first on my personal wishlist.
>>>
>>> Henrik
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel at common-lisp.net
>>> http://common-lisp.net/mailman/listinfo/elephant-devel



