[elephant-devel] Elephant & Berkeley DB questions

Ian Eslick eslick at csail.mit.edu
Sun Feb 26 19:23:46 UTC 2006


I thought I'd share these questions with the list.

A quick design question for Ben Lee:

In your notes on upgrades/fixes you mention a separate OID table.  I'm
finding that when I inverse index into a bunch of persistent objects the
serialization of the type is taking up significant duplicated area in my
DB.  I presume that storing an OID->TYPE field separately from the OID
will make the storage of persistent objects more efficient while adding
an additional disk seek for each first retrieval of the object from a
btree key or value field.   You argued that this would improve
change-class. 
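
To make the space/seek trade-off above concrete, here is a rough sketch in Python rather than Lisp. It is not Elephant's serializer format: `inline_ref`, `bare_ref`, `OidTable` and the class name are all hypothetical, and `pickle` only stands in for a real serializer. It compares an index whose entries each carry the class name with one whose entries carry only the OID and resolve the class through a side table.

```python
import pickle


def inline_ref(oid, class_name):
    """Hypothetical layout: every stored reference repeats the class name."""
    return pickle.dumps((oid, class_name))


def bare_ref(oid):
    """Hypothetical layout: a stored reference is just the OID."""
    return pickle.dumps(oid)


class OidTable:
    """Side table mapping OID -> class name, written once per object."""

    def __init__(self):
        self._types = {}
        self.lookups = 0

    def register(self, oid, class_name):
        self._types[oid] = class_name

    def class_of(self, oid):
        # Stands in for the extra disk seek on first retrieval of an object.
        self.lookups += 1
        return self._types[oid]


def total_bytes(refs):
    """Sum of the serialized sizes of all references in an index."""
    return sum(len(r) for r in refs)


table = OidTable()
class_name = "my-package:document-chunk"  # hypothetical class name

# 10,000 index entries pointing at 1,000 distinct persistent objects.
postings = [n % 1000 for n in range(10000)]
for oid in set(postings):
    table.register(oid, class_name)

inline_bytes = total_bytes(inline_ref(oid, class_name) for oid in postings)
bare_bytes = total_bytes(bare_ref(oid) for oid in postings)
print(inline_bytes, bare_bytes)
```

In this layout a class change only has to touch one table entry per object, since the references stored in the index never mention the class.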

Right now change-class works fine when the system stays online, but if
you exit and restart and there are unchanged class instances serialized
in the DB, I assume they will be orphaned and not conform to the new
type as lisp has lost the class-change dependency information?  Since
I'm thinking of making the OID table change for 0.6.1 I wanted to plumb
your thoughts on how this affects the change & redefine class protocols.

Several performance / stability questions:

1) I'm seeing HUGE growth in the database when I build inverse indices:
a 64MB text file, tokenized into words and inverse indexed in 10-15
word chunks, is over 2G on disk, at least a 30x expansion.  Since most
of this storage is archival and retrieval tends to touch only small
subsets, I'm not overly worried at present, but when I start indexing
complete documents it's going to become a problem.  Any
thoughts/suggestions out there?  Perhaps it's a page granularity issue?

2) I occasionally find, even in user code, that if I'm not careful with
my transactions and cursors I lock my lisp up in the Berkeley locking
code and have to kill the process.  Berkeley DB claims to have a
deadlock detection process that will yield a deadlock error when a DB
routine attempts to lock a page, but I've never made it work with
elephant.  I'd be happy to implement transaction retries and deal with
locking errors, but I can't seem to get the DB code to signal them.  Anyone
else have luck?  While I can mostly code around this now by being extra
careful, it's annoying to have what feels like such a fragile system.
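
The transaction retries mentioned above could look roughly like the following sketch, written in Python rather than Lisp. It assumes the store really does signal a deadlock error to the losing transaction, which is exactly what is not working here. `DeadlockError`, `with_retries` and the `begin`/`commit`/`abort` callables are hypothetical names, not Elephant or Berkeley DB API.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical error raised when a transaction is picked as deadlock victim."""


def with_retries(body, begin, commit, abort, max_retries=5, base_delay=0.01):
    """Run body(txn) in a fresh transaction, retrying when it deadlocks."""
    for attempt in range(max_retries + 1):
        txn = begin()
        try:
            result = body(txn)
            commit(txn)
            return result
        except DeadlockError:
            # Release our locks so the other transaction can make progress.
            abort(txn)
            if attempt == max_retries:
                raise
            # Randomized exponential backoff before trying again.
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))
        except Exception:
            # Any other failure aborts the transaction and is not retried.
            abort(txn)
            raise


# Usage with a fake store whose body deadlocks twice before succeeding.
log = []
calls = {"n": 0}


def flaky_body(txn):
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError()
    return "done"


result = with_retries(
    flaky_body,
    begin=lambda: log.append("begin") or object(),
    commit=lambda txn: log.append("commit"),
    abort=lambda txn: log.append("abort"),
    base_delay=0.0,
)
```

The body has to be safe to run more than once, since every retry starts from a fresh transaction.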

In 0.6.1 I hope to add checks so that a non-transactional cursor
operation complains on the lisp side instead of relying on the DB
library to behave properly, so I don't have to keep fighting this.
I'll wait for 0.6.0 to stabilize and ship first, though.
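
As a rough illustration of that kind of check, here is a sketch in Python rather than Lisp in which the cursor itself refuses to run outside a transaction. `Store`, `Cursor` and `NoTransactionError` are hypothetical and do not reflect how Elephant tracks transactions; the point is only that the guard sits in front of every cursor operation and fails fast before the DB library is called.

```python
from contextlib import contextmanager


class NoTransactionError(RuntimeError):
    """Raised when a cursor is used outside any transaction."""


class Store:
    """Hypothetical store that tracks whether a transaction is open."""

    def __init__(self):
        self._depth = 0

    @contextmanager
    def transaction(self):
        self._depth += 1
        try:
            yield self
        finally:
            self._depth -= 1

    def in_transaction(self):
        return self._depth > 0


class Cursor:
    """Hypothetical cursor over a sorted list of (key, value) pairs."""

    def __init__(self, store, pairs):
        self._store = store
        self._pairs = sorted(pairs)
        self._pos = -1

    def next(self):
        # Complain here, before any call would reach the DB library.
        if not self._store.in_transaction():
            raise NoTransactionError("cursor operation outside a transaction")
        self._pos += 1
        if self._pos >= len(self._pairs):
            return None
        return self._pairs[self._pos]


store = Store()
cursor = Cursor(store, [("b", 2), ("a", 1)])
with store.transaction():
    first = cursor.next()
```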

Thanks!
Ian


