[elephant-devel] Writing manuals

Ian Eslick eslick at csail.mit.edu
Tue Apr 24 21:01:06 UTC 2007


Hello Elephants,

As I mentioned to Robert the other day, writing a manual is a drag  
but it does force you to visit all the dark corners of your system  
and make sure you really understand what is going on.  While writing  
the Berkeley DB internals section today I discovered an interesting  
fact.

The default Berkeley DB memory cache size is 256k bytes.  That is 64
4k pages, and each page holds about 100 key-value pairs when the keys
and values are small (20 bytes).  That is a cache big enough for about
6,400 key-value pairs.  If you regularly walk over a full index, you
will certainly thrash your cache.  So if you have many thousands of
objects, with several indices, your working set could easily exceed
this value.

For example, if you had an indexed class with two slots and two
indices on those slots, each object instance would eat up 3 small
key-value pairs plus one for each persistent slot, or 5 in total.  So
you could cache the values and indexes for a little over 1k objects
(6,400 / 5 = 1,280).
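
The arithmetic above can be written out as a short sketch.  It is in
Python purely for illustration; the function names are hypothetical and
the constants are only the rough figures quoted in this message (4k
pages, about 100 small pairs per page, 3 pairs per object plus one per
persistent slot), not measured values.

```python
# Back-of-the-envelope estimate of how many small key-value pairs (and
# objects) fit in a Berkeley DB cache. The constants are the rough
# figures quoted in the message, not measurements.

PAGE_SIZE = 4 * 1024      # 4k pages
PAIRS_PER_PAGE = 100      # approximate, for ~20-byte keys and values


def pairs_in_cache(cache_bytes):
    """Approximate number of small key-value pairs the cache can hold."""
    pages = cache_bytes // PAGE_SIZE
    return pages * PAIRS_PER_PAGE


def objects_in_cache(cache_bytes, persistent_slots, fixed_pairs=3):
    """Approximate number of objects whose values and index entries fit."""
    pairs_per_object = fixed_pairs + persistent_slots
    return pairs_in_cache(cache_bytes) // pairs_per_object


default_cache = 256 * 1024                   # the old 256k default
print(pairs_in_cache(default_cache))         # 6400
print(objects_in_cache(default_cache, 2))    # 1280
```

Real capacity will vary with key and value sizes and with btree page
fill, so treat these numbers as an order-of-magnitude guide only.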

I've added a user parameter that allows you to tune this value.  I
also set the default value to 10MB for each instance of a data store
(most applications use one or a very small number).  This is more than
large enough for most casual applications (tens of thousands of
objects, including indices) but small enough to not be a big deal
with today's generous memory systems.  However, if you want to change
it, you can set the user parameter :berkeley-db-cachesize in the
my-config.sexp configuration file.  Existing databases will not be
affected by changes to this parameter, only new ones.  (Advanced
users can delete the environment files of a database, while leaving
the database itself intact, to upgrade it to a new value.)
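
If you want a starting point for choosing a cache size, the same rough
model can be run in reverse.  This is only a sketch in Python: the
function name is hypothetical, the constants are the approximate figures
from this message (4k pages, about 100 small pairs per page, 3 pairs per
object plus one per persistent slot), and treating the result as a byte
count for :berkeley-db-cachesize is an assumption to check against the
manual.

```python
import math

# Rough inverse estimate: how much cache would hold the values and index
# entries for a given number of objects. Constants are approximations
# taken from the discussion, not measured values.

PAGE_SIZE = 4 * 1024      # 4k pages
PAIRS_PER_PAGE = 100      # approximate, for ~20-byte keys and values


def cache_bytes_for(num_objects, persistent_slots, fixed_pairs=3):
    """Approximate cache size in bytes needed for num_objects objects."""
    pairs = num_objects * (fixed_pairs + persistent_slots)
    pages = math.ceil(pairs / PAIRS_PER_PAGE)
    return pages * PAGE_SIZE


# 50,000 two-slot objects come out just under the 10MB default.
print(cache_bytes_for(50_000, 2))    # 10240000
```

Under these assumptions the 10MB default covers roughly 50,000 small
two-slot objects, which is consistent with the "tens of thousands"
figure; larger keys or values shrink that number accordingly.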

I also added a :map-using-degree2 option that takes a boolean
value and tells the Berkeley DB data store whether to allow
btrees that are being mapped to also be read by other transactions.
The implication is that two processes can concurrently read an index
and commit without one aborting the other because of conflicting read
locks when each one writes different key-value pairs.  This defaults
to true and is primarily a performance-enhancing option, but if you
require that only one write can happen to a btree at a time, you can
disable it as the default behavior.

If you want to specialize a given map operation's transaction, you  
can wrap it in a with-transaction that uses different parameters  
(:read-uncommitted, :degree-2, etc.).  The new manual documents these
options in Chapter 4, the BDB overview.

Cheers,
Ian


