[elephant-devel] Writing manuals
Ian Eslick
eslick at csail.mit.edu
Tue Apr 24 21:01:06 UTC 2007
Hello Elephants,
As I mentioned to Robert the other day, writing a manual is a drag
but it does force you to visit all the dark corners of your system
and make sure you really understand what is going on. While writing
the Berkeley DB internals section today I discovered an interesting
fact.
The default Berkeley DB memory cache size is 256 KB. That is 64
4 KB pages, and each page holds about 100 key-value pairs when keys
and values are small (20 bytes). That is a cache big enough for about
6,400 key-value pairs. If you regularly walk over a full index that
is larger than that, you will certainly thrash your cache. So if you
have many thousands of objects with several indices, your working
set could easily exceed this value.
For example, if you had an indexed class with two slots and two
indices on those slots, each object instance would eat up 3 small
key-value pairs plus one for each persistent slot, 5 in total. So
you could cache the values and indexes for a little over 1k objects
(6,400 / 5 = 1,280).
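The arithmetic above can be checked with a short Python sketch. The
constants are the rough figures quoted in this message, not values
measured from Berkeley DB, and the function names are made up for
illustration.

```python
# Rough cache-capacity estimate using the approximate figures from this
# message. These are ballpark numbers, not Berkeley DB internals.

PAGE_SIZE = 4 * 1024     # 4k pages
PAIRS_PER_PAGE = 100     # ~100 small (20-byte) key-value pairs per page


def pairs_in_cache(cache_bytes):
    """Approximate number of small key-value pairs that fit in the cache."""
    return (cache_bytes // PAGE_SIZE) * PAIRS_PER_PAGE


def objects_in_cache(cache_bytes, index_pairs, persistent_slots):
    """Approximate number of objects whose values and index entries fit.

    Each instance is assumed to cost `index_pairs` key-value pairs plus
    one pair per persistent slot, as described in the message.
    """
    pairs_per_object = index_pairs + persistent_slots
    return pairs_in_cache(cache_bytes) // pairs_per_object


old_default = 256 * 1024  # the 256k default discussed above

print(pairs_in_cache(old_default))          # 6400
print(objects_in_cache(old_default, 3, 2))  # 1280
```

With 3 index pairs and 2 persistent slots per instance, the 256k
cache covers about 1,280 objects, which matches the "little over 1k"
estimate.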
I've added a user parameter that allows you to tune this value. I
also set the default value to 10MB for each instance of a data store
(most applications use 1 or a very small number). This is more than
large enough for most casual applications (tens of thousands of
objects, including indices) but small enough to not be a big deal
with today's generous memory systems. However, if you want to change
it, you can use the user parameter :berkeley-db-cachesize in the
my-config.sexp configuration file. Existing databases will not be
affected by changes in this parameter, only new ones. (Advanced
users can delete the environment files of the database, but leave the
database intact, to upgrade to a new value.)
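To get a feel for whether the 10MB default fits a given application,
the same rough figures can be run in the other direction. This is
only a sketch: the page size and pairs-per-page constants are the
approximations from this message, the helper name is hypothetical,
and 10MB is assumed to mean 10 * 1024 * 1024 bytes. The unit that
:berkeley-db-cachesize expects is not stated here, so bytes are used
purely for illustration.

```python
# Estimate the cache size needed to hold a target number of objects,
# using the approximate figures quoted in this message.

PAGE_SIZE = 4 * 1024     # 4k pages
PAIRS_PER_PAGE = 100     # ~100 small key-value pairs per page


def cache_bytes_for(num_objects, pairs_per_object):
    """Approximate bytes of cache needed for all pairs of all objects."""
    total_pairs = num_objects * pairs_per_object
    # Round up to whole pages (ceiling division).
    pages = (total_pairs + PAIRS_PER_PAGE - 1) // PAIRS_PER_PAGE
    return pages * PAGE_SIZE


new_default = 10 * 1024 * 1024  # assumed meaning of "10MB"

# 50,000 objects at 5 pairs each (3 index pairs + 2 persistent slots)
needed = cache_bytes_for(50_000, 5)
print(needed)                 # 10240000
print(needed <= new_default)  # True

# 200,000 such objects would need roughly four times the default
print(cache_bytes_for(200_000, 5))  # 40960000
```

By this estimate, the new default holds about 50,000 small two-slot
indexed objects, in line with the "tens of thousands" figure above.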
I also added a :map-using-degree2 option that takes a boolean
parameter and tells the Berkeley DB data store whether to allow
btrees that are being mapped to also be read by other transactions.
The implication is that two processes can concurrently read an index
and commit without one aborting the other because of conflicting read
locks when each one writes different key-value pairs. This defaults
to true and is primarily a performance-enhancing option, but if you
require that only one write can happen to a btree at a time, you can
disable it.
If you want to specialize a given map operation's transaction, you
can wrap it in a with-transaction that uses different parameters
(:read-uncommited, :degree-2, etc). The new manual documents these
options in Chapter 4, the BDB overview.
Cheers,
Ian