[elephant-devel] Why does an empty BDB repository take 40 MB?
Alex Mizrahi
killerstorm at newmail.ru
Thu Apr 22 10:20:01 UTC 2010
??>> SLSIA. I create a brand new BDB-backed elephant repository
??>> and it takes up 40MB of disk space. Why?
LPP> Elephant creates some btrees as part of repository initialization.
LPP> What you're seeing is probably a combination of BDB log files (try to
LPP> invoke db_archive with the -d switch[1]) and preallocated disk space
LPP> (to avoid excessive fragmentation when the tree is filled).
You're right, but the files it pre-allocates are sparse, which means it's much
less of a problem w.r.t. disk space.
Here are the results of my investigation (as seen in comp.lang.lisp):
----
BerkeleyDB creates large files for its work:
alex@debetch:~/foobla$ ls -l
-rw-r----- 1 alex alex 24576 2010-04-21 01:37 __db.001
-rw-r----- 1 alex alex 1327104 2010-04-21 01:37 __db.002
-rw-r----- 1 alex alex 26222592 2010-04-21 01:37 __db.003
-rw-r----- 1 alex alex 98304 2010-04-21 01:37 __db.004
-rw-r----- 1 alex alex 557056 2010-04-21 01:37 __db.005
-rw-r----- 1 alex alex 253952 2010-04-21 01:37 __db.006
-rw-r----- 1 alex alex 40960 2010-04-21 01:33 %ELEPHANT
-rw-r----- 1 alex alex 16384 2010-04-21 01:32 %ELEPHANTDUP
-rw-r----- 1 alex alex 16384 2010-04-21 01:33 %ELEPHANTOID
-rw-r----- 1 alex alex 10485760 2010-04-21 01:33 log.0000000001
But those files are sparse; they do not consume disk space until they are
populated:
alex@debetch:~/foobla$ du -h
2.0M .
alex@debetch:~/foobla$ du -h *
12K __db.001
1.1M __db.002
296K __db.003
24K __db.004
364K __db.005
16K __db.006
40K %ELEPHANT
16K %ELEPHANTDUP
16K %ELEPHANTOID
104K log.0000000001
alex@debetch:~$ tar czf foobla.tgz foobla
alex@debetch:~$ ls -l foobla.tgz
-rw-r--r-- 1 alex alex 86029 2010-04-21 01:40 foobla.tgz
Well, that is if you use a filesystem which supports sparse files.
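
For reference, the ls -l sizes above add up to 39,043,072 bytes, which is roughly the 40 MB in the question, while du reports about 2.0M actually allocated. The same comparison can be scripted. This is only a minimal sketch: the function names are made up for illustration, and it assumes a POSIX system where st_blocks is available (it is not on Windows).

```python
import os


def disk_usage(path):
    """Return (apparent_bytes, allocated_bytes) for a single file."""
    st = os.stat(path)
    # st_size is what "ls -l" shows; st_blocks is counted in 512-byte
    # units on POSIX systems and reflects what "du" reports.
    return st.st_size, st.st_blocks * 512


def report(directory):
    """List (name, apparent_bytes, allocated_bytes) for regular files."""
    rows = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            apparent, allocated = disk_usage(full)
            rows.append((name, apparent, allocated))
    return rows
```

Calling report("foobla") on the repository directory should show a large apparent size and a small allocated size for files such as __db.003, provided the filesystem stores them sparsely.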
If you don't like this anyway, you can configure BDB to allocate smaller
files.
File __db.003 seems to be related to the cache size; the default cache size in
config.sexp is 20MB.
If you set it to 256KiB (the default for BDB) in my-config.sexp:
(:BERKELEY-DB-CACHESIZE . 262144)
you won't have that large file:
-rw-r----- 1 alex alex 24576 2010-04-21 01:54 __db.001
-rw-r----- 1 alex alex 385024 2010-04-21 01:54 __db.002
-rw-r----- 1 alex alex 335872 2010-04-21 01:54 __db.003
-rw-r----- 1 alex alex 98304 2010-04-21 01:54 __db.004
-rw-r----- 1 alex alex 557056 2010-04-21 01:54 __db.005
-rw-r----- 1 alex alex 253952 2010-04-21 01:54 __db.006
-rw-r----- 1 alex alex 40960 2010-04-21 01:54 %ELEPHANT
-rw-r----- 1 alex alex 16384 2010-04-21 01:54 %ELEPHANTDUP
-rw-r----- 1 alex alex 16384 2010-04-21 01:54 %ELEPHANTOID
-rw-r----- 1 alex alex 10485760 2010-04-21 01:54 log.0000000001
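
The two __db.003 sizes are consistent with the cache explanation. Berkeley DB documents that a cache smaller than 500MB is automatically enlarged by 25% for overhead, and a quick check shows both listings fit that rule with the same small remainder. This is a sketch under assumptions: that the "20MB" in config.sexp means 20 MiB (20971520 bytes), and that __db.003 really is the cache region; the helper name is made up, and the remainder is only observed here, not explained.

```python
def padded_cache_bytes(requested):
    """Cache size after BDB's documented 25% padding for caches < 500MB."""
    if requested < 500 * 1024**2:
        return requested + requested // 4
    return requested


# requested cache size -> size of __db.003 from the two listings
observed = {
    20 * 1024**2: 26222592,  # assumed default from config.sexp (20 MiB)
    256 * 1024: 335872,      # value set in my-config.sexp
}

for requested, file_size in observed.items():
    padded = padded_cache_bytes(requested)
    # Columns: requested, padded, bytes left over in the file
    print(requested, padded, file_size - padded)
# prints:
# 20971520 26214400 8192
# 262144 327680 8192
```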
There is also a 10 MiB log file, which is the default size for BDB. There is no
good way to tweak it in Elephant, but there is a BDB API for this, so it's
possible to implement it, if you think it is really needed.
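
The BDB call in question is DB_ENV->set_lg_max. Berkeley DB also reads a DB_CONFIG file from the environment directory when the environment is opened, and set_lg_max is one of the settings it accepts there, so a possible workaround that needs no Elephant changes is to drop such a file into the repository directory before it is first opened. The sketch below only writes that file; the helper name and the 1 MiB value are illustrative, and whether this interacts well with Elephant's own settings has not been tested here.

```python
import os


def write_db_config(env_dir, lg_max_bytes):
    """Write a DB_CONFIG file that asks BDB for a smaller maximum log size.

    Hypothetical helper: env_dir is the repository (BDB environment)
    directory. Each DB_CONFIG line is "name value", mirroring the
    DB_ENV->set_lg_max call.
    """
    path = os.path.join(env_dir, "DB_CONFIG")
    with open(path, "w") as f:
        f.write(f"set_lg_max {lg_max_bytes}\n")
    return path
```

For example, write_db_config("/path/to/repo", 1024 * 1024) requests 1 MiB log files. It is probably safest to do this for a freshly created repository, since log files that already exist keep their size.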
----