[elephant-devel] Choice of back-end store
Alex Mizrahi
killerstorm at newmail.ru
Sat Jan 3 01:19:25 UTC 2009
> Could someone point me to any discussions WRT to choice of back-end?
> I'm looking at BDB and Postmodern. What are advantages/disadvantages of
> either one?
think of it like this: BDB is the baseline implementation, and Postmodern
is an option for special cases.
Postmodern is designed to work absolutely transparently with multiple
threads, processes and machines (and work absolutely transparently in
general).
we think it is good for web sites -- it should scale pretty well if needed.
while this is theoretically possible with BDB as well, as i understand
(i'm definitely not a BDB expert) it requires a substantial amount of
tweaking, or simply won't work in some cases..
the price we pay for flexibility is communication overhead -- each request
travels through process boundaries, and this takes time.
besides that, Elephant's semantics are defined in BDB terms, so the
Postmodern backend is a sort of emulation, and it has a substantial number
of limitations and weirdnesses. thus, if not sure, use BDB :)
the stores are more-or-less compatible, so you don't need to choose before
you start development -- you can start development with BDB, for example,
and then try out Postmodern if you think it might suit you better.
> I'm guessing that Postmodern offers an opportunity of using SQL querying
> against PostgreSQL backend for out-of-process querying and such
> (does Elephant model make this really possible/practical?),
it is possible, but not practical. might be useful for debugging and
stuff like that.
> whereas BDB is perhaps faster
it depends how you measure. if you do lots of queries on small tables, then
indeed BDB will be much faster. if you do a relatively small number of
queries on larger tables, then it depends..
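
a rough back-of-envelope model of the point above, as a sketch only: it
assumes every query pays a fixed round-trip cost plus its own work, and that
the work is the same in both stores (which it won't be in practice). all the
numbers and names here are made up for illustration, they are not
measurements of BDB or PostgreSQL.

```python
def total_seconds(queries, round_trip_s, work_per_query_s):
    """Total time if every query pays a fixed round trip plus its own work."""
    return queries * (round_trip_s + work_per_query_s)


# Assumed, illustrative costs -- not measurements of any real store.
IN_PROCESS_ROUND_TRIP_S = 0.0        # in-process call: no process boundary
OUT_OF_PROCESS_ROUND_TRIP_S = 1e-4   # assumed cost of crossing a process boundary

# Many tiny queries: the round trip dominates the total.
small_in = total_seconds(100_000, IN_PROCESS_ROUND_TRIP_S, 5e-6)
small_out = total_seconds(100_000, OUT_OF_PROCESS_ROUND_TRIP_S, 5e-6)

# A few heavy queries: the round trip is a tiny fraction of the total.
large_in = total_seconds(10, IN_PROCESS_ROUND_TRIP_S, 0.5)
large_out = total_seconds(10, OUT_OF_PROCESS_ROUND_TRIP_S, 0.5)

print(f"small queries: out-of-process / in-process = {small_out / small_in:.1f}")
print(f"large queries: out-of-process / in-process = {large_out / large_in:.4f}")
```

with these assumed numbers the overhead dominates the small-query case and
nearly disappears in the large-query case, which is why "faster" depends on
the workload you measure.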
> and you can build/use BDB replication tools.
yep, with BDB you can use BDB tools, with PostgreSQL you can use PostgreSQL
tools :)
what fits you better depends on the nature of your project and your
background.
another significant difference is the concurrency model -- in PostgreSQL it
is MVCC and it works almost transparently (except that conflicting
transactions get restarted), while BDB's model is lock based (by default)
and it works in quite a nasty way -- if you touch too many objects, you'll
run out of locks. so, for example, if you want to serialize the whole
database, with PostgreSQL it is trivial, but with BDB it is not. (you can
enable MVCC in BDB too, but i'm not sure it works as well as in PostgreSQL.)
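
to make the lock problem concrete, here is a toy sketch (plain Python, not
the BDB or Elephant API; the class and function names are hypothetical). it
assumes each object touched in a transaction costs one lock and that a
commit releases them all. one common way around a lock limit is to split a
big traversal into several smaller transactions, at the price of no longer
seeing one consistent snapshot of the whole database.

```python
class OutOfLocks(Exception):
    """Raised by the toy store when a transaction exceeds its lock budget."""


class ToyLockingStore:
    """Toy stand-in for a lock-based store; NOT the BDB or Elephant API."""

    def __init__(self, objects, max_locks):
        self.objects = list(objects)
        self.max_locks = max_locks
        self.held = 0  # locks held by the current transaction

    def read(self, index):
        # Assumption: every object touched in a transaction costs one lock.
        if self.held >= self.max_locks:
            raise OutOfLocks(f"lock budget of {self.max_locks} exhausted")
        self.held += 1
        return self.objects[index]

    def commit(self):
        # Committing releases every lock the transaction acquired.
        self.held = 0


def dump_in_one_transaction(store):
    """Read everything in a single transaction; fails on a large store."""
    out = [store.read(i) for i in range(len(store.objects))]
    store.commit()
    return out


def dump_in_batches(store, batch_size):
    """Read everything, committing after each batch to release locks.

    Trade-off: the batches are separate transactions, so the dump is not
    a single consistent snapshot if other writers are active.
    """
    out = []
    for start in range(0, len(store.objects), batch_size):
        stop = min(start + batch_size, len(store.objects))
        out.extend(store.read(i) for i in range(start, stop))
        store.commit()
    return out


store = ToyLockingStore(range(10), max_locks=4)
try:
    dump_in_one_transaction(store)
except OutOfLocks as err:
    print("single transaction failed:", err)
    store.commit()  # stand-in for aborting: release the locks held so far

print("batched dump:", dump_in_batches(store, batch_size=4))
```

the other option is to raise the lock limits in the BDB environment
configuration, if you know roughly how many objects you will touch.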