[elephant-devel] Lisp Btrees: design considerations
Alex Mizrahi
killerstorm at newmail.ru
Wed May 14 15:36:16 UTC 2008
IE> I was quite wrong-headed on the performance point. The biggest time
IE> cost in database operations is time spent moving data to/from disk, in
IE> comparison CPU time is quite small.
this depends on the size of the database -- relatively small ones can be
cached in RAM, and with asynchronous writes the disk won't be a bottleneck.
also note that RAM gets larger and cheaper each year, so today's "relatively
small" databases, on the scale of several gigabytes, were considered big
some years ago.
also, databases are often optimized to minimize HDD seek times. but we now
have quite affordable solid-state drives, which have no mechanical seek
overhead and are pretty fast, so such optimizations make little sense on
them.
also, Elephant's workload is likely to be very different from that of a
relational database -- with an RDBMS people often try to do as much work as
possible with a single optimized query, while Elephant users tend to do lots
of small queries -- because it's, um, easy. also, Elephant lacks the
optimization tricks that an RDBMS can do, so it has to rely on raw
processing power.
so i believe that Elephant will be CPU-bound for most uses and users, and
any complications like compression will make it slower.
people who use Elephant for storage and rapid retrieval of terabytes of
text will find compression a cool feature, but i seriously doubt that there
are many people who consider Elephant a tool for that kind of thing.
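
to make the compression point concrete for small values, here is a minimal
sketch in Python (not Elephant code; the helper name `sizes` is made up for
illustration). it packs one 64-bit integer and runs it through zlib, which
stands in for whatever compressor might be considered. a general-purpose
compressor has fixed framing overhead, so a value this small comes out
larger than it went in, and the CPU time spent on it buys nothing.

```python
import struct
import zlib


def sizes(value):
    """Return (raw_size, compressed_size) in bytes for one 64-bit integer.

    Hypothetical helper for illustration only; zlib stands in for any
    general-purpose compressor.
    """
    raw = struct.pack("<q", value)   # 8 bytes, little-endian signed 64-bit
    packed = zlib.compress(raw)      # adds a stream header and a checksum
    return len(raw), len(packed)


raw_len, packed_len = sizes(42)
print(raw_len, packed_len)           # the compressed form is the larger one
```

this only shows the size side; the extra CPU work per value would have to be
measured separately on a real workload.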
if you want to optimize something, better make sure that small objects, like
numbers, object pointers, small strings etc., can be stored and retrieved
with as little overhead as possible -- i think most values in a database are
of that kind.
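
one way to keep that overhead low is a one-byte type tag followed by a fixed
or length-prefixed payload. the sketch below is in Python and is only an
assumption about what such a format could look like; the tag values and the
`encode`/`decode` names are hypothetical and are not Elephant's actual
serializer.

```python
import struct

# Hypothetical one-byte type tags, chosen only for this illustration.
TAG_INT64 = 0x01
TAG_SHORT_STRING = 0x02


def encode(value):
    """Encode a small value as a tag byte plus a compact payload."""
    if isinstance(value, int):
        # 1 tag byte + 8 payload bytes; raises struct.error if out of range.
        return bytes([TAG_INT64]) + struct.pack("<q", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) > 255:
            raise ValueError("short-string encoding holds at most 255 bytes")
        # 1 tag byte + 1 length byte + the UTF-8 bytes themselves.
        return bytes([TAG_SHORT_STRING, len(data)]) + data
    raise TypeError("unsupported type: %r" % type(value))


def decode(buf):
    """Decode a value produced by encode()."""
    tag = buf[0]
    if tag == TAG_INT64:
        return struct.unpack_from("<q", buf, 1)[0]
    if tag == TAG_SHORT_STRING:
        length = buf[1]
        return buf[2:2 + length].decode("utf-8")
    raise ValueError("unknown tag: %r" % tag)
```

with this layout an integer costs 9 bytes and a three-character ASCII string
costs 5, and decoding is a single branch on the tag with no decompression
step.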