[elephant-devel] Lisp Btrees: design considerations

Alex Mizrahi killerstorm at newmail.ru
Fri May 16 14:14:18 UTC 2008


 ??>> or it will matter if people are doing lots of linear reads -- but it's
 ??>> hard for me to imagine such database access patterns.

 RLR> I think linear reads are quite common.  They are, after all, why we
 RLR> have implemented range queries.  Moreover, the act of reading all the
 RLR> data into memory is quite common if you are using a "Object
 RLR> Prevalance" model.

exactly. if we are speaking of a server, it reads all the data into memory
and then works for monthes for it, so I/O performance is near to irrelevant 
for it.

 RLR> find the best one.  In many cases, this is indeed a linear scan;
 RLR>  for example many join operations can best be performed by performing
 RLR> a linear scan of at least one of the tables in question. The aggregate
 RLR> functions (sum, min, max, avg, count) are another example.  Another
 RLR> typical example is "access all page views of my website from Thursday
 RLR> to Friday".

if we've agreed that I/O performance matters only for large databases,
linear scans will take very significant amount of time in any case.
it could be two hours without optimization and a hour with, but for me 
that's not that large difference:
if one can wait a hour to get results, can't he wait two hours then?

 RLR> more objects within an enterprise to avoid I/O costs.  Perhaps you can
 RLR> post a description of your database usage pattern, to explain why it
 RLR> is dominated by point queries.

simply because it's interactive -- reading full table just to do a join and 
get
few values out of there is simply non-acceptable for interactive 
applications,
because as table grows query will take prohibitive amount of time.
to avoid full scans i've added derived indices that can be used to pick only 
stuff
that is displayed, and utility function that allows iterating across 
multiple indices at once.

 ??>> bits, huh? what storage nowadays works on bit level?

 RLR> You are quite correct that most usage of storage is done at the byte
 RLR> and word level---I am arguing that this is an opportunity for us.  Of
 RLR> course, one must understand that the page will be read and written as
 RLR> an 8K chunk of bytes, which will contain carefully packed bit sets
 RLR> that do not respect byte boundaries.

doesn't this add node addressing problems? how do you point on a specific
node from other pages with such approach?

 RLR> file system that has a cache, and I'm not neglecting the relatively
 RLR> long time it takes for a CPU to retrieve a byte from memory which is
 RLR> not already in the CPU cache.

if we were using carefully optimized assembly language, then probably indeed 
memory latency/bandwidth would be a limiting factor
(however, if gigabytes per second bandwidth is not enough, most likely that 
means that algorithms -- like using each-to-each comparisons -- suck).

but if we are using high-level language like Lisp, it's much more likely 
that CPU overhead will be large comparing to memory latency.
and as compression, even simpiest/fastest one will only add to CPU overhead, 
i'm very skeptical about it's benefits.

by the way, interesting fact: some time ago i've found that even with cache 
enabled in elehant/postmodern
slots reads are relatively slow -- on 10 thousands per second scale or so. 
it was quite surprising
that abusing function is form-slot-key (that is used in CL-SQL too):

(defun form-slot-key (oid name)
  (format nil "~A ~A" oid name))

and even more surprising was to find out that there is no portable and fast 
way to convert interger to string.
so i had to use SBCL's internal function called 
sb-impl::quick-integer-to-string -- apparently SBCL
developers knew about this problem so they made this function for 
themselves.

so i think that subtle effects like I/O bandwidth will be only seen in we'll 
hardcore optimize elephant and applications themselves.
but as it is now, storage backend won't be of a big influence of overal 
performance and just needs to be moderately good





More information about the elephant-devel mailing list