[elephant-devel] db-postmodern: performance considerations

Alex Mizrahi killerstorm at newmail.ru
Sat Dec 8 11:08:32 UTC 2007


hello

unfortunately the rewrite of the pm-cursor implementation is postponed for 
now; it turned out to be more complex than it originally seemed, and i do 
not have test cases for it anyway..
however, i've made a patch that improves the performance of 
get-instances-by-value (i've sent it to Henrik, so it should be available 
when he synchronizes with upstream).

so i'll describe the profiling results here -- what's fast and what's not 
with the postmodern backend.

1. individual btree operations -- setting and getting a value -- are 
translated directly into SQL queries, so they are quite fast (as fast as a 
query against an indexed SQL table can be). however, there's considerable 
client/server communication overhead (system calls, TCP/IP stack..), so 
making tons of individual sets/gets is not very fast.
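
for illustration, here's roughly what such individual operations look like 
on the Elephant side (a minimal sketch using the standard btree API; with 
this backend each get-value / setf below costs one SQL round-trip):

  ;; each of these calls is one client/server round-trip with
  ;; db-postmodern, so large loops of them add up.
  (let ((bt (elephant:make-btree)))
    (elephant:with-transaction ()
      (setf (elephant:get-value "key-1" bt) "value-1")   ; one SQL write
      (setf (elephant:get-value "key-2" bt) "value-2"))
    (elephant:get-value "key-1" bt))                     ; one SQL read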

however, it's possible to cache get operations on the client with the 
postmodern backend.
two caching options are available: per-transaction-cache caches only 
within a single transaction, while global-sync-cache caches data across 
transactions and keeps it synchronized with other working instances -- it 
tracks changes and invalidates cache entries accordingly. the 
synchronization brings some overhead, so it's only a win for certain types 
of workload (mostly-read).
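
if i recall the pm-cache code correctly, the cache is selected via a 
special variable; something along these lines (the exact variable name and 
accepted values should be checked against pm-cache -- treat this as an 
assumption, not gospel):

  ;; sketch -- verify the variable name against pm-cache.lisp
  (setf db-postmodern::*cache-mode* :per-transaction-cache)
  ;; or, for mostly-read workloads spanning transactions:
  (setf db-postmodern::*cache-mode* :global-sync-cache)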

it would be possible to implement a global cache for single-instance mode 
that does no synchronization at all, but this mode is not implemented, 
because db-postmodern focuses on safety of use with multiple instances 
working with the database simultaneously.

get-instance-by-value, after the patches suggested by Alain, uses btree 
get-value, and thus is cached too.
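
for example (a minimal sketch, assuming a persistent class with an indexed 
slot; this is the standard Elephant API):

  (defclass user ()
    ((login :accessor user-login :initarg :login
            :index t))            ; indexed slot
    (:metaclass elephant:persistent-metaclass))

  ;; goes through btree get-value, so repeated lookups can be
  ;; served from the client-side cache:
  (elephant:get-instance-by-value 'user 'login "alice")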

there's a gotcha when using caches with large data sets -- cache entries 
don't get garbage collected, so it's possible to run out of memory. it's 
possible to use a weak hash table in this case (see make-backend-cache in 
pm-cache; there is no configuration option for it), but i don't know how 
well caching works with it. a patch for some smarter solution is welcome.
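
weak hash tables are implementation-specific; as a sketch, on SBCL a 
value-weak table could be created like this (make-backend-cache would need 
a change along these lines -- illustrative only):

  ;; SBCL-specific sketch: entries whose values are only reachable
  ;; through the cache become eligible for garbage collection.
  ;; other lisps spell this differently (or lack it entirely).
  (defun make-backend-cache ()
    (make-hash-table :test 'equal
                     #+sbcl :weakness #+sbcl :value))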

2. get-instances-by-value, with my specialized map-index implementation, 
uses an SQL query to retrieve the data directly, so it should be pretty 
fast. however, it doesn't use caching.

it doesn't use a cursor, so all matching instances are always returned; 
that may be a problem if there are a lot of instances for some key.
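
usage-wise it's the usual Elephant call; just keep in mind that with this 
backend the whole result set comes back at once (continuing the 
hypothetical user class from above):

  ;; one direct SQL query, no client-side caching, and *all*
  ;; matching instances are materialized at once:
  (elephant:get-instances-by-value 'user 'login "alice")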

3. cursors: in general, they do not work very well for now.

first of all, they do not always return values in the order specified by 
the "lisp sorter". moreover, if your keys are not all of the same type 
(either all integers or all strings), you get an essentially random order. 
(with integers and strings you get an order according to SQL comparison 
rules.) we are not going to fix this issue (besides, maybe, allowing NILs 
to be safely mixed with integers or strings), since we believe it's more 
important to support good performance in the practical cases (all keys of 
the same type). we could implement some emulation mode that retrieves all 
the data and sorts it on the lisp side, but it's probably better to use 
the CLSQL backend if "lisp sort" ordering is a requirement of the 
application.
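
(that emulation is easy enough to do by hand when needed; a minimal sketch 
using the standard map-btree, collecting pairs and sorting with whatever 
lisp predicate you want:)

  ;; retrieve everything, then impose the desired order on the
  ;; lisp side -- O(n log n), and it reads the whole table.
  (defun btree-pairs-sorted (bt predicate &key (key #'car))
    (let ((pairs '()))
      (elephant:map-btree (lambda (k v) (push (cons k v) pairs)) bt)
      (sort pairs predicate :key key)))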

then, probably the only thing cursors are currently good at is iterating 
over the whole sequence from start to end.
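
i.e. the usual pattern (standard Elephant cursor API):

  ;; forward iteration over the whole btree -- the one cursor
  ;; pattern that behaves reasonably with this backend.
  (elephant:with-btree-cursor (cur bt)
    (multiple-value-bind (found key value) (elephant:cursor-first cur)
      (loop while found
            do (format t "~a -> ~a~%" key value)
               (multiple-value-setq (found key value)
                 (elephant:cursor-next cur)))))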

iterating over a limited set of values from the start *almost* works fine. 
to do it efficiently on a large table you need to configure PostgreSQL -- 
disable hashjoin and mergejoin, otherwise the planner thinks it's faster 
to process the whole table than to do it incrementally. processing a table 
with 10000 items costs 60-70 milliseconds. (additionally, i suspect a 
small patch to db-postmodern is required to build the correct indices for 
cursors to work incrementally.)
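
disabling those planner methods is standard PostgreSQL configuration; e.g. 
per session through postmodern (or the equivalent enable_hashjoin / 
enable_mergejoin lines in postgresql.conf):

  ;; discourage the planner from hash/merge joins so it picks
  ;; incremental nested-loop plans for cursor iteration:
  (postmodern:execute "SET enable_hashjoin = off")
  (postmodern:execute "SET enable_mergejoin = off")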

iterating from the end doesn't work well -- it scans through the whole 
table (probably even twice) due to a bug in the cursor implementation. 
this should be more-or-less easily fixable, though.

cursor-set performance depends on how far the key you search for is from 
the start of the table (!). the counting happens on the postgresql side, 
though, so for a 10000-item table it should take about 70 msecs in the 
worst case. multiple cursor-set calls will all be equally slow.

i have some ideas on how to fix the cursor implementation so it does not 
depend on the size of the table (to the extent PostgreSQL itself does not, 
of course), but i don't know when i'll have a chance to implement them. as 
i've mentioned, i currently do not even have a practical test case where 
cursor iteration is used.

4. get-instances-by-range uses cursors, and so it inherits all of their 
problems.
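
so e.g. a range query like this (again the hypothetical user class, now 
with an indexed numeric slot assumed) goes through the cursor machinery 
and pays the costs described above:

  ;; assuming an indexed numeric slot 'age on the user class above;
  ;; internally this walks an index cursor over [18, 30].
  (elephant:get-instances-by-range 'user 'age 18 30)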

5. psets: the default implementation of psets makes a btree instance for 
each of them, and in db-postmodern each btree has its own SQL table, so 
it's not a good idea to have thousands of psets. they might be quite 
useful to have in big amounts, though, so we'll probably invent something 
better for them in the future.
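
in other words, every one of these creates a fresh table with this backend 
(standard Elephant pset API):

  ;; each make-pset here is backed by its own btree, hence its own
  ;; SQL table in db-postmodern -- cheap elsewhere, expensive when
  ;; you create thousands of them.
  (let ((friends (elephant:make-pset)))
    (elephant:insert-item "alice" friends)
    (elephant:insert-item "bob" friends)
    (elephant:pset-list friends))   ; => ("bob" "alice"), in some order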

with best regards, Alex 'killer_storm' Mizrahi. 
