[elephant-devel] Re: Postmodern, Act II
Alex Mizrahi
killerstorm at newmail.ru
Sat May 3 11:31:28 UTC 2008
??>> but if you do not start your transactions explicitly, enclosing as
??>> many operations as possible, global-sync-cache absolutely makes no
??>> sense -- it takes more effort to synchronize changes than to actually
??>> load a value from the database, if that's just a single value. so, maybe, if
??>> cache is set into global sync mode, it should signal error if there is
??>> no explicit transactions -- because that would be misuse of
??>> global sync cache, leading to significant overhead.
LPP> Can you explain this in a bit more detail?
"global sync cache" works by tracking changes made to btrees in the
database -- each write to a btree is also written into the update_log table.
then, at the start of each transaction (or more precisely, before the first
actual btree read/write operation) the cache gets synchronized -- basically it
pulls the log of all changes since the last update, and invalidates cache
entries according to what it has read from the DB. additionally it does some
bookkeeping for change tracking.
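
to make the mechanism above concrete, here is a minimal sketch in python. it is a toy simulation, not the actual elephant/postmodern code: the class names (Database, SyncCache), the in-memory update_log list and the roundtrip counter are all assumptions made up for illustration, and the extra bookkeeping for change tracking is left out.

```python
class Database:
    """Toy stand-in for the SQL backend: key/value data plus an update log."""

    def __init__(self):
        self.data = {}
        self.update_log = []   # append-only list of keys that were written
        self.roundtrips = 0    # counts simulated client/server roundtrips

    def write(self, key, value):
        self.roundtrips += 1
        self.data[key] = value
        self.update_log.append(key)   # every write is also logged

    def read(self, key):
        self.roundtrips += 1
        return self.data.get(key)

    def changes_since(self, position):
        """Return keys logged after `position` and the new log position."""
        self.roundtrips += 1
        return self.update_log[position:], len(self.update_log)


class SyncCache:
    """Hypothetical per-process cache that syncs once per transaction."""

    def __init__(self, db):
        self.db = db
        self.entries = {}
        self.log_position = 0
        self.synced = False

    def begin(self):
        # Sync is lazy: it happens before the first read/write, not here.
        self.synced = False

    def _sync(self):
        if self.synced:
            return
        changed, self.log_position = self.db.changes_since(self.log_position)
        for key in changed:
            # Invalidate anything that was written since the last sync.
            # (This toy version also drops entries for its own writes.)
            self.entries.pop(key, None)
        self.synced = True

    def read(self, key):
        self._sync()
        if key not in self.entries:
            self.entries[key] = self.db.read(key)
        return self.entries[key]

    def write(self, key, value):
        self._sync()
        self.db.write(key, value)
        self.entries[key] = value
```

in this sketch every transaction pays one extra roundtrip for the sync, no matter how many reads follow it, which is why the cost only pays off when a transaction does many reads.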
thus, global sync cache only makes sense if you do many (hundreds) database
reads in each transaction. if you don't have such a situation, don't use it :)
??>> or you think it makes sense to allow such behaviour? it might make
??>> sense in REPL, for example..
LPP> I put transactions only in an explicit transaction block if it
LPP> makes sense to me, i.e. if there are several successive operations.
LPP> Why would I put a single operation into a WITH-TRANSACTION block?
LPP> It clutters the code.
this cache mode (and the postmodern backend in general) is oriented toward
webserver-like workloads -- each web request is always wrapped in a
transaction. if a request does no DB activity, that's OK -- the overhead of
starting a txn is not that significant on the scale of typical HTTP request
time. but many requests read lots of values from the database -- on the scale
of thousands -- and the sync cache makes a big difference in this case.
even without the cache, there is considerable overhead when doing a single read
outside a transaction -- at minimum, postmodern will do BEGIN and COMMIT,
which require roundtrips to the server, so we have something like 3x overhead
here.
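
as a rough back-of-the-envelope model of these numbers, the sketch below counts roundtrips per transaction. it assumes (this is not measured) that BEGIN, COMMIT, the cache sync and each uncached read cost exactly one roundtrip each; the function name and parameters are hypothetical.

```python
def roundtrips(reads, sync_cache=False, cache_hits=0):
    """Estimate server roundtrips for one transaction.

    Assumes one roundtrip each for BEGIN, COMMIT, the optional cache
    sync, and every read that is not served from the cache.
    """
    begin_commit = 2
    sync = 1 if sync_cache else 0
    return begin_commit + sync + (reads - cache_hits)


single_plain = roundtrips(1)                     # BEGIN + read + COMMIT
single_synced = roundtrips(1, sync_cache=True)   # sync adds pure overhead
bulk_plain = roundtrips(1000)
bulk_synced = roundtrips(1000, sync_cache=True, cache_hits=900)
```

under these assumptions a single read costs 3 roundtrips without the cache and 4 with it, while a 1000-read transaction with 900 cache hits drops from 1002 roundtrips to 103.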
if we were optimizing for standalone read statements, we could try relying
on postgresql implicit transactions -- but that would significantly
complicate the logic, so we don't use this.
but while BEGIN/COMMIT is an inevitable evil, cache synchronization overhead
can be avoided if not needed, so i thought it's worth giving some kind of
warning in case people are using the backend in a sub-optimal mode.