[elephant-devel] Re: Updated version of last Postmodern patch bundle
Alex Mizrahi
killerstorm at newmail.ru
Fri Mar 21 12:49:35 UTC 2008
i have several notes about concurrent tests:
1. threaded-idx-access and provoke-deadlock are essentially the same -- both
change slot values of zork instances.
there are some differences: threaded-idx-access runs multiple test
batches, provoke-deadlock works in one batch.
provoke-deadlock runs an excessive number of threads -- 30, instead of 5 in
threaded-idx-access.
provoke-deadlock concentrates on first object, while threaded-idx-access
works on all of them.
i do not think any of these differences are essential. having multiple
slightly different tests in the hope that one of them will spot an error is
like "voodoo programming". i think we should delete provoke-deadlock,
or rewrite it so that it is really different (the classic way to
provoke a deadlock is to do criss-cross slot updates).
2. provoke-deadlock does not wait until its threads have finished before it
wipes the classes and closes the controller. obviously, this causes weird
errors. is this intentional behaviour?
3. threaded-idx-access joins all threads it finds. i think that's not OK
because there might be threads spawned by SLIME etc.
it's better to collect the threads as we create them and join only those,
i.e.
    ;; join only the threads this test created
    (mapcar #'sb-thread:join-thread
            (loop for i from 1 to 5
                  collect (let ((i i))
                            (bt:make-thread (lambda () ...)))))
also note the (let ((i i)) ...) -- it is essential, otherwise all closures
may refer to the same i binding.
4. join-thread seems to be SBCL-specific. can't we find some thread-synch
primitive that will work on all implementations? (i haven't yet looked into
this).
5. threaded-idx-access is not really an automatic test -- if something gets
broken, it lands in the debugger.
i think there should be something like handler-case inside each thread, and
if an error happens inside a thread, it should be reported to the main
thread, which will re-throw it, or something similar.
also, it's worth checking that the operations were correct -- classes are
accessible via index, slot values are correct etc.
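
notes 3 to 5 could be combined into one helper. the following is only a
sketch under assumptions, not code from the test suite: run-in-threads is a
hypothetical name, and it assumes that bordeaux-threads is already loaded and
that the installed version exports join-thread, make-lock and
with-lock-held. each worker gets its own binding of i, errors are caught
inside the thread and collected under a lock, and only the threads created
here are joined.

```lisp
;; Assumption: bordeaux-threads is loaded, e.g. (ql:quickload :bordeaux-threads).
;; RUN-IN-THREADS is a hypothetical helper, not part of Elephant.
(defun run-in-threads (n fn)
  "Call FN with 1..N, each call in its own thread.
Joins only the threads created here and returns a list of
(index . condition) pairs for the calls that signalled an error."
  (let* ((failures '())
         (lock (bt:make-lock "failures"))
         (threads
           (loop for i from 1 to n
                 collect (let ((i i))   ; fresh binding for each closure
                           (bt:make-thread
                            (lambda ()
                              (handler-case (funcall fn i)
                                (error (c)
                                  ;; record the error instead of entering
                                  ;; the debugger inside the worker
                                  (bt:with-lock-held (lock)
                                    (push (cons i c) failures)))))
                            :name (format nil "worker-~D" i))))))
    (mapc #'bt:join-thread threads)
    (bt:with-lock-held (lock)
      failures)))
```

the main thread can then inspect the returned list and signal one error for
the whole test if it is not empty, e.g.
(run-in-threads 5 (lambda (i) (when (= i 3) (error "worker ~D failed" i))))
returns a one-element list whose entry has index 3.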
now, about test runs. indeed, it yields lots of _different_ deadlocks in
isolation mode "read committed". the most bizarre one i've seen was the
result of an interaction of four(!) threads, however there were ones with a
mere two participants.
OTOH results of running this with serializable isolation mode are quite
consistent:
WARNING:
Error while executing prepared statement "TREE12DELETE-BOTH" (params: (0
433)).
retrying txn due to:Database error 40001: could not serialize access due to
concurrent update
conclusions are clear to me: use serializable isolation mode FTW.
it's not worth trying to fix "read committed" mode, because there are no
logical grounds for it to work fine when data can be changed at any time and
is not consistent within a single transaction.
i think it's better to completely ban "read committed" mode, but i can make
this configurable for people with masochistic intentions :).
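
the warning above shows that a serialization failure (error 40001) leads to
a retry of the transaction. the pattern can be sketched as follows. this is
an illustration only, not db-postmodern's code: the condition
serialization-failure and the function call-with-retries are hypothetical
names made up for the example, and no database is involved.

```lisp
;; Hypothetical condition standing in for a "40001" database error.
(define-condition serialization-failure (error)
  ((sqlstate :initarg :sqlstate :reader failure-sqlstate))
  (:report (lambda (c stream)
             (format stream "Database error ~A: could not serialize access"
                     (failure-sqlstate c)))))

(defun call-with-retries (thunk &key (max-attempts 5))
  "Call THUNK, retrying when it signals SERIALIZATION-FAILURE.
Returns THUNK's values; re-signals the failure after MAX-ATTEMPTS tries."
  (loop for attempt from 1
        do (handler-case (return (funcall thunk))
             (serialization-failure (c)
               (when (>= attempt max-attempts)
                 (error c))             ; give up: pass the failure on
               (warn "retrying txn due to: ~A" c)))))

;; Usage: the thunk fails twice, then succeeds on the third attempt.
(let ((calls 0))
  (call-with-retries
   (lambda ()
     (incf calls)
     (if (< calls 3)
         (error 'serialization-failure :sqlstate "40001")
         calls))))
```

the point of the design is that a conflicting transaction is re-run as a
whole, so its body never continues with data that changed under it.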
as i have ideas on how to "improve" the concurrent test suite, it would
probably be better if i just take over the suite and implement them.
however, i'm open to other ideas about the future of this test suite :).
also i have a question unrelated to the test suite: i get an error saying
that thread-alive-p, used in reap-orphaned-connections in db-postmodern, is
not defined. where should this function come from, a newer version of
bordeaux-threads (perhaps i have an outdated package)?