[elephant-devel] run-elephant-thread
Ian Eslick
eslick at csail.mit.edu
Sat Jan 20 21:54:42 UTC 2007
There should be an e-mail about this in the archive somewhere, but to
summarize...
I think that requiring client code to worry about wrapping every
thread in run-elephant-thread is unrealistic, so that interface is
now deprecated. The reason is simple: when you are using a
multithreaded web server, the main thread launches client threads
inside the server code, and there is no way to wrap them in an
elephant thread without modifying the web server itself.
A Thread-Safe Serializer:
The answer to this problem is a little more important than the
others. The serializer is called all the time and is a performance
critical part of the system. The serializer is thread safe except
for its use of a hash table to detect circular data
structures. I've used elephant in a multi-threaded setting for quite
some time without worrying about the serializer because in practice I
would never hit the case where an object I was serializing would
accidentally look up a duplicate object/id in the circularity hash.
The way to avoid this case in general is to have a queue of hash
tables that each thread can grab when it wants to serialize an object
(you don't want to allocate a hash table in an inner loop - reuse by
clearing is also costly, but about 50% as much). So only this queue
needs to be protected.
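As a rough illustration of the pooling idea above (in Python rather than Lisp, and not Elephant's actual code), the sketch below keeps a free list of reusable tables. Only the pop and push on that free list are locked; each table is private to one serialize call while it is in use. The names TablePool, POOL, serialize and _walk are all made up for this example, and the "serializer" is a toy that only flattens nested lists.

```python
import threading


class TablePool:
    """Free list of reusable dicts standing in for circularity hash tables."""

    def __init__(self, size=4):
        self._lock = threading.Lock()            # guards only the free list
        self._free = [dict() for _ in range(size)]

    def acquire(self):
        with self._lock:                         # short critical section: one pop
            if self._free:
                return self._free.pop()
        return dict()                            # pool exhausted: allocate a fresh table

    def release(self, table):
        table.clear()                            # reset outside the lock
        with self._lock:
            self._free.append(table)


POOL = TablePool()


def serialize(obj):
    """Toy serializer: flattens nested lists and marks repeated references."""
    seen = POOL.acquire()                        # private to this call, so no races on it
    try:
        out = []
        _walk(obj, seen, out)
        return out
    finally:
        POOL.release(seen)


def _walk(obj, seen, out):
    if isinstance(obj, list):
        if id(obj) in seen:                      # already visited: emit a back-reference
            out.append(("ref", seen[id(obj)]))
            return
        seen[id(obj)] = len(seen)                # assign the next sequential id
        out.append(("list", seen[id(obj)], len(obj)))
        for item in obj:
            _walk(item, seen, out)
    else:
        out.append(("atom", obj))
```

Because a table is cleared before it goes back on the free list, a later call can never see ids left over from an earlier object.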
I tried using standard locks in the various EXCL-like extended
packages but the performance is atrocious for frequently called
routines like serialize. Instead, I use without-interrupts (a
primitive that most Lisp implementations provide) to block interrupts
for the duration of a vector-pop command that grabs a new hash. This
gives much better performance.
The only other variables that need to be so protected are:
Thread Safety in Backends:
I'm pretty sure that the current behavior of BDB is thread-safe. I
researched this earlier, but only remember that I concluded it was
safe, so if anyone remembers the details feel free to contradict me.
A quick Google investigation says that CL-SQL requires, at a minimum,
that each thread have its own connection object to be thread safe,
so each thread needs to reconnect to the cl-sql database and have a
thread-local clsql:*default-database* binding. (I don't think this
works for SQLite, though.)
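The one-connection-per-thread requirement can be sketched as follows. This is Python used only for illustration and is not CL-SQL: FakeConnection and get_connection are hypothetical names, and thread-local storage plays the role that a thread-local binding of clsql:*default-database* would play in Lisp.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a database connection object."""

    def __init__(self, spec):
        self.spec = spec
        self.owner = threading.get_ident()   # thread that opened this connection


_local = threading.local()                   # each thread sees its own attributes


def get_connection(spec):
    """Return the calling thread's connection, opening one on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection(spec)          # the per-thread "reconnect"
        _local.conn = conn
    return conn
```

A thread that calls get_connection repeatedly keeps getting its own object back, and no connection is ever shared between two threads.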
These are pretty easy and will be handled in my next checkin:
-------------------------------------------------------------
Global variables (infrequently written):
*elephant-controller-init*
*dbconnection-spec*
Store-controller slots that need infrequent write-protection:
- instance-cache
- symbol-cache (0.6.1+)
The following elephant variables are a little tricky:
-----------------------------------------------------
Thread-local global vars (frequently accessed):
*store-controller* (if different threads use different controllers)
*transaction-stack*
*current-transaction*
*resourced-byte-spec*
(errno handling in uffi?)
Deprecated thread-local vars:
*auto-commit* (BDB 4.4 no longer pays attention to auto-commit
arguments so we can remove this from elephant)
1) You can use the macro with-elephant-variables in 0.6.1 to create
new, thread-local dynamic bindings of the above variables, but that
is still a manual solution for when you have access to the thread
creation code and can create thread-local specials.
2) A more consequential option is to excise the required dependency on
these variables entirely. The implication is a potentially
significant API change in which an application can always provide the
store controller on calls to collection accessors, cursor operations,
etc., with the argument defaulting to *store-controller* in
environments where there is only one store or where the user is
managing the binding of *store-controller* in each thread. I think
this is already accommodated in much of the API, but I haven't
investigated how much work it is.
We can further require that all transactions be wrapped in
'with-elephant-transaction' so that the appropriate specials are
dynamically bound within the stack. I think this actually would be
pretty easy. We could document the internals of
with-elephant-transaction for anyone who wants to do something
sophisticated and is willing to manage the thread issues themselves.
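To make option 2 concrete, here is a minimal sketch of both halves: accessors that take an explicit store and fall back to a default, and a transaction wrapper that binds per-thread state for the extent of its body and restores the outer binding on exit. It is written in Python purely as an illustration. DEFAULT_STORE, transaction, current_transaction and get_value are invented names, stores are plain dicts, and none of this is Elephant's API.

```python
import threading
from contextlib import contextmanager

_state = threading.local()                 # per-thread stand-in for the specials
DEFAULT_STORE = {"name": "default"}        # hypothetical process-wide default store


def current_transaction():
    """Return the innermost transaction bound in this thread, or None."""
    return getattr(_state, "txn", None)


@contextmanager
def transaction(store=None):
    """Bind a transaction for the dynamic extent of the with-block."""
    store = store if store is not None else DEFAULT_STORE
    previous = current_transaction()
    _state.txn = {"store": store, "parent": previous}
    try:
        yield _state.txn
    finally:
        _state.txn = previous              # restore the outer binding on exit


def get_value(key, store=None):
    """Use the explicit store, else the current transaction's, else the default."""
    if store is None:
        txn = current_transaction()
        store = txn["store"] if txn else DEFAULT_STORE
    return store.get(key)
```

Code that always passes store explicitly never touches the per-thread state, which is the point of excising the dependency. The wrapper covers callers that would rather rely on the default.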
Does anyone have a better suggestion here? For example, is there a
portability layer that can detect the current thread ID and use that
to index the default global values?
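For what it's worth, the thread-ID idea would look roughly like the sketch below. Again this is Python for illustration only. set_var, get_var and the two tables are hypothetical, and the lock around the table is an assumption about what a portable version would need.

```python
import threading

_defaults = {"store": None}      # process-wide fallback values
_per_thread = {}                 # thread id -> {name: value}
_lock = threading.Lock()         # guards _per_thread


def set_var(name, value):
    """Set a value visible only to the calling thread."""
    with _lock:
        _per_thread.setdefault(threading.get_ident(), {})[name] = value


def get_var(name):
    """Return the calling thread's value, else the global default."""
    with _lock:
        overrides = _per_thread.get(threading.get_ident(), {})
        return overrides.get(name, _defaults[name])
```

One caveat with this approach: thread IDs can be recycled after a thread exits, so entries would have to be removed when a thread finishes, or a new thread could inherit stale values.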
Regards,
Ian
On Jan 20, 2007, at 2:42 PM, Gábor Melis wrote:
> The fine manual at
> http://common-lisp.net/project/elephant/doc/Threading.html says that
> run-elephant-thread is broken but leaves the consequences to my
> imagination.
>
> Considering the comment about specials for buffers and such as well,
> my reading is that there is no official way of using elephant from
> multithreaded code.
>
> Is that right?
> What needs to be done to make run-elephant-thread safe again?
>
> Gabor Melis