[elephant-devel] run-elephant-thread

Ian Eslick eslick at csail.mit.edu
Sat Jan 20 21:54:42 UTC 2007


There should be an e-mail about this in the archive somewhere, but to  
summarize...

I think that requiring client code to wrap every thread in
run-elephant-thread is unrealistic, so that interface is now
deprecated.  The reason is simple: in a multithreaded web server, the
main thread launches client threads inside the server code, and there
is no way to wrap them in an elephant thread without modifying the web
server itself.

A Thread-Safe Serializer:

The answer to this problem is a little more important than the  
others.  The serializer is called all the time and is a performance  
critical part of the system.  The serializer is thread safe except
for its use of a hash table to detect circular data structures.  I've
used elephant in a multi-threaded setting for quite some time without
worrying about the serializer because in practice I would never hit
the case where an object I was serializing would accidentally look up
a duplicate object/id in the circularity hash.
The way to avoid this case in general is to have a queue of hash  
tables that each thread can grab when it wants to serialize an object  
(you don't want to allocate a hash table in an inner loop - reuse by  
clearing is also costly, but about 50% as much).  So only this queue  
needs to be protected.
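
A minimal sketch of such a pool, to make the idea concrete.  All names
here are hypothetical and are not Elephant's actual internals.  The
pool is a vector with a fill pointer, so it behaves as a stack rather
than a strict queue.  WITH-POOL-PROTECTED is a placeholder that expands
to a plain PROGN; real code would substitute whatever mutual-exclusion
primitive the Lisp implementation offers, since none is part of ANSI
Common Lisp.

```lisp
;; Placeholder for an implementation-specific protection primitive.
;; Assumption: in real code this would expand to the local
;; without-interrupts form or a lock; here it only groups the body.
(defmacro with-pool-protected (&body body)
  `(progn ,@body))

;; Pool of reusable hash tables (hypothetical name).
(defvar *circularity-hash-pool*
  (make-array 4 :adjustable t :fill-pointer 0))

(defun get-circularity-hash ()
  "Pop a table from the pool, or allocate one if the pool is empty."
  (or (with-pool-protected
        (when (plusp (fill-pointer *circularity-hash-pool*))
          (vector-pop *circularity-hash-pool*)))
      (make-hash-table :test 'eq)))

(defun release-circularity-hash (table)
  "Clear TABLE outside the protected region, then return it to the pool."
  (clrhash table)
  (with-pool-protected
    (vector-push-extend table *circularity-hash-pool*))
  nil)

(defmacro with-circularity-hash ((var) &body body)
  "Bind VAR to a pooled table for BODY and always give it back."
  `(let ((,var (get-circularity-hash)))
     (unwind-protect (progn ,@body)
       (release-circularity-hash ,var))))
```

Only the pop and the push sit inside the protected region, so each
thread works on a table that no other thread can see while it
serializes.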

I tried using standard locks in the various EXCL-like extended
packages, but the performance is atrocious for frequently called
routines like serialize.  Instead, I use without-interrupts (a common
Lisp primitive) to block interrupts for the duration of the vector-pop
call that grabs a hash table.  This gives much better performance.
The only other variables that need to be so protected are:

Thread Safety in Backends:

I'm pretty sure that the current behavior of BDB is thread-safe.  I
researched this earlier but only remember that I concluded it was
safe, so if anyone remembers the details, feel free to contradict me.

A quick Google investigation says that CL-SQL requires, at a minimum,
that each thread have its own connection object to be thread safe,
so each thread needs to reconnect to the CL-SQL database and have a
thread-local clsql:*default-database* binding.  (I don't think this
works for SQLite, though.)
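
The per-thread pattern might look roughly like the sketch below.  It
uses stand-ins so that it runs without the library: DEMO-CONNECT,
DEMO-DISCONNECT and *DEMO-DEFAULT-DATABASE* are hypothetical
placeholders for the library's connect and disconnect calls and for
clsql:*default-database*, whose exact arguments are not shown here.

```lisp
;; Stand-in for clsql:*default-database* (hypothetical).
(defvar *demo-default-database* nil)

;; Counts open stand-in connections so the cleanup can be observed.
(defvar *demo-open-connections* 0)

(defun demo-connect (spec)
  "Stand-in for opening a new connection from SPEC."
  (incf *demo-open-connections*)
  (list :connection spec))

(defun demo-disconnect (connection)
  "Stand-in for closing CONNECTION."
  (declare (ignore connection))
  (decf *demo-open-connections*))

(defun call-with-thread-connection (spec thunk)
  "Run THUNK with its own connection bound as the default database.
Meant to be called as the first thing a new thread does."
  (let ((*demo-default-database* (demo-connect spec)))
    (unwind-protect (funcall thunk)
      (demo-disconnect *demo-default-database*))))
```

Because the connection is bound with LET rather than assigned with
SETF, code in other threads keeps seeing its own value of the
variable, and the connection is closed even if the thread's body
signals an error.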

These are pretty easy and will be handled in my next checkin:
-------------------------------------------------------------

Global variables (infrequently written):
*elephant-controller-init*
*dbconnection-spec*

Store-controller slots that need infrequent write-protection:
- instance-cache
- symbol-cache (0.6.1+)


The following elephant variables are a little tricky:
-----------------------------------------------------

Thread-local global vars (frequently accessed):
*store-controller* (if different threads use different controllers)
*transaction-stack*
*current-transaction*
*resourced-byte-spec*
(errno handling in uffi?)

Deprecated thread-local vars:
*auto-commit* (BDB 4.4 no longer pays attention to auto-commit  
arguments so we can remove this from elephant)

1) You can use the macro with-elephant-variables in 0.6.1 to create  
new, thread-local dynamic bindings of the above variables, but that  
is still a manual solution for when you have access to the thread  
creation code and can create thread-local specials.

2) A more consequential change is to excise the required dependency on
these variables entirely.  The implication is a potentially
significant API change: an application can always pass the store
controller explicitly on calls to collection accessors, cursor
operations, etc., and the argument defaults to *store-controller* for
environments where there is only one store or where the user is
managing the binding of *store-controller* in each thread.  I think
this is already accommodated in much of the API, but I haven't
investigated how much work it is.
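
As an illustration of option 2, here is a toy accessor pair whose
store argument defaults to a special variable.  DEMO-STORE, DEMO-PUT,
DEMO-GET and *DEMO-STORE* are made-up names for this sketch and do not
reflect Elephant's real signatures.

```lisp
;; Stand-in for *store-controller* (hypothetical).
(defvar *demo-store* nil)

;; Toy store: just a wrapper around an EQUAL hash table.
(defstruct demo-store
  (table (make-hash-table :test 'equal)))

(defun demo-put (key value &key (store *demo-store*))
  "Write VALUE under KEY in STORE, defaulting to the current binding."
  (setf (gethash key (demo-store-table store)) value))

(defun demo-get (key &key (store *demo-store*))
  "Read KEY from STORE, defaulting to the current binding."
  (values (gethash key (demo-store-table store))))
```

A thread that cannot rely on the special variable passes :store on
every call; single-store applications keep calling the functions with
no extra argument.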

We can further require that all transactions be wrapped in
'with-elephant-transaction' so that the appropriate specials are
dynamically bound within the stack.  I think this actually would be
pretty easy.  We could document the internals of
with-elephant-transaction for anyone who wants to do something
sophisticated and is willing to manage the thread issues themselves.
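
A rough sketch of what binding the specials inside a transaction
macro could look like.  The names are hypothetical, and the sketch
only shows the binding discipline: it does not begin, commit or abort
anything.

```lisp
;; Stand-ins for the transaction specials listed above (hypothetical).
(defvar *demo-current-transaction* nil)
(defvar *demo-transaction-stack* '())

(defmacro with-demo-transaction ((txn) &body body)
  "Run BODY with TXN as the current transaction.
Both specials get fresh dynamic bindings for the extent of BODY, so
the outer values come back automatically on any exit."
  `(let* ((*demo-current-transaction* ,txn)
          (*demo-transaction-stack*
            (cons *demo-current-transaction* *demo-transaction-stack*)))
     ,@body))
```

Since the bindings are made with LET* inside whichever thread runs the
body, nothing needs to be set up at thread creation time, which is the
point of requiring the wrapper.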

Does anyone have a better suggestion here?  For example, is there a
portability layer that can detect the current thread ID and use that
to index the default global values?

Regards,
Ian

On Jan 20, 2007, at 2:42 PM, Gábor Melis wrote:

> The fine manual at
> http://common-lisp.net/project/elephant/doc/Threading.html says that
> run-elephant-thread is broken but leaves the consequences to my
> imagination.
>
> Considering the comment about specials for buffers and such as well,
> my reading is that there is no official way of using elephant from
> multithreaded code.
>
> Is that right?
> What needs to be done to make run-elephant-thread safe again?
>
> Gabor Melis
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel



