[Ecls-list] Latest changes

Matthew Mondor mm_lists at pulsar-zone.net
Mon Mar 19 04:51:13 UTC 2012


On Mon, 19 Mar 2012 01:20:10 +0100
Juan Jose Garcia-Ripoll <juanjose.garciaripoll at googlemail.com> wrote:

> I have changed the implementation using now a FIFO queue. The times seem to
> improve a lot for short tests, going up to 731 connections / s or 1.4 ms /
> connection on average. For longer tests it degrades a bit and goes up to
> 1.9 ms, which I attribute to consing.
> 
> The queue is based on a spinlock and I believe the FIFO character plus the
> fact that the waiters spinlock with waiting times that are at most 0.1s
> should provide enough of a balance not to make it too unfair. But to be
> honest, I have not done any research on how to make this theoretically
> sound.
> 
> As potential improvements I see:
> * Changing the queue so that it does not cons (perhaps with a "next" field
> in the process object itself)
> * The queue has the format multiple producers - one consumer, meaning that
> it can be implemented without a lock (just CAS).
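
To make the last two suggestions concrete, here is a minimal C11 sketch
of a multiple producers - one consumer queue that uses only CAS and does
not allocate.  This is not ECL's code: the names waiter, mpsc_queue,
mpsc_push and mpsc_pop are made up for illustration, and the intrusive
"next" field only stands in for the idea of a "next" field in the
process object.  Producers push onto a shared list with a
compare-and-swap loop; the single consumer detaches the whole list with
one atomic exchange and reverses it to recover FIFO order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical waiter record; the intrusive `next` link avoids consing. */
struct waiter {
    struct waiter *next;
    int id;
};

struct mpsc_queue {
    _Atomic(struct waiter *) incoming; /* shared: producers push, newest first */
    struct waiter *ready;              /* consumer-private: oldest first */
};

static void mpsc_init(struct mpsc_queue *q)
{
    atomic_init(&q->incoming, NULL);
    q->ready = NULL;
}

/* Callable from any thread: lock-free push using a CAS loop. */
static void mpsc_push(struct mpsc_queue *q, struct waiter *w)
{
    assert(w != NULL);
    struct waiter *head =
        atomic_load_explicit(&q->incoming, memory_order_relaxed);
    do {
        w->next = head; /* on CAS failure `head` is refreshed, so relink */
    } while (!atomic_compare_exchange_weak_explicit(
                 &q->incoming, &head, w,
                 memory_order_release, memory_order_relaxed));
}

/* Callable from the single consumer only: oldest waiter, or NULL. */
static struct waiter *mpsc_pop(struct mpsc_queue *q)
{
    if (q->ready == NULL) {
        /* Detach everything pushed so far in one atomic step. */
        struct waiter *batch = atomic_exchange_explicit(
            &q->incoming, NULL, memory_order_acquire);
        /* The batch is newest-first; reverse it to get arrival order. */
        while (batch != NULL) {
            struct waiter *next = batch->next;
            batch->next = q->ready;
            q->ready = batch;
            batch = next;
        }
    }
    struct waiter *w = q->ready;
    if (w != NULL) {
        q->ready = w->next;
        w->next = NULL;
    }
    return w;
}
```

The sketch only covers the queue itself; how a waiter sleeps and gets
woken up, and the fairness question raised above, are left out.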

I gave the new implementation a first try.  The first thing I noticed
was the performance hit compared to the last stable release using
pthreads, although I couldn't say that it's crawling.  Testing
stability was more important to me, so I did a number of ab(8) runs.

At first all was fine for a little while, although I noticed that the
speed varied a lot between runs.  Sometimes a 500-connection run
finished immediately; at other times I had to wait a few seconds.
This also occurred with the stable release if I used runs of 5000, but
generally not with 500.

Then I noticed that several threads started taking a lot of CPU time,
and that the REPL became noticeably less responsive (with delays).  I
used gdb to find out why and noticed that some threads were on an error
path, caught by my custom debugger hook.  And indeed I could see that
the in-memory FIFO log was filling with EBADF errors from accept(2).
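
As an aside, EBADF from accept(2) means the descriptor passed in is not
a valid open file descriptor, so retrying the same call cannot succeed
and a worker that loops on it will just burn CPU.  Below is a minimal C
sketch of how an accept loop could tell such errors apart from
transient ones.  It is not the httpd's code: accept_verdict,
classify_accept_errno and accept_once are hypothetical names, and
treating everything outside the short retry list as fatal is a
simplifying assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

enum accept_verdict { ACCEPT_OK, ACCEPT_RETRY, ACCEPT_FATAL };

/* Classify the errno left behind by a failed accept(2). */
static enum accept_verdict classify_accept_errno(int err)
{
    switch (err) {
    case EINTR:        /* interrupted by a signal */
    case ECONNABORTED: /* peer went away before we accepted */
    case EAGAIN:       /* non-blocking socket, nothing pending */
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
        return ACCEPT_RETRY;
    default:
        /* Assumption of this sketch: everything else, including EBADF,
         * is treated as fatal so the caller stops instead of spinning. */
        return ACCEPT_FATAL;
    }
}

/* One accept attempt; on success the new descriptor is stored in *out. */
static enum accept_verdict accept_once(int listen_fd, int *out)
{
    int fd = accept(listen_fd, NULL, NULL);
    if (fd >= 0) {
        *out = fd;
        return ACCEPT_OK;
    }
    return classify_accept_errno(errno);
}
```

A worker would call accept_once() in its loop, continue on ACCEPT_RETRY,
and log once and exit (or reopen the listening socket) on ACCEPT_FATAL.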

I suspect that boehm-gc could be at fault.  I only did the tests on
NetBSD so far, but also intend to try on Linux.  These tests were also
using the new NetBSD-6 TLS support, I'll also try with
--with-__threads=no.

I'll leave the httpd running for a while without much stress-testing.
Even with the release, some threads sometimes get stuck in endless
loops, so I want to see if that happens here too and, if so, use gdb
to see where they get stuck.
-- 
Matt



