[iolib-devel] c10k HTTP server with iolib

Attila Lendvai attila.lendvai at gmail.com
Fri Nov 6 20:30:06 UTC 2009

> Here is a possible architecture for a server that can handle tons of
> connections at once:

this is all fine, but the interesting question comes when you consider
how to switch between the threads.

what you need is essentially green-threads (much cheaper compared to
OS threads): http://en.wikipedia.org/wiki/Green_threads

the naive solution is to implement the inversion of control by hand,
with hand-written state machines etc. all around. this makes the
code uglier, which in turn makes it easier for bugs to creep in.
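
to illustrate what "by hand" means, here is a minimal sketch in python
(not CL, and not taken from any real server). the class name
RequestLineParser and its feed method are hypothetical. the point is
that the parser cannot simply block on a read: the caller pushes
chunks in, and the parser has to remember where it was in explicit
state variables.

```python
class RequestLineParser:
    """Hand-written state machine: the caller feeds bytes as they arrive."""

    def __init__(self):
        self.state = "reading-line"   # explicit state instead of a call stack
        self.buffer = b""
        self.request_line = None

    def feed(self, data):
        """Consume one chunk; return True once the request line is complete."""
        if self.state == "done":
            return True
        self.buffer += data
        if b"\r\n" in self.buffer:
            line, _, rest = self.buffer.partition(b"\r\n")
            self.request_line = line.decode("ascii")
            self.buffer = rest        # keep whatever follows for the next stage
            self.state = "done"
            return True
        return False                  # need more data, caller must call again


parser = RequestLineParser()
parser.feed(b"GET / HT")              # incomplete, returns False
parser.feed(b"TP/1.1\r\nHost")        # completes the line, returns True
```

every further stage (headers, body, response writing) needs its own
states and transitions, which is where the ugliness comes from.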

another solution is to implement green threads.
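
as a rough sketch of the idea, python generators can play the role of
green threads: each task runs until it voluntarily yields, and a tiny
scheduler switches between them without any OS threads involved. the
names worker and run are made up for this example, and a real
implementation would switch on i/o readiness rather than round-robin.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield                     # hand control back to the scheduler


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)            # run the task up to its next yield
        except StopIteration:
            continue              # task finished, drop it
        queue.append(task)        # still alive, put it back in line


log = []
run([worker("a", 2, log), worker("b", 3, log)])
# log is now [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)]
```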

and yet another solution is to use a currently available call/cc
implementation (cl-cont or hu.dwim.delico). delimited continuations
provide more than green threads, and therefore mean more overhead, but
the difference shouldn't be big. unfortunately for now delico only
provides interpreted continuations (slow), and cl-cont had quite a few
issues when i tried it.

but using cl-cont i've made a proof of concept implementation. the
idea in short:

keep a connection data structure for each connection

set sockets to non-blocking

provide call/cc versions of the socket (gray-stream) reading primitives
which, when they detect that the socket is dry or flooded, store the
current continuation in the connection structure and mark which event
we are waiting for

and then resume (using the call/cc primitives) connections based on what
the event handler layer tells us. please note that call/cc is only
needed in the code that reads the request. once it's parsed, the rest
of the handler code can be plain CL up to the point where we start
writing the response. if you need some stream processing that does not
buffer the response before sending, then you need to keep all the
handler code inside the delimited continuation borders.
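
the steps above can be sketched roughly as follows. this is python,
not the cl-cont proof of concept: a generator stands in for the stored
continuation, the selectors module stands in for the event handler
layer, and all the names (Connection, read_line, handler, resume,
serve) are hypothetical. only the read side ("socket is dry") is
shown, and a socketpair replaces a real listening socket.

```python
import selectors
import socket


class Connection:
    """Per-connection state: the socket plus its suspended reader."""

    def __init__(self, sock, handler):
        self.sock = sock
        self.result = None
        self.task = handler(self)    # generator standing in for a continuation


def read_line(conn):
    """Suspending read: yields the wanted event whenever the socket is dry."""
    buf = b""
    while b"\n" not in buf:
        try:
            chunk = conn.sock.recv(4096)
        except BlockingIOError:
            yield selectors.EVENT_READ   # park until the socket is readable
            continue
        if not chunk:
            break                        # peer closed the connection
        buf += chunk
    return buf.split(b"\n", 1)[0]


def handler(conn):
    line = yield from read_line(conn)    # only the reading part can suspend
    conn.result = line.decode("ascii")   # from here on it is plain code


def resume(sel, conn):
    """Run the connection until it parks again or finishes."""
    try:
        event = next(conn.task)
    except StopIteration:
        conn.sock.close()
        return
    sel.register(conn.sock, event, conn)  # mark the event we are waiting for


def serve(sel):
    """Event loop: resume whichever connections the selector reports ready."""
    while sel.get_map():
        for key, _ in sel.select(timeout=1):
            sel.unregister(key.fileobj)
            resume(sel, key.data)


sel = selectors.DefaultSelector()
client, server_side = socket.socketpair()
server_side.setblocking(False)           # set sockets to non-blocking
conn = Connection(server_side, handler)
resume(sel, conn)                        # socket is dry, so the reader parks
client.sendall(b"GET / HTTP/1.0\n")
serve(sel)                               # data arrived, the reader is resumed
client.close()
```

a write-side primitive would look the same, yielding
selectors.EVENT_WRITE when the socket is flooded.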

it's somewhere on my TODO to make hu.dwim.wui work based on that experiment.

currently the wui codebase can process some 3-4k requests per sec on
my 2.4 GHz core2 laptop, which is good enough for me, especially since
i didn't pay too much attention to performance. but for now it can
only cope with about 50 parallel requests a second, because workers
are not multiplexed. if there are many users with slow network
connections then it can be an issue...

the proof of concept code is lying somewhere in an iolib branch here,
but it's most probably badly bitrotten. it's about a year old now and
iolib has moved ahead a lot since.

