Hi Hans,<br><br>I though this was the problem because of the error message I find in my *standard-error* log:<br>Error while creating worker thread for new incoming connection: Socket error in "getpeername": ENOTCONN (Transport endpoint is not connected)<br>
However, I noticed reading your patches, this is the error message introduced by the patch.<br>Since the error message is present many times, I think your code is working, otherwise the acceptor would crash instantly.<br>
<br>Currently, the symptoms I saw are these:<br>- Timeout when trying to reach port 80 (on which Hunchentoot is listening)<br>- No error message in either *standard-error* nor *standard-output* logs<br>- Only one thread (hunchentoot listener?) running in addition to the main process thread (as shown by htop)<br>
<br>This problem occurs in production environment after a few hours of operations. I restarted it once again with a swank server this time.<br>Next time the problem occurs, I will try to dig in it to gather more informations on the state of my software.<br>
<br>I still don't know what is causing the problem or how to reproduce it, but I will try to check it up and keep you posted.<br><br>Thanks for your time and help.<br>Mathieu.<br><br><div class="gmail_quote">2012/8/15 Hans Hübner <span dir="ltr"><<a href="mailto:hans.huebner@gmail.com" target="_blank">hans.huebner@gmail.com</a>></span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Mathieu,<br>
<br>
I have mined the Subversion repository for the commits that you<br>
referenced, and I found them to represent this patch:<br>
<br>
Index: taskmaster.lisp<br>
===================================================================<br>
--- taskmaster.lisp (revision 4427)<br>
+++ taskmaster.lisp (revision 4435)<br>
@@ -127,9 +127,21 @@<br>
<br>
#-:lispworks<br>
(defmethod handle-incoming-connection ((taskmaster<br>
one-thread-per-connection-taskmaster) socket)<br>
- (bt:make-thread (lambda ()<br>
- (process-connection (taskmaster-acceptor<br>
taskmaster) socket))<br>
- :name (format nil "Hunchentoot worker \(client:<br>
~A)" (client-as-string socket))))<br>
+ ;; We are handling all conditions here as we want to make sure that<br>
+ ;; the acceptor process never crashes while trying to create a<br>
+ ;; worker thread. One such problem exists in<br>
+ ;; GET-PEER-ADDRESS-AND-PORT which can signal socket conditions on<br>
+ ;; some platforms in certain situations.<br>
+ (handler-case<br>
+ (bt:make-thread (lambda ()<br>
+ (process-connection (taskmaster-acceptor<br>
taskmaster) socket))<br>
+ :name (format nil "Hunchentoot worker \(client:<br>
~A)" (client-as-string socket)))<br>
+<br>
+ (error (cond)<br>
+ ;; Need to bind *ACCEPTOR* so that LOG-MESSAGE can do its work.<br>
+ (let ((*acceptor* (taskmaster-acceptor taskmaster)))<br>
+ (log-message *lisp-errors-log-level*<br>
+ "Error while creating worker thread for new<br>
incoming connection: ~A" cond)))))<br>
<br>
;; LispWorks implementation<br>
<br>
Now, the patch actually does not apply, but it was on trunk and has<br>
been evolved into the current version:<br>
<br>
<a href="https://github.com/edicl/hunchentoot/blob/master/taskmaster.lisp#L355" target="_blank">https://github.com/edicl/hunchentoot/blob/master/taskmaster.lisp#L355</a><br>
<br>
If I understand you correctly, you say that the code does not work.<br>
Can you somehow provide a way to reproduce the problem?<br>
<br>
Thanks,<br>
Hans<br>
<div><div class="h5"><br>
On Wed, Aug 15, 2012 at 3:29 PM, Mathieu Lemoine <<a href="mailto:mlemoine@mentel.com">mlemoine@mentel.com</a>> wrote:<br>
> Hi,<br>
><br>
> I am currently hitting the problem described in the following thread:<br>
> <a href="http://lists.common-lisp.net/pipermail/tbnl-devel/2009-July/004768.html" target="_blank">http://lists.common-lisp.net/pipermail/tbnl-devel/2009-July/004768.html</a><br>
><br>
> I'm using SBCL 1.0.58 with hunchentoot 1.2.3 and usocket 0.5.5 (both from<br>
> quicklisp).<br>
><br>
> I understand the long-term fix will be some hard work. However, in this<br>
> thread, Hans is referencing a few diff hunks :<br>
><br>
> <a href="http://bknr.net/trac/changeset/4430?format=diff&new=4430" target="_blank">http://bknr.net/trac/changeset/4430?format=diff&new=4430</a><br>
> <a href="http://bknr.net/trac/changeset/4434?format=diff&new=4434" target="_blank">http://bknr.net/trac/changeset/4434?format=diff&new=4434</a><br>
> <a href="http://bknr.net/trac/changeset/4435?format=diff&new=4435" target="_blank">http://bknr.net/trac/changeset/4435?format=diff&new=4435</a><br>
><br>
> Since the old trac is not working anymore, could it be possible to either<br>
> point me toward the equivalent commits in the github repository or the<br>
> content of these patches.<br>
><br>
> By the way, Peter, if you're still around, did you need to apply only 4435<br>
> or both this and 4434 in order to solve the second issue (unbound variable<br>
> on *acceptor*)?<br>
><br>
> If anybody has information with respect to these issues, including the<br>
> corresponding bug report in usocket, so I can do some follow up with them,<br>
> it would be greatly appreciated.<br>
><br>
> Thanks in advance for your help and time.<br>
><br>
> Mathieu.<br>
><br>
</div></div>> _______________________________________________<br>
> tbnl-devel site list<br>
> <a href="mailto:tbnl-devel@common-lisp.net">tbnl-devel@common-lisp.net</a><br>
> <a href="http://common-lisp.net/mailman/listinfo/tbnl-devel" target="_blank">http://common-lisp.net/mailman/listinfo/tbnl-devel</a><br>
<br>
_______________________________________________<br>
tbnl-devel site list<br>
<a href="mailto:tbnl-devel@common-lisp.net">tbnl-devel@common-lisp.net</a><br>
<a href="http://common-lisp.net/mailman/listinfo/tbnl-devel" target="_blank">http://common-lisp.net/mailman/listinfo/tbnl-devel</a><br>
</blockquote></div><br>