From ctdean at sokitomi.com Sat Feb 3 02:27:34 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Fri, 02 Feb 2007 18:27:34 -0800
Subject: [drakma-devel] Closing streams with :want-stream t
Message-ID:
I have a file handle leak in my code (that is, I run out of file
handles after running for a while) and I suspect that I'm not using
drakma correctly.
I wish to use the :want-stream t parameter and read the resulting
stream directly, so I have this code:
(defun simple-get (url)
  "Download the url using GET and return the body as a string."
  (handler-case
      (multiple-value-bind (stream code headers dummy-uri dummy-stream
                            must-close?)
          (drakma:http-request url :want-stream t :keep-alive nil
                                   :method :get)
        (declare (ignore headers dummy-stream dummy-uri))
        (unwind-protect
             (and stream
                  code
                  (= code 200)
                  (with-output-to-string (out)
                    (do ((ch (read-char stream nil :eof)
                             (read-char stream nil :eof)))
                        ((not (characterp ch)))
                      (princ ch out))))
          (when (and stream must-close?)
            (ignore-errors (close stream)))))
    (error (condition)
      (format t "Error ~s: ~a~%" url condition)
      nil)))
Is there anything extra I need to do to make sure that all the streams
opened by drakma are closed?
My production code is much more complex, but the simple stub above
will generate the out of file handles problem. Besides the actual
error I can use lsof on Linux and Mac OS and see many sockets stuck in
CLOSED or CLOSE_WAIT states. This is all under LispWorks 5.0.1
Also, when using :want-stream nil I never encounter the problem.
Cheers,
Chris Dean
From vodonosov at mail.ru Sat Feb 3 13:33:28 2007
From: vodonosov at mail.ru (Anton Vodonosov)
Date: Sat, 03 Feb 2007 15:33:28 +0200
Subject: [drakma-devel] Closing streams with :want-stream t
In-Reply-To:
References:
Message-ID: <45C48F28.1020302@mail.ru>
Hi, Dean.
I've tried your code, but I can't reproduce
socket handle leak. I'm on Windows + Clisp.
I've made several calls to
(SIMPLE-GET "http://microsoft.com")
and (SIMPLE-GET "http://google.com"); and
watching sockets using netstat. All sockets
are closed properly.
What URLs lead to socket handle leak?
May it be that URLs you use point to servers
that use http 1.1 but don't return close http
header properly?
As far as I understand, MUST-CLOSE? = NIL
means that stream may be reused in further
calls of HTTP-REQUEST for the same server.
If so, and you do not intend to reuse the
stream in further calls, you can always CLOSE
it.
Try to always CLOSE returned streams, without
regard to MUST-CLOSE?.
Regards,
-Anton
From edi at agharta.de Sat Feb 3 15:09:48 2007
From: edi at agharta.de (Edi Weitz)
Date: Sat, 03 Feb 2007 16:09:48 +0100
Subject: [drakma-devel] Closing streams with :want-stream t
In-Reply-To: (Chris Dean's message of
"Fri, 02 Feb 2007 18:27:34 -0800")
References:
Message-ID:
On Fri, 02 Feb 2007 18:27:34 -0800, Chris Dean wrote:
> I have a file handle leak in my code (that is, I run out of file
> handles after running for a while) and I suspect that I'm not using
> drakma correctly.
>
> I wish to use the :want-stream t parameter and read the resulting
> stream directly, so I have this code:
>
> (defun simple-get (url)
> "Download the url using GET and return the body as a string."
> (handler-case
> (multiple-value-bind (stream code headers dummy-uri dummy-stream
> must-close?)
> (drakma:http-request url :want-stream t :keep-alive nil
> :method :get)
> (declare (ignore headers dummy-stream dummy-uri))
> (unwind-protect
> (and stream
> code
> (= code 200)
> (with-output-to-string (out)
> (do ((ch (read-char stream nil :eof)
> (read-char stream nil :eof)))
> ((not (characterp ch)))
> (princ ch out))))
> (when (and stream must-close?)
> (ignore-errors (close stream)))))
> (error (condition)
> (format t "Error ~s: ~a~%" url condition)
> nil)))
>
> Is there anything extra I need to do to make sure that all the
> streams opened by drakma are closed?
>
> My production code is much more complex, but the simple stub above
> will generate the out of file handles problem. Besides the actual
> error I can use lsof on Linux and Mac OS and see many sockets stuck
> in CLOSED or CLOSE_WAIT states. This is all under LispWorks 5.0.1
>
> Also, when using :want-stream nil I never encounter the problem.
The meaning of the sixth return value (MUST-CLOSE) is that you're not
allowed to re-use the stream, because according to the reply headers
the server will close the stream on its side.
However, if you do /not/ want to re-use the stream (which is obviously
the case in your example as your function doesn't return the stream),
you must of course always close it. Drakma can't close it for you as
it doesn't know when you're done with it, and why would you want to
keep an open stream hanging around in your image that can't be
accessed by your code anyway?
In other words: It should be
(when stream
(ignore-errors (close stream)))))
above.
I'll re-word this in the documentation to make it more clear
(hopefully).
HTH,
Edi.
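
The advice above (close the stream whenever you are not going to re-use it, regardless of MUST-CLOSE) can be sketched as a small helper. This is only an illustration: CALL-WITH-STREAM and SLURP are hypothetical names, not part of Drakma, and a string stream stands in for the socket so that the sketch runs without a network connection.

```lisp
;; Hypothetical helper, not part of Drakma: run BODY-FN on the stream that
;; OPENER returns and close the stream no matter how BODY-FN exits.
(defun call-with-stream (opener body-fn)
  "Call BODY-FN on the stream returned by OPENER; always close the stream."
  (let ((stream nil))
    (unwind-protect
         (progn
           (setf stream (funcall opener))
           (when stream
             (funcall body-fn stream)))
      ;; Runs on normal return and on non-local exit alike.
      (when stream
        (ignore-errors (close stream))))))

(defun slurp (stream)
  "Read all remaining characters from STREAM into a string."
  (with-output-to-string (out)
    (loop for ch = (read-char stream nil nil)
          while ch
          do (write-char ch out))))

;; With Drakma the opener would be something like
;;   (lambda () (drakma:http-request url :want-stream t))
;; whose first return value is the open stream.
(call-with-stream (lambda () (make-string-input-stream "hello")) #'slurp)
```

Keeping the close in the cleanup form of UNWIND-PROTECT means the caller cannot forget it, even when reading the body signals an error.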
From edi at agharta.de Sat Feb 3 15:37:48 2007
From: edi at agharta.de (Edi Weitz)
Date: Sat, 03 Feb 2007 16:37:48 +0100
Subject: [drakma-devel] Closing streams with :want-stream t
In-Reply-To: (Edi Weitz's message of "Sat, 03 Feb
2007 16:09:48 +0100")
References:
Message-ID:
On Sat, 03 Feb 2007 16:09:48 +0100, Edi Weitz wrote:
> I'll re-word this in the documentation to make it more clear
> (hopefully).
Although, after reading it once more, I think the documentation was
already pretty clear:
HTTP-REQUEST will always close the stream to the server before it
returns unless WANT-STREAM is true or if the headers exchanged
between Drakma and the server determine that the connection will be
kept alive - for example if both client and server used the HTTP 1.1
protocol and no explicit "Connection: close" header was sent. In
these cases /you/ will have to close the stream manually.
[...]
If WANT-STREAM is true, the message body is not read and instead the
(open) socket stream is returned as the first return value. If the
sixth value of HTTP-REQUEST is true, the stream should be closed
(and not be re-used) after the body has been read.
Anyway, I'll try to be even more precise... :)
From ctdean at sokitomi.com Sat Feb 3 20:42:51 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Sat, 03 Feb 2007 12:42:51 -0800
Subject: [drakma-devel] Closing streams with :want-stream t
In-Reply-To: <45C48F28.1020302@mail.ru> (Anton Vodonosov's message of "Sat,
03 Feb 2007 15:33:28 +0200")
References: <45C48F28.1020302@mail.ru>
Message-ID:
Anton Vodonosov writes:
> May it be that URLs you use point to servers that use http 1.1 but
> don't return close http header properly?
The problem is very data dependent and I have a test set of 1647 urls
that exercises the leak. I can send the data in a private email if
you wish.
Thanks for running a test on Windows.
Cheers,
Chris Dean
From ctdean at sokitomi.com Sat Feb 3 21:10:31 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Sat, 03 Feb 2007 13:10:31 -0800
Subject: [drakma-devel] Closing streams with :want-stream t
In-Reply-To: (Edi Weitz's message of "Sat,
03 Feb 2007 16:09:48 +0100")
References:
Message-ID:
Edi Weitz writes:
> However, if you do /not/ want to re-use the stream (which is obviously
> the case in your example as your function doesn't return the stream),
> you must of course always close it.
Sure, of course.
> (when stream
> (ignore-errors (close stream)))))
Fair enough. FWIW, my production code looks exactly like this.
(During my debugging I noticed that must-close was always t in my case.)
Regardless, if I make that change I still see the leak.
I have a data set I can send off-list if anyone is interested.
Cheers,
Chris Dean
(defun simple-get (url)
  "Download the url using GET and return the body as a string."
  (handler-case
      (multiple-value-bind (stream code)
          (drakma:http-request url :want-stream t :keep-alive nil :method :get)
        (unwind-protect
             (and stream
                  code
                  (= code 200)
                  (with-output-to-string (out)
                    (do ((ch (read-char stream nil :eof)
                             (read-char stream nil :eof)))
                        ((not (characterp ch)))
                      (princ ch out))))
          (when stream
            (ignore-errors (close stream)))))
    (error (condition)
      (format t "Error ~s: ~a~%" url condition)
      nil)))
From edi at agharta.de Sun Feb 4 23:25:08 2007
From: edi at agharta.de (Edi Weitz)
Date: Mon, 05 Feb 2007 00:25:08 +0100
Subject: [drakma-devel] Re: drakma/chunga problem.
In-Reply-To:
(Asbjørn Bjørnstad's message of "Sun,
4 Feb 2007 13:45:25 +0800")
References:
Message-ID:
Hi!
On Sun, 4 Feb 2007 13:45:25 +0800, "Asbjørn Bjørnstad" wrote:
> I'm not sure whether this is a bug or not.
[Please use the mailing list to report bugs. See Cc.]
> I'm planning to set up automatic bug reporting into trac.
> (http://trac.edgewall.com)
>
> Posting the message works, but if I try to add a cookie jar, I get
> an error. (The cookie jar is empty before the call, if that is not
> how it's supposed to be used, you can safely ignore this.) As I said
> it's working without cookies, and I don't plan to use cookies, just
> thought you might want to know about a possible bug. This is with
> the latest version of chunga and drakma.
>
> Backtrace attached as I don't know if gmail might break lines and
> make it unreadable (Password changed, if you want a trac account to
> test yourself, it can be arranged.)
I think this is the relevant part of the backtrace:
Call to CHUNGA:READ-NAME-VALUE-PAIR (offset 78)
STREAM : #
CHUNGA::VALUE-REQUIRED-P : NIL
CHUNGA::COOKIE-SYNTAX : T
Call to CHUNGA:READ-NAME-VALUE-PAIRS (offset 449)
STREAM : #
CHUNGA::VALUE-REQUIRED-P : NIL
CHUNGA::COOKIE-SYNTAX : T
CHAR : #\;
DBG::|accumulator-| : (NIL ("expires" . "Sat, 05-May-2007 05:11:54 GMT") ("Path" . "/realtist"))
DBG::|aux-var-| : (("Path" . "/realtist"))
Binding frame:
CHUNGA:*CURRENT-ERROR-MESSAGE* : NIL
Catch frame: #
Call to DRAKMA::PARSE-SET-COOKIE (offset 426)
STRING : "trac_session=20ae843edfe4ed8c7a3815ec; expires=Sat, 05-May-2007 05:11:54 GMT; Path=/realtist;"
DBG::OBJ : #
DBG::DESC : (# T :DONT-CARE)
STREAM : #
CHUNGA:*CURRENT-ERROR-MESSAGE* : "While parsing cookie header \"trac_session=20ae843edfe4ed8c7a3815ec; expires=Sat, 05-May-2007 05:11:54 GMT; Path=/realtist;\":"
FIRST : T
DRAKMA::NEXT : #\t
DRAKMA::NAME/VALUE : ("trac_session" . "20ae843edfe4ed8c7a3815ec")
DRAKMA::PARAMETERS : NIL
DBG::|accumulator-| : (NIL)
DBG::|aux-var-| : (NIL)
It seems the problem is the semicolon at the end and that this is a
bug that was fixed in Chunga 0.2.3. Are you sure you're using the
latest version?
http://weitz.de/chunga/CHANGELOG.txt
Cheers,
Edi.
From edi at agharta.de Mon Feb 5 00:22:02 2007
From: edi at agharta.de (Edi Weitz)
Date: Mon, 05 Feb 2007 01:22:02 +0100
Subject: [drakma-devel] New version 0.5.5 (Was: Closing streams with
:want-stream t)
In-Reply-To: (Chris Dean's message of
"Sat, 03 Feb 2007 13:10:31 -0800")
References:
Message-ID:
On Sat, 03 Feb 2007 13:10:31 -0800, Chris Dean wrote:
> Regardless, if I make that change I still see the leak.
>
> I have a data set I can send off-list if anyone is interested.
OK, thanks for the data. I think I've found the leak: It happened if
there was a redirect and the server explicitly wanted to close the
first connection, the one which sent the 302. Drakma realized that it
couldn't re-use the socket stream and created a new one, but it
"forgot" to close the old one.
Please try the new release and see if you still have the same
problems.
Thanks for the bug report,
Edi.
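
The leak described above can be modelled in a few lines. This is an illustrative sketch only and not Drakma's source code: FOLLOW-REDIRECTS and the FETCH argument are hypothetical, with FETCH assumed to return an open stream, a status code, and the redirect target (or NIL).

```lisp
;; Illustrative model only; this is not how Drakma is actually written.
;; FETCH is a hypothetical function of one URL returning three values:
;; an open stream, a status code, and the redirect target (or NIL).
(defun follow-redirects (fetch url &optional (max-hops 5))
  "Follow redirects via FETCH, closing each stream that is not re-used."
  (loop
    (multiple-value-bind (stream code location) (funcall fetch url)
      (cond ((and (member code '(301 302 303 307))
                  location
                  (plusp max-hops))
             ;; The stream that delivered the redirect is not re-used in
             ;; this model, so close it before opening the next one.
             ;; Skipping this step is what leaks a socket per redirect.
             (when stream
               (ignore-errors (close stream)))
             (decf max-hops)
             (setf url location))
            (t
             (return (values stream code url)))))))
```

In this model the caller only ever receives the last stream, so every earlier one has to be closed inside the loop; nobody else holds a reference to it.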
From ctdean at sokitomi.com Mon Feb 5 00:28:56 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Sun, 04 Feb 2007 16:28:56 -0800
Subject: [drakma-devel] New version 0.5.5
In-Reply-To: (Edi Weitz's message of "Mon,
05 Feb 2007 01:22:02 +0100")
References:
Message-ID:
Edi Weitz writes:
> Please try the new release and see if you still have the same
> problems.
I will, and I'll let you know the results of my tests.
Cheers,
Chris Dean
From ctdean at sokitomi.com Mon Feb 5 02:20:05 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Sun, 04 Feb 2007 18:20:05 -0800
Subject: [drakma-devel] New version 0.5.5
In-Reply-To: (Edi Weitz's message of "Mon,
05 Feb 2007 01:22:02 +0100")
References:
Message-ID:
> Please try the new release and see if you still have the same
> problems.
I've tried it out on my data and it is certainly much better. But
there are still some connections left after a run of 1646 urls.
I'll take another look at the code later tonight and see if I can
discover anything. Also, I'll send along my run data in a separate
email to interested parties.
Cheers,
Chris Dean
From ctdean at sokitomi.com Mon Feb 5 07:45:34 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Sun, 04 Feb 2007 23:45:34 -0800
Subject: [drakma-devel] New version 0.5.5
In-Reply-To: (Chris Dean's message of
"Sun, 04 Feb 2007 18:20:05 -0800")
References:
Message-ID:
Chris Dean writes:
> I've tried it out on my data and it is certainly much better. But
> there are still some connections left after a run of 1646 urls.
Some urls give an error when parsing the header. We do hit the final
unwind-protect in http-request, but since the error occurs during the
parsing of the header the caller (me) doesn't have the stream object
available to close. One solution is below: set another flag that
indicates whether or not to leave the stream open.
I'll continue testing in case I come across any other issues.
Cheers,
Chris Dean
-------------- next part --------------
A non-text attachment was scrubbed...
Name: drakma-force-open.patch
Type: text/x-patch
Size: 1469 bytes
Desc: drakma-force-open.patch
URL:
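
Since the attached patch is not preserved in the archive, here is a minimal sketch of the general idea only, not of the patch itself: if an error occurs before the stream has been handed to the caller, the function that opened it has to close it. OPEN-AND-PARSE and its two arguments are hypothetical names chosen for illustration.

```lisp
;; Hypothetical sketch; the names are not taken from Drakma or the patch.
;; OPENER returns an open stream, PARSE-HEADERS reads headers from it.
(defun open-and-parse (opener parse-headers)
  "Return the stream from OPENER and the result of PARSE-HEADERS on it.
If parsing signals an error, close the stream before the error propagates."
  (let ((stream (funcall opener))
        (handed-over nil))
    (unwind-protect
         (let ((headers (funcall parse-headers stream)))
           ;; From here on the caller owns the stream.
           (setf handed-over t)
           (values stream headers))
      ;; Early exit: the caller never saw the stream, so close it here.
      (unless handed-over
        (ignore-errors (close stream :abort t))))))
```

The flag distinguishes the two exits from UNWIND-PROTECT: on success the stream stays open for the caller, on an early error it is closed on the spot.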
From edi at agharta.de Mon Feb 5 12:31:05 2007
From: edi at agharta.de (Edi Weitz)
Date: Mon, 05 Feb 2007 13:31:05 +0100
Subject: [drakma-devel] Re: drakma/chunga problem.
In-Reply-To:
(Asbjørn Bjørnstad's message of "Mon,
5 Feb 2007 19:55:34 +0800")
References:
Message-ID:
[Cc to mailing list.]
On Mon, 5 Feb 2007 19:55:34 +0800, "Asbjørn Bjørnstad" wrote:
> BTW, I got a bounce back from the mailing list as I am not a member.
> Is that intentional? (Could stop some from submitting bug reports.)
It's for subscribers only as are almost all mailing lists I know.
Yes, you have to subscribe, but I think it's not asking too much if
you want free support for software you didn't pay for. The
alternative would be that the list would be swamped with spam which is
not really an alternative to me. (And I don't like to handle
questions and bug reports off list either, because it very often means
that you have to say the same thing more than once.)
Cheers,
Edi.
From edi at agharta.de Tue Feb 6 00:46:02 2007
From: edi at agharta.de (Edi Weitz)
Date: Tue, 06 Feb 2007 01:46:02 +0100
Subject: [drakma-devel] New version 0.5.5
In-Reply-To: (Chris Dean's message of
"Sun, 04 Feb 2007 18:20:05 -0800")
References:
Message-ID:
On Sun, 04 Feb 2007 18:20:05 -0800, Chris Dean wrote:
> I've tried it out on my data and it is certainly much better. But
> there are still some connections left after a run of 1646 urls.
I've now done a full run through the test URLs you sent and they
provide for a lot of interesting problematic cases. I'll update
Drakma and probably Chunga with bugfixes and/or workarounds in the
next days.
Thanks,
Edi.
From edi at agharta.de Thu Feb 8 14:30:11 2007
From: edi at agharta.de (Edi Weitz)
Date: Thu, 08 Feb 2007 15:30:11 +0100
Subject: [drakma-devel] New Chunga release 0.2.4
Message-ID:
ChangeLog:
Version 0.2.4
2007-02-08
Allow more characters in cookie names/values according to original Netscape spec
Robustified READ-COOKIE-VALUE
Download:
http://weitz.de/files/chunga.tar.gz
Cheers,
Edi.
From edi at agharta.de Thu Feb 8 14:37:10 2007
From: edi at agharta.de (Edi Weitz)
Date: Thu, 08 Feb 2007 15:37:10 +0100
Subject: [drakma-devel] New Drakma release 0.6.0
Message-ID:
ChangeLog:
Version 0.6.0
2007-02-08
Make sure stream is closed in case of early errors (thanks to Chris Dean for test data)
Robustified cookie parsing
Send all outgoing cookies in one fell swoop (for Sun's buggy web server)
Deal with empty Location headers
Deal with corrupted Content-Type headers
Download:
http://weitz.de/files/drakma.tar.gz
Have fun,
Edi.
From edi at agharta.de Thu Feb 8 14:45:10 2007
From: edi at agharta.de (Edi Weitz)
Date: Thu, 08 Feb 2007 15:45:10 +0100
Subject: [drakma-devel] Several fixes and workarounds (Was: New version
0.5.5)
In-Reply-To: (Edi Weitz's message of "Tue, 06 Feb
2007 01:46:02 +0100")
References:
Message-ID:
On Tue, 06 Feb 2007 01:46:02 +0100, Edi Weitz wrote:
> I've now done a full run through the test URLs you sent and they
> provide for a lot of interesting problematic cases. I'll update
> Drakma and probably Chunga with bugfixes and/or workarounds in the
> next days.
OK, see the new releases of Drakma and Chunga. I can now run Drakma
(tested on LWW 5.0.1) through Chris Dean's 1600+ test cases with only
very few warnings and errors. These are:
1. Charsets like GB2312 that FLEXI-STREAMS doesn't know.
2. Headers sent by the server which are really corrupt.
3. Network-related errors like "Unknown host".
4. Five cases of "End of file while reading ...".
I'm only concerned about #4, but unfortunately these aren't
reproducible. I'll see what I can find out, but if someone has an
idea, please step forward.
FWIW, this is the function I used for testing:
(defun simple-get (url)
  (handler-case
      (let ((puri:*strict-parse* nil)
            (flex:*provide-use-value-restart* t)
            (flex:*substitution-char* #\?))
        (multiple-value-bind (stream code)
            (drakma:http-request url
                                 :cookie-jar (make-instance 'drakma:cookie-jar)
                                 :want-stream t)
          (unwind-protect
               (and stream (eql code 200)
                    (with-output-to-string (out)
                      (do ((ch (read-char stream nil :eof)
                               (read-char stream nil :eof)))
                          ((not (characterp ch)))
                        (princ ch out))))
            (when stream
              (ignore-errors (close stream :abort t))))))
    (error (condition)
      (format t "~&Error (~A): ~A~%" url condition)
      nil)))
Edi.
From ctdean at sokitomi.com Thu Feb 8 19:01:14 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Thu, 08 Feb 2007 11:01:14 -0800
Subject: [drakma-devel] Several fixes and workarounds
In-Reply-To: (Edi Weitz's message of "Thu,
08 Feb 2007 15:45:10 +0100")
References:
Message-ID:
That's great! I'll pull the new versions and run them through my
code.
Cheers,
Chris Dean
From saurabhnanda at gmail.com Fri Feb 9 12:01:18 2007
From: saurabhnanda at gmail.com (Saurabh Nanda)
Date: Fri, 9 Feb 2007 17:31:18 +0530
Subject: [drakma-devel] Not following redirects and conditions
Message-ID: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
Hi,
I'm trying to write some tests for an HTTP based API. I need to check
whether a server responds with a 302 status code and then need to check the
referred location as well, without actually visiting that link.
How is it possible with drakma? If I use :redirect 0 then the http-request
function throws up an error.
TIA
Nandz.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
From edi at agharta.de Fri Feb 9 12:10:30 2007
From: edi at agharta.de (Edi Weitz)
Date: Fri, 09 Feb 2007 13:10:30 +0100
Subject: [drakma-devel] Not following redirects and conditions
In-Reply-To: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
(Saurabh Nanda's message of "Fri, 9 Feb 2007 17:31:18 +0530")
References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
Message-ID:
Hi,
On Fri, 9 Feb 2007 17:31:18 +0530, "Saurabh Nanda" wrote:
> I'm trying to write some tests for an HTTP based API. I need to
> check whether a server responds with a 302 status code and then need
> to check the referred location as well, without actually visiting
> that link.
>
> How is it possible with drakma? If I use :redirect 0 then the
> http-request function throws up an error.
How about setting :REDIRECT to NIL? See also :REDIRECT-METHODS.
Cheers,
Edi.
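
A minimal sketch of how the check could look once :REDIRECT is NIL. REDIRECT-TARGET is a hypothetical helper; it assumes the headers arrive as an alist with keyword keys, which is the form HTTP-REQUEST returns them in as its third value. The URL in the comment is a placeholder.

```lisp
;; Hypothetical helper, not part of Drakma.
(defun redirect-target (code headers)
  "Return the Location header value if CODE is a redirect status, else NIL."
  (when (member code '(301 302 303 307))
    (cdr (assoc :location headers))))

;; Assumed usage with Drakma (needs a network connection, so it is left
;; as a comment; the URL is a placeholder):
;;
;;   (multiple-value-bind (body code headers)
;;       (drakma:http-request "http://example.com/old" :redirect nil)
;;     (declare (ignore body))
;;     (redirect-target code headers))

;; Offline check with hand-written values:
(redirect-target 302 '((:location . "http://example.com/new")
                       (:connection . "close")))
```

With :REDIRECT NIL the 302 response itself is returned, so the status code and the Location header can be inspected without the target ever being visited.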
From saurabhnanda at gmail.com Fri Feb 9 12:21:37 2007
From: saurabhnanda at gmail.com (Saurabh Nanda)
Date: Fri, 9 Feb 2007 17:51:37 +0530
Subject: [drakma-devel] Not following redirects and conditions
In-Reply-To:
References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
Message-ID: <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com>
Super! It works -- I should probably read the documentation more
carefully the next time around!
Nandz.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
From saurabhnanda at gmail.com Fri Feb 9 13:19:29 2007
From: saurabhnanda at gmail.com (Saurabh Nanda)
Date: Fri, 9 Feb 2007 18:49:29 +0530
Subject: [drakma-devel] Not following redirects and conditions
In-Reply-To: <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com>
References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
<794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com>
Message-ID: <794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com>
If ":redirect nil" is set and the server responds with a Set-Cookie:
header and an HTTP redirect, will the cookie be set?
I noticed that when the header is the following the cookie is set --
"Set-Cookie: cookie-name=some-randome-value"
But when the header is the following, the cookie is not set --
"Set-Cookie: cookie-name="
Is this correct, or is it some bug in my tests?
Regards,
Saurabh.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
From edi at agharta.de Fri Feb 9 14:19:41 2007
From: edi at agharta.de (Edi Weitz)
Date: Fri, 09 Feb 2007 15:19:41 +0100
Subject: [drakma-devel] Not following redirects and conditions
In-Reply-To: <794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com>
(Saurabh Nanda's message of "Fri, 9 Feb 2007 18:49:29 +0530")
References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com>
<794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com>
<794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com>
Message-ID:
On Fri, 9 Feb 2007 18:49:29 +0530, "Saurabh Nanda" wrote:
> If ":redirect nil" is set and the server responds with a Set-Cookie:
> header and an HTTP redirect, will the cookie be set?
>
> I noticed that when the header is the following the cookie is set --
> "Set-Cookie: cookie-name=some-randome-value"
>
> But when the header is the following, the cookie is not set --
> "Set-Cookie: cookie-name="
>
> Is this correct, or is it some bug in my tests?
The cookie should be set, with an empty string as its value. The
value of :REDIRECT should not affect this.
If it's not set, it's an error in Drakma and I'd be happy if you could
send me a test case to reproduce it.
From lispercat at gmail.com Mon Feb 12 20:44:40 2007
From: lispercat at gmail.com (Andrei Stebakov)
Date: Mon, 12 Feb 2007 15:44:40 -0500
Subject: [drakma-devel] Problem with file uploading
Message-ID:
This is exactly what I need. The GET method works just fine, but I have
trouble with the POST method uploading the files.
Edi, here is a question (I am not sure if it's the right mailing list to ask
it...)
When I say (this is part of a function, so I use back-quote for parameters):
(drakma:http-request "/some/uri"
:method :post :form-data t
:parameters `(("Name1" . ,name1)
("Name2" . ,name2)
("File" . ,file-name))))
I got an "unknown error" from the remote host. It looks like there is a
problem with streaming the file contents. I did a little debugging by
printing the contents of the file buffer (in the send-content function),
and it looks like the file is being opened and read, but something goes
wrong at the receiving end.
I wonder how I can debug it further.
When I do the same request from the FORM in Firefox everything works.
What debugging techniques can I try here? (Sorry, I am still very new to
Lisp)
Thank you,
Andrew
From ctdean at sokitomi.com Mon Feb 12 20:53:59 2007
From: ctdean at sokitomi.com (Chris Dean)
Date: Mon, 12 Feb 2007 12:53:59 -0800
Subject: [drakma-devel] Problem with file uploading
In-Reply-To:
(Andrei Stebakov's message of "Mon, 12 Feb 2007 15:44:40 -0500")
References:
Message-ID:
"Andrei Stebakov" writes:
> This is exactly what I need. The GET method works just fine, but I
> have trouble with the POST method uploading the files.
I've never used POST with drakma. So I can only offer very general
advice.
One way to debug the system is to test against your own server. You
could, for example, use hunchentoot to easily create a test webserver.
Once you have control of the server you can debug both sides of the
problem.
Cheers,
Chris Dean
From edi at agharta.de Mon Feb 12 21:01:50 2007
From: edi at agharta.de (Edi Weitz)
Date: Mon, 12 Feb 2007 22:01:50 +0100
Subject: [drakma-devel] Problem with file uploading
In-Reply-To:
(Andrei Stebakov's message of "Mon, 12 Feb 2007 15:44:40 -0500")
References:
Message-ID:
On Mon, 12 Feb 2007 15:44:40 -0500, "Andrei Stebakov" wrote:
> (drakma:http-request "/some/uri"
> :method :post :form-data t
For file uploads you don't need the ":FORM-DATA T" part.
> :parameters `(("Name1" . ,name1)
> ("Name2" . ,name2)
> ("File" . ,file-name)))
Without knowing what "/some/uri", NAME1, NAME2, and FILE-NAME are this
is hard to say. Is FILE-NAME really a pathname object?
Or maybe the receiving web server can't cope with chunked transfer
encoding (like Apache 1.x)? Then you'll have to add
:CONTENT-LENGTH T
to the call, but note that this will force Drakma to compose the whole
request body in RAM before sending it which might not work for /very/
large files.
HTH,
Edi.
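
The two points above (the file parameter must be a pathname object, and :CONTENT-LENGTH T avoids chunked encoding) can be put together roughly as follows. UPLOAD-PARAMETERS is a hypothetical helper, and the URL and file name in the comment are placeholders.

```lisp
;; Hypothetical helper, not part of Drakma: make sure the "File" entry is
;; a pathname object rather than a string.
(defun upload-parameters (name1 name2 file)
  "Build a :PARAMETERS alist whose \"File\" entry is a pathname object."
  `(("Name1" . ,name1)
    ("Name2" . ,name2)
    ("File" . ,(pathname file))))

;; Assumed usage with Drakma (needs a server, so it is left as a comment):
;;
;;   (drakma:http-request "http://example.com/upload"
;;                        :method :post
;;                        :content-length t   ; build the body in RAM, no chunking
;;                        :parameters (upload-parameters "a" "b" "/tmp/image.png"))

;; Offline check that the file parameter really is a pathname:
(pathnamep (cdr (assoc "File"
                       (upload-parameters "a" "b" "/tmp/image.png")
                       :test #'string=)))
```

Calling PATHNAME on the argument accepts both strings and pathnames, so callers who pass a plain file name still end up with the pathname object that a file upload needs.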
From edi at agharta.de Mon Feb 12 21:04:35 2007
From: edi at agharta.de (Edi Weitz)
Date: Mon, 12 Feb 2007 22:04:35 +0100
Subject: [drakma-devel] Problem with file uploading
In-Reply-To: (Chris Dean's message of "Mon,
12 Feb 2007 12:53:59 -0800")
References:
Message-ID:
On Mon, 12 Feb 2007 12:53:59 -0800, Chris Dean wrote:
> One way to debug the system is to test against your own server. You
> could, for example, use hunchentoot to easily create a test
> webserver. Once you have control of the server you can debug both
> sides of the problem.
Of course, this won't help much if Hunchentoot and the /real/ server
behave differently. (See my other email for an example - Hunchentoot
knows how to handle chunked transfer encoding used by clients, Apache
1.x doesn't.)
Another way to debug Drakma is to use *HEADER-STREAM* to see at least
the headers flying by.
http://weitz.de/drakma/#*header-stream*
Or use something like Ethereal (or whatever it is called nowadays).
From lispercat at gmail.com Wed Feb 14 18:56:28 2007
From: lispercat at gmail.com (Andrei Stebakov)
Date: Wed, 14 Feb 2007 13:56:28 -0500
Subject: [drakma-devel] Problem with file uploading
In-Reply-To:
References:
Message-ID:
The headers printed are following:
GET /authentication.getToken.cp?appKey=1234 HTTP/1.1
Host: domain.com
User-Agent: Drakma/0.6.0 (CMU Common Lisp CVS release-19a
19a-release-20040728 + minimal debian patches; Linux; Linux version
2.2.20-idepci (herbert at gondolin) (gcc version 2.7.2.3) #1 Sat Apr 20
12:45:19 EST 2002; http://weitz.de/drakma/)
Accept: */*
Connection: close
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 74
Content-Type: text/xml
Date: Wed, 14 Feb 2007 18:52:53 GMT
Connection: close
POST /image.upload.cp HTTP/1.1
Host: domain.com
User-Agent: Drakma/0.6.0 (CMU Common Lisp CVS release-19a
19a-release-20040728 + minimal debian patches; Linux; Linux version
2.2.20-idepci (herbert at gondolin) (gcc version 2.7.2.3) #1 Sat Apr 20
12:45:19 EST 2002; http://weitz.de/drakma/)
Accept: */*
Connection: close
Content-Type: multipart/form-data;
boundary=----------WueD0PVGvZzxvyK3835D6znnVITzpU5zaysqeYq41qhj1Nlv79
Content-Length: 521
HTTP/1.1 100 Continue
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 94
Content-Type: text/xml; charset=utf-8
Date: Wed, 14 Feb 2007 18:53:01 GMT
Set-Cookie: Coyote-2-c0a8017a=c0a8073f:0;Max-Age=1800;Path=/
Connection: close
((:CACHE-CONTROL . "private") (:CONTENT-LENGTH . "94")
(:CONTENT-TYPE . "text/xml; charset=utf-8")
(:DATE . "Wed, 14 Feb 2007 18:53:01 GMT")
(:SET-COOKIE . "Coyote-2-c0a8017a=c0a8073f:0;Max-Age=1800;Path=/")
(:CONNECTION . "close"))
Maybe, as you mentioned, it's that the server I am trying to upload images
to doesn't understand chunked stream?
Thank you,
Andrew
From lispercat at gmail.com Wed Feb 14 21:11:55 2007
From: lispercat at gmail.com (Andrei Stebakov)
Date: Wed, 14 Feb 2007 16:11:55 -0500
Subject: [drakma-devel] Problem with file uploading
In-Reply-To:
References:
Message-ID:
Hi Edi,
First of all sorry for bringing it to the lisp NG. I didn't want to discuss
it there, I just wanted to hear what libs are available.
For some reason, my gmail client didn't show your messages so the last one
was the one from Chris.
Basically, setting :CONTENT-LENGTH T and sending a pathname object instead
of a string solved the problem. Now I understand that with :CONTENT-LENGTH nil
it was sending the chunked data. I still don't understand why when I send
the request without :CONTENT-LENGTH T and giving a file name starting with
p# the lisp process hangs (cmucl), maybe it's just the lisp implementation.
Anyway, the problem is solved, thank you Edi and Chris!
Andrew
On 2/12/07, Edi Weitz wrote:
>
> On Mon, 12 Feb 2007 12:53:59 -0800, Chris Dean
> wrote:
>
> > One way to debug the system is to test against your own server. You
> > could, for example, use hunchentoot to easily create a test
> > webserver. Once you have control of the server you can debug both
> > sides of the problem.
>
> Of course, this won't help much if Hunchentoot and the /real/ server
> behave differently. (See my other email for an example - Hunchentoot
> knows how to handle chunked transfer encoding used by clients, Apache
> 1.x doesn't.)
>
> Another way to debug Drakma is to use *HEADER-STREAM* to see at least
> the headers flying by.
>
> http://weitz.de/drakma/#*header-stream*
>
> Or use something like Ethereal (or whatever it is called nowadays).
From edi at agharta.de Wed Feb 14 22:23:45 2007
From: edi at agharta.de (Edi Weitz)
Date: Wed, 14 Feb 2007 23:23:45 +0100
Subject: [drakma-devel] Problem with file uploading
In-Reply-To:
(Andrei Stebakov's message of "Wed, 14 Feb 2007 16:11:55 -0500")
References:
Message-ID:
On Wed, 14 Feb 2007 16:11:55 -0500, "Andrei Stebakov" wrote:
> For some reason, my gmail client didn't show your messages so the
> last one was the one from Chris.
Maybe they ended up in the spam folder?
> Basically, setting :CONTENT-LENGTH T and sending a pathname object
> instead of a string solves the problem. Now I understand that with
> :CONTENT-LENGTH NIL it was sending chunked data.
Good.
> I still don't understand why the Lisp process (CMUCL) hangs when I
> send the request without :CONTENT-LENGTH T and give a file name
> starting with #p; maybe it's just the Lisp implementation.
No, I think that's pretty clear: If the server doesn't understand
chunked encoding, then Drakma will try to send the file (and it'll
only do that if you're using a pathname, i.e. the #P"" syntax), but
the server won't accept it - it doesn't know how much it is supposed
to read from the stream. So, Drakma tries to send data, but on the
other end nobody is reading the data. That's why the whole thing
appears to be hanging. You'll probably get a timeout at some point if
you wait long enough.
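A minimal sketch of the non-chunked upload discussed in this thread, assuming the file is passed as the request body through Drakma's :content argument. The function name UPLOAD-FILE, the URL, and the path are hypothetical, and loading Drakma through ASDF is an assumption about the local setup.

```lisp
;; Assumption: ASDF can find Drakma on this system.
(require :asdf)
(asdf:load-system :drakma)

(defun upload-file (url pathname)
  "POST the file at PATHNAME to URL with a Content-Length header
instead of chunked transfer encoding."
  (drakma:http-request url
                       :method :post
                       ;; A pathname (not a string) makes Drakma send the
                       ;; contents of the file as the request body.
                       :content pathname
                       ;; T asks Drakma to compute the length itself, so the
                       ;; body is not sent with chunked encoding.
                       :content-length t))

;; Hypothetical usage:
;; (upload-file "http://localhost:4242/upload" #p"/tmp/example.dat")
```

With :content-length left at NIL, a pathname body would go out chunked, which is the case that hangs against a server that does not understand chunked requests.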
From edi at agharta.de Thu Feb 22 18:09:59 2007
From: edi at agharta.de (Edi Weitz)
Date: Thu, 22 Feb 2007 19:09:59 +0100
Subject: [drakma-devel] Darcs repositories
Message-ID:
[My apologies if you get this more than once.]
Several people have asked for Darcs repositories of my software.
These do exist now:
http://common-lisp.net/~loliveira/ediware/
Special thanks to Luís Oliveira who made this possible and who
maintains the repositories.
Cheers,
Edi.
From jeffrey at cunningham.net Sat Feb 24 17:07:25 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sat, 24 Feb 2007 09:07:25 -0800
Subject: [drakma-devel] Bug handling bad html?
Message-ID: <20070224170725.GA23865@achilles.olympus.net>
I was playing with drakma and had it drop into the debugger when
retrieving a commercial page. It looks like it might be a bug in
flexi-streams, but I don't know how to isolate the input more
specifically than what came up here:
Unexpected value #xA0 at start of UTF-8 sequence.
[Condition of type FLEXI-STREAMS:FLEXI-STREAM-ENCODING-ERROR]
Restarts:
0: [ABORT] Abort SLIME compilation.
1: [ABORT] Return to SLIME's top level.
2: [TERMINATE-THREAD] Terminate this thread (#)
Backtrace:
0: (FLEXI-STREAMS::SIGNAL-ENCODING-ERROR
#
"Unexpected value #x~X at start of UTF-8 sequence."
160)
1: (FLEXI-STREAMS::SIGNAL-ENCODING-ERROR
#
"Unexpected value #x~X at start of UTF-8 sequence.")
2: ((FLET #:BODY-FN327))
3: ((SB-PCL::FAST-METHOD STREAM-READ-CHAR
(FLEXI-STREAMS::FLEXI-UTF-8-INPUT-STREAM))
#
#
#)
4: ((SB-PCL::FAST-METHOD TRIVIAL-GRAY-STREAMS:STREAM-READ-SEQUENCE
(FLEXI-STREAMS:FLEXI-INPUT-STREAM #1="#<...>" . #1#))
#
#
#
#
#
#)
5: (READ-SEQUENCE
"y make a difference this holiday season. Our gift ideas
are unique and of high quality.
Gift ideas for every occasion, Christmas, Birthday, Mother's day...
Gift ideas for every occasion, Christmas, Birthday, Mothers day, Graduation, Fathers day, Anniversary, Wedding, & Baby Shower.
Hanukkah card, Christmas gift idea and Holiday greeting cards from MixedBlessing
Greeting Cards for Interfaith and Multicultures from MixedBlesing. Hanukkah cards, Holiday cards, Christmas Gift Ideas, Holiday Gifts and more.. Find great gifts now!
..)
6: (DRAKMA::READ-BODY
#
((:DATE . "Sat, 24 Feb 2007 06:30:03 GMT")
(:SERVER . "Apache/2.0.46 (Red Hat)")
(:SET-COOKIE
. "GS_UUID=24.18.193.65.1172298603635841; path=/,PHPSESSID=e009a521cb2bf134a00df925e4f4d510; path=/,cart_hash=e009a521cb2bf134a00df925e4f4d510; expires=Tuesday, 27-Feb-07 06:30:03 GMT; path=/")
(:X-POWERED-BY . "PHP/4.4.0")
(:EXPIRES . "Thu, 19 Nov 1981 08:52:00 GMT")
(:CACHE-CONTROL
. "no-store, no-cache, must-revalidate, post-check=0, pre-check=0") ..))
7: ((LABELS DRAKMA::FINISH-REQUEST) NIL NIL)
8: (HTTP-REQUEST
#
:PROXY
NIL)
9: (RETRIEVE-URI
"http://www.gifttree.com/Christmas/Christmas-gift-idea.html"
NIL)
10: (WALK-SITE
"http://www.gifttree.com/Christmas/Christmas-gift-idea.html"
#
#
#
#
#
#)
11: (SB-FASL::FOP-FUNCALL)
12: (SB-FASL::LOAD-FASL-GROUP
#)
13: (SB-FASL::LOAD-AS-FASL
#
NIL
#)
14: (SB-FASL::INTERNAL-LOAD
#P"/tmp/fileIQGlqR.fasl"
#P"/tmp/fileIQGlqR.fasl"
:ERROR
NIL
NIL
:BINARY
NIL)
15: (SB-FASL::INTERNAL-LOAD
#P"/tmp/fileIQGlqR.fasl"
#P"/tmp/fileIQGlqR.fasl"
:ERROR
NIL
NIL
NIL
:DEFAULT)
16: (LOAD #P"/tmp/fileIQGlqR.fasl")
17: ((LAMBDA (STRING &KEY #1="#<...>" . #1#))
"(print (walk-site \"http://www.gifttree.com\"))
"
:BUFFER
"seo.lisp"
:POSITION
27060
:DIRECTORY
#)
18: ((LAMBDA ()))
--more--
--Jeff
From edi at agharta.de Sat Feb 24 20:47:15 2007
From: edi at agharta.de (Edi Weitz)
Date: Sat, 24 Feb 2007 21:47:15 +0100
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070224170725.GA23865@achilles.olympus.net> (Jeffrey
Cunningham's message of "Sat, 24 Feb 2007 09:07:25 -0800")
References: <20070224170725.GA23865@achilles.olympus.net>
Message-ID:
On Sat, 24 Feb 2007 09:07:25 -0800, Jeffrey Cunningham wrote:
> I was playing with drakma and had it drop into the debugger when
> retrieving a commercial page. It looks like it might be a bug in
> flexi-streams, but I don't know how to isolate the input more
> specifically than what came up here:
>
> Unexpected value #xA0 at start of UTF-8 sequence.
My guess is that the website sends wrong content-type headers. (Or,
in other words, it claims to send UTF-8 but it doesn't.) This is not
unusual. See the mailing list archive of the last weeks for similar
problems and for workarounds.
If you still think this is a bug in FLEXI-STREAMS, send a simple,
reproducible test case and point out where in the sequence of
characters FLEXI-STREAMS thinks it's not UTF-8 anymore although it is.
Thanks,
Edi.
From edi at agharta.de Sat Feb 24 22:20:36 2007
From: edi at agharta.de (Edi Weitz)
Date: Sat, 24 Feb 2007 23:20:36 +0100
Subject: [drakma-devel] Re: Portability of Drakma
In-Reply-To: (Erik
Huelsmann's message of "Sat, 24 Feb 2007 22:35:10 +0100")
References:
Message-ID:
Hi Erik,
I'm sending a copy of this to the mailing list where I think we should
continue this discussion.
On Sat, 24 Feb 2007 22:35:10 +0100, "Erik Huelsmann" wrote:
> I've been working on a very portable library for sockets code. This
> library is now more portable than trivial-sockets and supports more
> functionality on all of its supported lisp implementations.
>
> If you want to support the same platforms (and all the ones I'll be
> adding), you could switch from the -unmaintained- trivial-sockets to
> usocket (http://common-lisp.net/project/usocket/).
>
> I'm merely sending this mail to point out the existence of the
> library, in case you didn't know. Thanks for your time, attention
> and continued support for Common Lisp libraries.
I'm aware of usocket's existence because Andreas Fuchs pointed it out
to me shortly after I had released the portable version of Drakma
(using trivial-sockets). At that point I tried to switch to usocket
and immediately ran into problems - IIRC it didn't even load on
LispWorks on Windows (although ISTR the website claimed that LispWorks
was a supported implementation), and it couldn't provide binary socket
streams for all supported implementations. So, I dismissed it for the
time being.
It might well be the case that both of these issues have been fixed
since, but I currently don't have the time to test again. I generally
think it's better to rely on a maintained and documented library than
on obscure and old code, but of course the new code should work at
least as well as the old one.
I'd be happy to accept patches to switch Drakma from trivial-sockets
to usocket, but the following criteria should be met:
- The LispWorks code should remain untouched (i.e. not use usocket).
- The code should have been tested successfully on at least the
Lisp/OS combinations that are currently supported by Drakma.
The actual patch itself should be a piece of cake, but I guess the
testing will take some time.
Thanks,
Edi.
From jeffrey at cunningham.net Sun Feb 25 00:39:54 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sat, 24 Feb 2007 16:39:54 -0800
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To:
References: <20070224170725.GA23865@achilles.olympus.net>
Message-ID: <20070225003954.GA32401@achilles.olympus.net>
On Sat Feb 24, 2007 at 09:47:15PM +0100, Edi Weitz wrote:
> My guess is that the website sends wrong content-type headers. (Or,
> in other words, it claims to send UTF-8 but it doesn't.) This is not
> unusual. See the mailing list archive of the last weeks for similar
> problems and for workarounds.
>
> If you still think this is a bug in FLEXI-STREAMS, send a simple,
> reproducible test case and point out where in the sequence of
> characters FLEXI-STREAMS thinks it's not UTF-8 anymore although it is.
I believe you are right - incorrectly identified content-type. This
gets it to work:
(setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0))
(setf flexi-streams::*PROVIDE-USE-VALUE-RESTART* t)
(http-request "http://www.gifttree.com/Christmas/Christmas-gift-idea.html")
And I read about the performance hit associated with setting this up
as a default. But it seems like it raises some issues - at least for
what I'm doing, which is trying to automate updating information about
some sites I have no control over. In this case I set it to make a
substitution for the 'bad' character. Is it possible for there to be
more than one? If so, how could that be handled?
And more generally, should there not be a way to set drakma so it may
take a performance hit but is guaranteed not to die on any html that
is thrown at it?
Thanks,
--Jeff
From edi at agharta.de Sun Feb 25 10:25:04 2007
From: edi at agharta.de (Edi Weitz)
Date: Sun, 25 Feb 2007 11:25:04 +0100
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225003954.GA32401@achilles.olympus.net> (Jeffrey
Cunningham's message of "Sat, 24 Feb 2007 16:39:54 -0800")
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
Message-ID:
On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote:
> In this case I set it to make a substitution for the 'bad'
> character. Is it possible for there to be more than one?
Not yet. See current discussion on the FLEXI-STREAMS mailing list.
> And more generally, should there not be a way to set drakma so it
> may take a performance hit but is guaranteed not to die on any html
> that is thrown at it?
It's not dying, it just signals an error.
And, no, I don't think there's a way to provide meaningful results and
at the same time to be prepared to accept whatever bogus data or
headers the server chooses to send. If you find something like that,
send patches, but it sounds like magic (or at least very good AI) to
me.
As for dealing with wrong character encodings, there are already ways
to deal with that. You cited one yourself. Another one would be to
read everything as binary data (and then to decode it yourself if
needed).
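A minimal sketch of the binary route, assuming Drakma's :force-binary argument and FLEXI-STREAMS' OCTETS-TO-STRING. The function names DECODE-BODY and FETCH-AS-STRING and the choice of Latin-1 as the encoding are illustrative assumptions. Latin-1 maps every octet to a character, so decoding cannot signal an encoding error, though the result may not be what the server intended.

```lisp
;; Assumption: ASDF can find Drakma on this system.
(require :asdf)
(asdf:load-system :drakma)   ; also loads FLEXI-STREAMS, a Drakma dependency

(defun decode-body (octets &optional (external-format :latin-1))
  "Decode OCTETS (a vector of octets) into a string using EXTERNAL-FORMAT."
  (flexi-streams:octets-to-string octets :external-format external-format))

(defun fetch-as-string (url)
  "Fetch URL as raw octets, ignoring the declared charset, then decode them."
  (let ((octets (drakma:http-request url :force-binary t)))
    (decode-body octets)))

;; Hypothetical usage:
;; (fetch-as-string "http://bad-host/bad-page.html")
```

Keeping the decoding step in its own function makes it easy to try a different external format once the real encoding of the page is known.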
From ehuels at gmail.com Sun Feb 25 15:35:39 2007
From: ehuels at gmail.com (Erik Huelsmann)
Date: Sun, 25 Feb 2007 16:35:39 +0100
Subject: [drakma-devel] Fwd: Portability of Drakma
In-Reply-To:
References:
Message-ID:
Forwarding rejected message.
I wasn't subscribed yet. Sorry.
bye,
Erik.
---------- Forwarded message ----------
From: Erik Huelsmann
Date: Feb 25, 2007 2:38 PM
Subject: Re: Portability of Drakma
To: Edi Weitz
Cc: drakma-devel at common-lisp.net
On 2/24/07, Edi Weitz wrote:
> Hi Erik,
>
> I'm sending a copy of this to the mailing list where I think we should
> continue this discussion.
Ah. Sorry about that, I wasn't aware of this list.
> > I've been working on a very portable library for sockets code. This
> > library is now more portable than trivial-sockets and supports more
> > functionality on all of its supported lisp implementations.
>
> I'm aware of usocket's existence because Andreas Fuchs pointed it out
> to me shortly after I had released the portable version of Drakma
> (using trivial-sockets). At that point I tried to switch to usocket
> and immediately ran into problems - IIRC it didn't even load on
> LispWorks on Windows (although ISTR the website claimed that LispWorks
> was a supported implementation), and it couldn't provide binary socket
> streams for all supported implementations. So, I dismissed it for the
> time being.
That's both great and bad news: It's great you're aware of the usocket
project, it's too bad you tried and failed.
> It might well be the case that both of these issues have been fixed
> since, but I currently don't have the time to test again. I generally
> think it's better to rely on a maintained and documented library than
> on obscure and old code, but of course the new code should work at
> least as well as the old one.
Absolutely. New code shouldn't be a step backward. With that
requirement, a chicken-and-egg problem is introduced though: to
develop well-tested code, it needs to be (widely) used.
But to address your findings: you probably used one of the very first
releases. With 0.3.0, binary streams are supported on all
implementations. Besides that, I just downloaded and used LW5.0 to
test a simple GET request: all seems to work well. Indeed, there have
been win32-related fixes to many backends.
> I'd be happy to accept patches to switch Drakma from trivial-sockets
> to usocket, but the following criteria should be met:
>
> - The LispWorks code should remain untouched (i.e. not use usocket).
>
> - The code should have been tested successfully on at least the
> Lisp/OS combinations that are currently supported by Drakma.
Is there a list somewhere as a reference to what I'm getting into?
> The actual patch itself should be a piece of cake, but I guess the
> testing will take some time.
Yes. Not having a Mac, I won't be able to test OpenMCL myself, but
maybe others can assist there?
Thanks for your time.
bye,
Erik.
From jeffrey at cunningham.net Sun Feb 25 16:26:45 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sun, 25 Feb 2007 08:26:45 -0800
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To:
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
Message-ID: <20070225162645.GB16675@achilles.olympus.net>
On Sun Feb 25, 2007 at 11:25:04AM +0100, Edi Weitz wrote:
> On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote:
>
> > In this case I set it to make a substitution for the 'bad'
> > character. Is it possible for there to be more than one?
>
> Not yet. See current discussion on the FLEXI-STREAMS mailing list.
>
> > And more generally, should there not be a way to set drakma so it
> > may take a performance hit but is guaranteed not to die on any html
> > that is thrown at it?
>
> It's not dying, it just signals an error.
>
> And, no, I don't think there's a way to provide meaningful results and
> at the same time to be prepared to accept whatever bogus data or
> headers the server chooses to send. If you find something like that,
> send patches, but it sounds like magic (or at least very good AI) to
> me.
I guess I disagree.
If I try to access a page like that using: links, lynx, wget, mozilla,
firefox, or any html parsing entity I can think of they don't stop
functioning, signal an error, or whatever you want to call it. They
give me their best approximation of the content. Seems like that ought
to be the goal here, or at least a possibility.
In an automated process, signaling an error means that processing has
stopped (or 'died'). The source of the error signal may be in
flexi-streams (I have read the discussions on that list), but it's
Drakma that has to deal with its consequences.
How do the above mentioned applications manage this problem? Certainly
not by magic. And I doubt the AI in links or lynx is very
sophisticated.
--Jeff
From vodonosov at mail.ru Sun Feb 25 17:00:03 2007
From: vodonosov at mail.ru (Anton Vodonosov)
Date: Sun, 25 Feb 2007 19:00:03 +0200
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225162645.GB16675@achilles.olympus.net>
References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
Message-ID: <45E1C093.2060800@mail.ru>
Hi, Jeff.
In this case, "signaling an error" means that
work can proceed.
(setq *provide-use-value-restart* t)
(handler-bind
    ((flexi-stream-encoding-error (lambda (condition)
                                    (declare (ignore condition))
                                    (use-value #\?))))
  (drakma:http-request "http://bad-host/bad-page.html"))
This is the example from the FLEXI-STREAMS documentation.
You can easily get "the best approximation of the content"
using Drakma, but with more control. So it is unclear to me
what problem you have.
-Anton
Jeffrey Cunningham:
> On Sun Feb 25, 2007 at 11:25:04AM +0100, Edi Weitz wrote:
>> On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote:
>>
>>> In this case I set it to make a substitution for the 'bad'
>>> character. Is it possible for there to be more than one?
>> Not yet. See current discussion on the FLEXI-STREAMS mailing list.
>>
>>> And more generally, should there not be a way to set drakma so it
>>> may take a performance hit but is guaranteed not to die on any html
>>> that is thrown at it?
>> It's not dying, it just signals an error.
>>
>> And, no, I don't think there's a way to provide meaningful results and
>> at the same time to be prepared to accept whatever bogus data or
>> headers the server chooses to send. If you find something like that,
>> send patches, but it sounds like magic (or at least very good AI) to
>> me.
>
> I guess I disagree.
>
> If I try to access a page like that using: links, lynx, wget, mozilla,
> firefox, or any html parsing entity I can think of they don't stop
> functioning, signal an error, or whatever you want to call it. They
> give me their best approximation of the content. Seems like that ought
> to be the goal here, or at least a possibility.
>
> In an automated process, signaling an error means that processing has
> stopped (or 'died'). The source of the error signal may be in
> flexi-streams (I have read the discussions on that list), but it's
> Drakma that has to deal with its consequences.
>
> How do the above mentioned applications manage this problem? Certainly
> not by magic. And I doubt the AI in links or lynx is very
> sophisticated.
>
>
> --Jeff
From jeffrey at cunningham.net Sun Feb 25 17:23:45 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sun, 25 Feb 2007 09:23:45 -0800
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <45E1C093.2060800@mail.ru>
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
<45E1C093.2060800@mail.ru>
Message-ID: <20070225172345.GA23630@achilles.olympus.net>
On Sun Feb 25, 2007 at 07:00:03PM +0200, Anton Vodonosov wrote:
> Hi, Jeff.
>
> In this case, "signaling an error" means that
> work can proceed.
>
> (setq *provide-use-value-restart* t)
>
> (handler-bind
>     ((flexi-stream-encoding-error (lambda (condition)
>                                     (declare (ignore condition))
>                                     (use-value #\?))))
>   (drakma:http-request "http://bad-host/bad-page.html"))
>
>
> This is the example from the FLEXI-STREAMS documentation.
>
> You can easily get "the best approximation of the content"
> using Drakma, but with more control. So it is unclear to me
> what problem you have.
>
> -Anton
Hi Anton,
Thanks for the help. Will the example above work for any bad
character, or only the one set by
(setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0))
The only example I've run across is the site I mentioned, but it seems
like the possibilities for bad html are endless.
--Jeff
From vodonosov at mail.ru Sun Feb 25 17:43:26 2007
From: vodonosov at mail.ru (Anton Vodonosov)
Date: Sun, 25 Feb 2007 19:43:26 +0200
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225172345.GA23630@achilles.olympus.net>
References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru>
<20070225172345.GA23630@achilles.olympus.net>
Message-ID: <45E1CABE.8020605@mail.ru>
You misunderstand the meaning of *substitution-char*.
This is the character that will be used as a
substitution for all badly encoded characters.
Thus, this example is equivalent to
(setq flexi-streams::*provide-use-value-restart* t)
(setf flexi-streams::*SUBSTITUTION-CHAR* #\?)
You will have ? instead of any wrong character.
I.e., you can use whatever mechanism you like:
*substitution-char* for most cases, or the USE-VALUE restart
if you want more control (for example, if you want to
use ? as a substitution for every even wrong byte sequence
and * for every odd one, count encoding errors,
log them to a file, or something similar).
Read the docs, http://weitz.de/flexi-streams/
-Anton
Jeffrey Cunningham:
> On Sun Feb 25, 2007 at 07:00:03PM +0200, Anton Vodonosov wrote:
>> Hi, Jeff.
>>
>> In this case, "signaling an error" means that
>> work can proceed.
>>
>> (setq *provide-use-value-restart* t)
>>
>> (handler-bind
>>     ((flexi-stream-encoding-error (lambda (condition)
>>                                     (declare (ignore condition))
>>                                     (use-value #\?))))
>>   (drakma:http-request "http://bad-host/bad-page.html"))
>>
>>
>> This is the example from the FLEXI-STREAMS documentation.
>>
>> You can easily get "the best approximation of the content"
>> using Drakma, but with more control. So it is unclear to me
>> what problem you have.
>>
>> -Anton
>
> Hi Anton,
>
> Thanks for the help. Will the example above work for any bad
> character, or only the one set by
>
> (setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0))
>
> The only example I've run across is the site I mentioned, but it seems
> like the possibilities for bad html are endless.
>
> --Jeff
From nowhere.man at levallois.eu.org Sun Feb 25 18:06:25 2007
From: nowhere.man at levallois.eu.org (Pierre THIERRY)
Date: Sun, 25 Feb 2007 19:06:25 +0100
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225162645.GB16675@achilles.olympus.net>
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
Message-ID: <20070225180625.GW7500@bateleur.arcanes.fr.eu.org>
Scribit Jeffrey Cunningham dies 25/02/2007 hora 08:26:
> > If you find something like that, send patches, but it sounds like
> > magic (or at least very good AI) to me.
>
> I guess I disagree.
>
> If I try to access a page like that using: links, lynx, wget, mozilla,
> firefox, or any html parsing entity I can think of they don't stop
> functioning, signal an error, or whatever you want to call it. They
> give me their best approximation of the content. Seems like that ought
> to be the goal here, or at least a possibility.
AFAICS, those browsers just substitute bad bytes with a single
substitution glyph. My Firefox uses a white question mark in a
black diamond.
You can already achieve that with flexi-streams, IIUC.
Quickly,
Pierre
--
nowhere.man at levallois.eu.org
OpenPGP 0xD9D50D8A
From jeffrey at cunningham.net Sun Feb 25 19:34:06 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sun, 25 Feb 2007 11:34:06 -0800
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <45E1CABE.8020605@mail.ru>
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
<45E1C093.2060800@mail.ru>
<20070225172345.GA23630@achilles.olympus.net>
<45E1CABE.8020605@mail.ru>
Message-ID: <20070225193406.GA26412@achilles.olympus.net>
On Sun Feb 25, 2007 at 07:43:26PM +0200, Anton Vodonosov wrote:
> You misunderstand the meaning of *substitution-char*.
> This is the character that will be used as a
> substitution for all badly encoded characters.
>
> Thus, this example is equivalent to
> (setq flexi-streams::*provide-use-value-restart* t)
> (setf flexi-streams::*SUBSTITUTION-CHAR* #\?)
>
> You will have ? instead of any wrong character.
>
> I.e., you can use whatever mechanism you like:
> *substitution-char* for most cases, or the USE-VALUE restart
> if you want more control (for example, if you want to
> use ? as a substitution for every even wrong byte sequence
> and * for every odd one, count encoding errors,
> log them to a file, or something similar).
You're right, Anton, I did misunderstand the meaning. Thank you for
clearing that up.
--Jeff
From edi at agharta.de Sun Feb 25 20:38:15 2007
From: edi at agharta.de (Edi Weitz)
Date: Sun, 25 Feb 2007 21:38:15 +0100
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225162645.GB16675@achilles.olympus.net> (Jeffrey
Cunningham's message of "Sun, 25 Feb 2007 08:26:45 -0800")
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
Message-ID:
On Sun, 25 Feb 2007 08:26:45 -0800, Jeffrey Cunningham wrote:
> If I try to access a page like that using: links, lynx, wget,
> mozilla, firefox, or any html parsing entity I can think of they
> don't stop functioning, signal an error, or whatever you want to
> call it. They give me their best approximation of the content. Seems
> like that ought to be the goal here, or at least a possibility.
>
> In an automated process, signaling an error means that processing
> has stopped (or 'died'). The source of the error signal may be in
> flexi-streams (I have read the discussions on that list), but
> it's Drakma that has to deal with its consequences.
You are missing two crucial points:
1. The applications you listed are just that - monolithic
applications. You either use them for what they are intended or
you leave them alone. They'd better be as permissive as possible.
Drakma, OTOH, is a library - a tool or building block used by
programmers to build applications. It should do what it advertises
to do correctly - not more and not less. And if that's not what
the programmer expected, he can tweak it as much as he wants.
(That doesn't imply that he modifies the library itself, but as
Drakma is open source he can do even that, if deemed necessary.)
2. In Common Lisp, signalling an error doesn't mean that processing
has stopped. If that is news to you, you might want to read, for
example, the chapter about conditions and restarts in Peter
Seibel's book.
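The second point can be sketched in plain Common Lisp, with no Drakma or FLEXI-STREAMS involved. DECODE-OCTET and DECODE-LENIENTLY are made-up names, and the ASCII-only decoder is only a stand-in for a real one: it signals an error on a bad octet but offers a USE-VALUE restart, and the caller decides how to continue.

```lisp
(defun decode-octet (octet)
  "Return the character for OCTET if it is 7-bit ASCII. Otherwise signal
an error, offering a USE-VALUE restart that supplies a replacement."
  (restart-case
      (if (< octet 128)
          (code-char octet)
          (error "Unexpected value #x~X." octet))
    (use-value (replacement)
      replacement)))

(defun decode-leniently (octets)
  "Decode OCTETS, substituting #\\? for every octet that cannot be decoded."
  (handler-bind ((error (lambda (condition)
                          (declare (ignore condition))
                          ;; The handler runs before the stack is unwound,
                          ;; so the restart set up in DECODE-OCTET is still
                          ;; available and decoding resumes from there.
                          (use-value #\?))))
    (map 'string #'decode-octet octets)))

(decode-leniently #(72 105 #xA0 33))   ; => "Hi?!"
```

The error is signalled for the octet #xA0, yet the loop carries on with the next octet, because the handler picks a restart instead of letting the error reach the debugger.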
> How do the above mentioned applications manage this problem?
> Certainly not by magic.
In this specific case, they're usually doing it the same way you can
do it with Drakma and FLEXI-STREAMS - they insert some kind of
replacement character. I don't see where the problem is.
Cheers,
Edi.
From edi at agharta.de Sun Feb 25 20:16:58 2007
From: edi at agharta.de (Edi Weitz)
Date: Sun, 25 Feb 2007 21:16:58 +0100
Subject: [drakma-devel] Re: Portability of Drakma
In-Reply-To:
(Erik Huelsmann's message of "Sun, 25 Feb 2007 14:38:02 +0100")
References:
Message-ID:
On Sun, 25 Feb 2007 14:38:02 +0100, "Erik Huelsmann" wrote:
>> - The code should have been tested successfully on at least the
>> Lisp/OS combinations that are currently supported by Drakma.
>
> Is there a list somewhere as a reference to what I'm getting into?
No, unfortunately not. I myself use mostly LispWorks (Windows and
Linux/x86) and SBCL (Linux/x86). (LispWorks shouldn't be a problem
anyway as it's not affected by the switch.)
I think that at least LispWorks, SBCL, AllegroCL, CMUCL, CLISP, and
OpenMCL should be supported, everything else being a bonus. Operating
systems: Windows (where applicable), Linux, OS X.
If you don't want to test all of this for yourself, how about offering
a tarball of Drakma which uses usocket for download? Send an
announcement to this mailing list including an overview of what you've
tested and what not. We can ask "interested parties" to try it out
and we'll switch to the new version in four weeks, say, unless someone
objects. Does that sound OK?
Cheers,
Edi.
From edi at agharta.de Sun Feb 25 20:40:00 2007
From: edi at agharta.de (Edi Weitz)
Date: Sun, 25 Feb 2007 21:40:00 +0100
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225172345.GA23630@achilles.olympus.net> (Jeffrey
Cunningham's message of "Sun, 25 Feb 2007 09:23:45 -0800")
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
<45E1C093.2060800@mail.ru>
<20070225172345.GA23630@achilles.olympus.net>
Message-ID:
On Sun, 25 Feb 2007 09:23:45 -0800, Jeffrey Cunningham wrote:
> The only example I've run across is the site I mentioned, but it
> seems like the possibilities for bad html are endless.
The problems you've encountered have nothing to do with bad HTML at
all, and Drakma doesn't try to parse HTML. I think you're a bit
confused.
Cheers,
Edi.
From jeffrey at cunningham.net Sun Feb 25 20:47:39 2007
From: jeffrey at cunningham.net (Jeffrey Cunningham)
Date: Sun, 25 Feb 2007 12:47:39 -0800
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To:
References: <20070224170725.GA23865@achilles.olympus.net>
<20070225003954.GA32401@achilles.olympus.net>
<20070225162645.GB16675@achilles.olympus.net>
<45E1C093.2060800@mail.ru>
<20070225172345.GA23630@achilles.olympus.net>
Message-ID: <20070225204739.GA28011@achilles.olympus.net>
On Sun Feb 25, 2007 at 09:40:00PM +0100, Edi Weitz wrote:
>
> I think you're a bit confused.
I agree, but I'm slowly getting less confused. Thanks for the help.
-Jeff
From vodonosov at mail.ru Sun Feb 25 21:02:30 2007
From: vodonosov at mail.ru (Anton Vodonosov)
Date: Sun, 25 Feb 2007 23:02:30 +0200
Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225193406.GA26412@achilles.olympus.net>
References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> <45E1CABE.8020605@mail.ru>
<20070225193406.GA26412@achilles.olympus.net>
Message-ID: <45E1F966.1010109@mail.ru>
Jeffrey Cunningham:
> You're right, Anton, I did misunderstand the meaning. Thank you for
> clearing that up.
>
Not at all. Thanks Edi for all that great software he creates for us.
-Anton
From rsynnott at gmail.com Tue Feb 27 13:00:10 2007
From: rsynnott at gmail.com (Robert Synnott)
Date: Tue, 27 Feb 2007 13:00:10 +0000
Subject: [drakma-devel] Fwd: Portability of Drakma
In-Reply-To:
References:
Message-ID: <24f203480702270500x1ab963cbhbf508435c4993dc7@mail.gmail.com>
On 2/25/07, Erik Huelsmann wrote:
...
> Yes. Not having a Mac, I won't be able to test OpenMCL myself, but
> maybe others can assist there?
>
> Thanks for your time.
>
>
> bye,
>
> Erik.
I can test on OpenMCL on a PPC Mac if desired.
Rob