[drakma-devel] charset errors question
Hans Hübner
hans.huebner at gmail.com
Mon Sep 24 17:47:59 UTC 2012
Jeff,
you can use the :FORCE-BINARY keyword argument to have DRAKMA return
the raw octets of the response body, and then call
FLEXI-STREAMS:OCTETS-TO-STRING with an explicit external format to
decode them yourself, like so:
(flexi-streams:octets-to-string
 (drakma:http-request "http://www.walmart.com" :force-binary t)
 :external-format :ascii)
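If you do not know the charset in advance, the same idea can be wrapped in a small helper that tries several external formats in turn. The following is only a rough sketch: DECODE-OCTETS and FETCH-PAGE are hypothetical names, not part of DRAKMA or FLEXI-STREAMS, and it assumes Quicklisp is available to load both libraries. It relies on :LATIN-1 as a last resort, since every octet maps to some Latin-1 character.

```lisp
;; Assumption: Quicklisp is installed and can load both libraries.
(ql:quickload '(:drakma :flexi-streams) :silent t)

(defun decode-octets (octets &optional (formats '(:utf-8 :latin-1)))
  "Decode OCTETS with the first external format in FORMATS that does not
signal an encoding error.  Returns the string and the format that worked.
Hypothetical helper, not part of DRAKMA or FLEXI-STREAMS."
  (dolist (fmt formats (error "None of ~S could decode the octets." formats))
    (handler-case
        (return (values (flexi-streams:octets-to-string
                         octets :external-format fmt)
                        fmt))
      ;; Signaled by FLEXI-STREAMS when the octets are invalid for FMT;
      ;; in that case fall through and try the next format.
      (flexi-streams:external-format-encoding-error () nil))))

(defun fetch-page (url &optional (formats '(:utf-8 :latin-1)))
  "Fetch URL as raw octets via :FORCE-BINARY and decode them leniently.
Hypothetical helper built on DECODE-OCTETS above."
  (decode-octets (drakma:http-request url :force-binary t) formats))
```

Calling (fetch-page "http://www.walmart.com") would then return the body as a string together with the format that succeeded. Note that a successful decode only means the octets were valid for that format, not that it is the charset the server intended.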
HTH,
Hans
On Mon, Sep 24, 2012 at 7:01 PM, Jeff Cunningham
<jeffrey at jkcunningham.com> wrote:
> I've been running into some trouble using drakma to retrieve pages from
> certain commercial websites. It is very likely the HTML they are generating
> is broken one way or another. But the problem still remains as to how one
> can retrieve their pages using drakma.
>
> For example, if you try this simple case:
>
> (http-request "http://www.walmart.com")
>
> It will display the following:
>
> WARNING: Problems determining charset (falling back to binary):
> Corrupted Content-Type header:
> Read character #\;, but expected #\=.
>
> And the returned body is binary-encoded ascii. This can be converted to real
> ascii, of course, but it is inconvenient to say the least.
>
> Often the problem is that their metatag for the charset is simply wrong.
> Sometimes I can figure out what it is and supply this information, like
> this:
>
> (http-request "http://www.walmart.com" :external-format-in :UTF-8)
>
> and it will solve the problem. But this particular example does not lend
> itself to this, at least using the following charsets:
>
> :UTF-8
> :UTF-7
> :iso-8859-1
> :iso-8859-2
> :iso-8859-3
> :iso-8859-4
> :iso-8859-5
> :iso-8859-6
> :iso-8859-7
> :iso-8859-8
> :iso-8859-9
> :BIG5
> :US-ASCII
> :UTF-16
> :UTF-32
>
> I have no idea what their server is actually sending - it appears to be
> invalid for any of these charsets.
>
> Is there any way to get around this problem?
>
> Best regards,
> Jeff Cunningham