[drakma-devel] charset errors question

Hans Hübner hans.huebner at gmail.com
Mon Sep 24 17:47:59 UTC 2012


Jeff,

you can use the :FORCE-BINARY keyword argument to have DRAKMA return
the octets constituting the response body, and then call
FLEXI-STREAMS:OCTETS-TO-STRING with an explicit external format to
decode them yourself, like so:

(flexi-streams:octets-to-string
 (drakma:http-request "http://www.walmart.com" :force-binary t)
 :external-format :ascii)
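
If the right encoding is not known in advance, the same approach can be
extended to try several external formats in turn. The following is only
a rough sketch, assuming DRAKMA and FLEXI-STREAMS are already loaded;
DECODE-OCTETS and FETCH-AS-STRING are hypothetical helper names, not
part of either library, and the candidate list is just a guess:

```lisp
;; Assumes the libraries are loaded, e.g. via
;; (ql:quickload '(:drakma :flexi-streams))

(defun decode-octets (octets &optional (candidates '(:utf-8 :iso-8859-1)))
  "Hypothetical helper: return OCTETS decoded with the first external
format in CANDIDATES that does not signal a decoding error, or NIL if
every candidate fails."
  (dolist (format candidates)
    (handler-case
        (return (flexi-streams:octets-to-string
                 octets :external-format format))
      ;; Invalid byte sequence for this format: move on to the next one.
      (flexi-streams:external-format-encoding-error () nil))))

(defun fetch-as-string (url)
  "Hypothetical helper: fetch URL as raw octets and decode them locally,
bypassing the charset announced in the Content-Type header."
  (decode-octets (drakma:http-request url :force-binary t)))
```

Putting :ISO-8859-1 last makes it a catch-all, since every octet maps to
some character in that encoding; the result may still be wrong for pages
that are really in another 8-bit charset.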

HTH,
Hans

On Mon, Sep 24, 2012 at 7:01 PM, Jeff Cunningham
<jeffrey at jkcunningham.com> wrote:
> I've been running into some trouble using drakma to retrieve pages from
> certain commercial websites. It is very likely the HTML they are generating
> is broken one way or another. But the problem still remains as to how one
> can retrieve their pages using drakma.
>
> For example, if you try this simple case:
>
> (http-request "http://www.walmart.com")
>
> It will display the following:
>
> WARNING: Problems determining charset (falling back to binary):
> Corrupted Content-Type header:
> Read character #\;, but expected #\=.
>
> And the returned body is binary-encoded ascii. This can be converted to real
> ascii, of course, but it is inconvenient to say the least.
>
> Often the problem is that their metatag for the charset is simply wrong.
> Sometimes I can figure out what it is and supply this information, like
> this:
>
> (http-request "http://www.walmart.com" :external-format-in :UTF-8)
>
> and it will solve the problem. But this particular example does not lend
> itself to this, at least using the following charsets:
>
>  :UTF-8
>  :UTF-7
>  :iso-8859-1
>  :iso-8859-2
>  :iso-8859-3
>  :iso-8859-4
>  :iso-8859-5
>  :iso-8859-6
>  :iso-8859-7
>  :iso-8859-8
>  :iso-8859-9
>  :BIG5
>  :US-ASCII
>  :UTF-16
>  :UTF-32
>
> I have no idea what their server is actually sending - it appears to be
> invalid for any of these charsets.
>
> Is there any way to get around this problem?
>
> Best regards,
> Jeff Cunningham



