[drakma-devel] charset errors question
hans.huebner at gmail.com
Mon Sep 24 17:47:59 UTC 2012
You can use the :FORCE-BINARY keyword argument to have DRAKMA return
the octets constituting the response, and then call
FLEXI-STREAMS:OCTETS-TO-STRING with an explicit external format to
force decoding with that format, like so:

(flexi-streams:octets-to-string
 (drakma:http-request "http://www.walmart.com" :force-binary t)
 :external-format :ascii)
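
If the charset is not known in advance, a small wrapper along the
following lines might help. This is only a sketch: DECODE-OCTETS and
FETCH-PAGE are names made up for illustration (they are not part of
DRAKMA or FLEXI-STREAMS), the list of formats to try is an assumption,
and it presumes Quicklisp is available to load both libraries.

```lisp
;; Assumption: Quicklisp is installed; load this file with LOAD or
;; evaluate the forms one by one at the REPL.
(ql:quickload '(:drakma :flexi-streams))

;; Hypothetical helper: try each external format in turn and return
;; the first successful decoding, plus the format that worked.
(defun decode-octets (octets &optional (formats '(:utf-8 :latin1)))
  (dolist (format formats
                  (error "None of the formats ~S could decode the data."
                         formats))
    (handler-case
        (return (values (flexi-streams:octets-to-string
                         octets :external-format format)
                        format))
      ;; FLEXI-STREAMS signals this condition on octets that are
      ;; invalid for FORMAT; move on to the next candidate.
      (flexi-streams:external-format-encoding-error () nil))))

;; Hypothetical helper: fetch URL as raw octets, so that DRAKMA does
;; not consult the Content-Type header, then decode them ourselves.
(defun fetch-page (url &optional (formats '(:utf-8 :latin1)))
  (decode-octets (drakma:http-request url :force-binary t) formats))
```

With this, (fetch-page "http://www.walmart.com") returns the body as a
string and, as a second value, the format that was used. Since every
octet is a valid Latin-1 character, putting :LATIN1 last makes it a
catch-all; leave it out if you would rather get an error than a
possibly wrong decoding.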
On Mon, Sep 24, 2012 at 7:01 PM, Jeff Cunningham
<jeffrey at jkcunningham.com> wrote:
> I've been running into some trouble using drakma to retrieve pages from
> certain commercial websites. It is very likely the HTML they are generating
> is broken one way or another. But the problem still remains as to how one
> can retrieve their pages using drakma.
> For example, if you try this simple case:
> (http-request "http://www.walmart.com")
> It will display the following:
> WARNING: Problems determining charset (falling back to binary):
> Corrupted Content-Type header:
> Read character #\;, but expected #\=.
> And the returned body is binary-encoded ascii. This can be converted to real
> ascii, of course, but it is inconvenient to say the least.
> Often the problem is that their metatag for the charset is simply wrong.
> Sometimes I can figure out what it is and supply this information, like
> (http-request "http://www.walmart.com" :external-format-in :UTF-8)
> and it will solve the problem. But this particular example does not lend
> itself to this, at least using the following charsets:
> I have no idea what their server is actually sending - it appears to be
> invalid for any of these charsets.
> Is there any way to get around this problem?
> Best regards,
> Jeff Cunningham