[drakma-devel] charset errors question
jeffrey at jkcunningham.com
Mon Sep 24 17:01:47 UTC 2012
I've been running into some trouble using drakma to retrieve pages from
certain commercial websites. It is very likely the HTML they are
generating is broken one way or another. But the problem still remains
as to how one can retrieve their pages using drakma.
For example, if you try this simple case:
It will display the following:
WARNING: Problems determining charset (falling back to binary):
Corrupted Content-Type header:
Read character #\;, but expected #\=.
And the returned body is a vector of octets rather than a string. This can
be converted to a string, of course, but it is inconvenient to say the least.
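The conversion mentioned above can be sketched as follows. This is only a minimal sketch, assuming Quicklisp is available and using FLEXI-STREAMS (which Drakma already depends on). The helper name body-to-string is hypothetical, and the default of :latin-1 is an assumption, chosen because Latin-1 maps every octet to a character and so never signals a decoding error. It may still produce the wrong characters if the server really sent something else.

```lisp
;; Assumption: Quicklisp is installed. FLEXI-STREAMS is a Drakma dependency.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :flexi-streams))

;; BODY-TO-STRING is a hypothetical helper, not part of Drakma.
(defun body-to-string (body &optional (external-format :latin-1))
  "Return BODY as a string. BODY may already be a string, or a vector of octets."
  (if (stringp body)
      body
      (flexi-streams:octets-to-string body :external-format external-format)))

;; A hand-built octet vector stands in for a binary response body here.
(body-to-string (coerce #(72 84 77 76) '(vector (unsigned-byte 8))))
;; => "HTML"
```

Passing a different external format as the second argument, for example :utf-8, decodes the same octets under that charset instead.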
Often the problem is that their metatag for the charset is simply wrong.
Sometimes I can figure out what it is and supply this information, like
(http-request "http://www.walmart.com" :external-format-in :UTF-8)
and it will solve the problem. But this particular example does not lend
itself to this, at least using the following charsets:
I have no idea what their server is actually sending - it appears to be
invalid for any of these charsets.
Is there any way to get around this problem?
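One possible workaround, offered as a sketch rather than a confirmed fix: ask Drakma for the raw octets with its :force-binary keyword argument and decode them explicitly, so that the result no longer depends on the server's Content-Type header. The function name fetch-as-text is hypothetical, the :latin-1 default is an assumption, and whether :force-binary also suppresses the warning shown earlier has not been verified here.

```lisp
;; Assumption: Quicklisp is installed and can load both libraries.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:drakma :flexi-streams)))

;; FETCH-AS-TEXT is a hypothetical helper, not part of Drakma.
(defun fetch-as-text (url &optional (external-format :latin-1))
  "Fetch URL as octets with Drakma, then decode them with EXTERNAL-FORMAT."
  (let ((octets (drakma:http-request url :force-binary t)))
    ;; The body can be NIL (for example, an empty response), so guard the decode.
    (when octets
      (flexi-streams:octets-to-string octets
                                      :external-format external-format))))
```

For example, (fetch-as-text "http://www.walmart.com" :utf-8) would decode that page as UTF-8. With the :latin-1 default the decode itself cannot fail, although non-ASCII characters may come out wrong if the page is really in another charset.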