[drakma-devel] Bug handling bad html?

Edi Weitz edi at agharta.de
Sun Feb 25 10:25:04 UTC 2007


On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham <jeffrey at cunningham.net> wrote:

> In this case I set it to make a substitution for the 'bad'
> character. Is it possible for there to be more than one?

Not yet.  See current discussion on the FLEXI-STREAMS mailing list.

> And more generally, should there not be a way to set drakma so it
> may take a performance hit but is guaranteed not to die on any html
> that is thrown at it?

It's not dying, it just signals an error.

And, no, I don't think there's a way to provide meaningful results and
at the same time to be prepared to accept whatever bogus data or
headers the server chooses to send.  If you find something like that,
send patches, but it sounds like magic (or at least very good AI) to
me.

As for dealing with wrong character encodings, there are already ways
to deal with that.  You cited one yourself.  Another one would be to
read everything as binary data (and then decode it yourself if
needed).
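
A minimal sketch of the binary route, in case it helps.  It assumes a
Drakma whose HTTP-REQUEST accepts :FORCE-BINARY and a FLEXI-STREAMS
whose OCTETS-TO-STRING honours *SUBSTITUTION-CHAR*; check this against
the versions you have installed.  The function names DECODE-LENIENTLY
and FETCH-AS-TEXT are made up for this example and are not part of
either library.

```lisp
;; Assumption: both libraries are installed where ASDF can find them.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (asdf:load-system :drakma)
  (asdf:load-system :flexi-streams))

;; Hypothetical helper, not part of Drakma or FLEXI-STREAMS.
(defun decode-leniently (octets &optional (external-format :utf-8))
  "Decode OCTETS using EXTERNAL-FORMAT.  Octets that cannot be decoded
are replaced with a question mark instead of signalling an error."
  (let ((flexi-streams:*substitution-char* #\?))
    (flexi-streams:octets-to-string octets
                                    :external-format external-format)))

;; Hypothetical helper, not part of Drakma or FLEXI-STREAMS.
(defun fetch-as-text (url &optional (external-format :utf-8))
  "Fetch URL as raw octets, ignoring whatever encoding the server
claims, and decode the body ourselves."
  ;; With :FORCE-BINARY T the first return value is a vector of octets.
  (decode-leniently (drakma:http-request url :force-binary t)
                    external-format))
```

Something like (fetch-as-text "http://www.example.com/" :latin-1)
would then give you a string no matter what the server claims about
its encoding.  Whether that string is meaningful is still up to you,
which is the point made above.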


