[Bese-devel] Re: character issues. aka: http is a binary protocol, get over it.
Pascal Bourguignon
pjb at informatimago.com
Thu Dec 15 20:32:14 UTC 2005
Maciek Pasternacki writes:
> What is wrong with my solution of treating the stream as an
> iso-8859-1-encoded string (which is completely equivalent to a binary
> stream), and recoding it when I expect text to be in another charset?
For one thing, on most CL implementations a character takes more
(much more) than one byte of space, even when the string holds only
iso-8859-1 characters. For another, you convert three times when only
one conversion, or none, is needed:
            octet->iso-8859-1->octet->actual encoding
instead of: octet->actual encoding                     ; for text
or just:    octet                                      ; for binary data
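
To make the count concrete, here is a small sketch of the two paths.
It is written in Python rather than Lisp only to keep it self-contained
and runnable anywhere; the sample text and the choice of UTF-8 as the
"actual encoding" are assumptions for illustration, not something from
the thread.

```python
# Octets as they would arrive on the wire (assumed here to be UTF-8 text).
raw = "żółw".encode("utf-8")

# Path 1: go through a Latin-1 string first (three conversions).
as_latin1 = raw.decode("latin-1")             # octet -> iso-8859-1
back_to_octets = as_latin1.encode("latin-1")  # iso-8859-1 -> octet
via_latin1 = back_to_octets.decode("utf-8")   # octet -> actual encoding

# Path 2: decode the octets directly (one conversion).
direct = raw.decode("utf-8")

# Latin-1 maps every octet to exactly one character, so nothing is lost...
assert len(as_latin1) == len(raw)
assert back_to_octets == raw
# ...and both paths end with the same text; path 1 just does more work.
assert via_latin1 == direct
```

The round trip is lossless, which is why the Latin-1 trick works at all,
but the intermediate string buys nothing that the octet vector did not
already provide.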
> On one hand it's a kind of hack; OTOH we work along the RFCs with
> Latin-1 text, as the RFCs state (and as is easier to debug than parsing
> byte arrays), and after parsing, after all protocol-related work, we
> re-encode the Latin-1 text to the encoding we expect (or decode it to
> byte arrays). All encoding issues are handled at the point where they
> won't make trouble, and where they start being actually relevant.
> Analogously for encoding the reply to send out -- the app works with
> Unicode text; when it is to be encoded in any way, it's re-coded to
> transparent Latin-1 so as not to bother the RFC-related code.
--
__Pascal Bourguignon__ http://www.informatimago.com/
You're always typing.
Well, let's see you ignore my
sitting on your hands.