[Bese-devel] Re: character issues. aka: http is a binary protocol, get over it.
Maciek Pasternacki
maciekp at japhy.fnord.org
Thu Dec 15 18:19:35 UTC 2005
On Prickle-Prickle, The Aftermath 57, 3171 YOLD, Marco Baringer wrote:
>> What is wrong with my solution of treating the stream as an
>> iso-8859-1-encoded string (which is completely equivalent to a binary
>> stream), and recoding it when I expect the text to be in another charset?
>> On one hand it's a kind of hack; OTOH we work along the RFCs with
>> Latin-1 text, as the RFCs state (and which is easier to debug than
>> parsing byte arrays), and after parsing, after all protocol-related
>> work, we re-encode the Latin-1 text to the encoding we expect (or decode
>> it to byte arrays). All encoding issues are handled where they won't
>> cause trouble, and only once they actually become relevant. Analogously
>> for encoding the reply to send out: the app works with Unicode text, and
>> once it has been encoded in any way, it is recoded to transparent
>> Latin-1 so as not to bother the RFC-related code.
>
> the idea is fine, but we end up doing the conversion for everything
> the user sends us, whether it's needed or not. one of my apps works
> with 50MB+ pdf files, and converting this data to/from various encodings
> is getting costly (and it's completely useless). unless i'm mistaken
> both string-to-octets and octets-to-string are non-destructive
> functions.
But the conversion still happens after rfc2388 has done its work, so if
we deal with large files, and since rfc2388 can now write to disk,
rfc2388 gives me a pathname, which would not be recoded.
Or we can work with iso-8859-1 cookies and let the user recode values as
needed. Or decide per application via application.charset (e.g. when
application.charset is nil or :iso-8859-1, no re-encoding takes
place).
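To make the last option concrete, here is a minimal sketch in Python (not UCW or rfc2388 code). The function name recode_value and the app_charset parameter are hypothetical stand-ins for whatever would consult application.charset. It relies only on the fact that ISO-8859-1 maps byte N to code point N, so a Latin-1 string can always be turned back into the original octets.

```python
from typing import Optional

# Charset names for which the Latin-1 string is already the final value
# (hypothetical policy, mirroring "nil or :iso-8859-1" above).
PASSTHROUGH = {"iso-8859-1", "latin-1"}


def recode_value(latin1_text: str, app_charset: Optional[str]) -> str:
    """Re-interpret a Latin-1-decoded protocol string in the app's charset."""
    if app_charset is None or app_charset.lower() in PASSTHROUGH:
        return latin1_text  # no re-encoding takes place
    # Latin-1 maps byte N to code point N, so this recovers the raw octets.
    raw = latin1_text.encode("iso-8859-1")
    return raw.decode(app_charset)


# Octets as they might arrive on the wire: a UTF-8 encoded form value.
wire = "zażółć".encode("utf-8")

# What the RFC-level parsing code would see: transparent Latin-1 text.
parsed = wire.decode("iso-8859-1")

# Only now, after all protocol work, is the value recoded for the app.
value = recode_value(parsed, "utf-8")
```

Large uploads that rfc2388 has already written to disk never pass through such a function; only the short text values do.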
--
__ Maciek Pasternacki <maciekp at japhy.fnord.org> [ http://japhy.fnord.org/ ]
`| _ |_\ / { Socrates was right: there is an astonishing
,|{-}|}| }\/ number of things we are better off without. }
\/ |____/ ( Jacek Podsiadło ) -><-