[Bese-devel] Re: character issues. aka: http is a binary protocol, get over it.

Marco Baringer mb at bese.it
Thu Dec 15 18:06:59 UTC 2005


Maciek Pasternacki <maciekp at japhy.fnord.org> writes:

> What is wrong about my solution with treating stream as
> iso-8859-1-encoded string (which is completely equivalent to binary
> stream), and recoding it when I expect text to be in another charset?
> On one hand it's a kind of hack, OTOH we work along the RFCs with
> Latin-1 text, as RFCs state (and as is easier to debug than parsing
> byte arrays), and after parsing, after all protocol-related work, we
> re-encode Latin-1 text to encoding expected by us (or decode it to
> byte arrays).  All encoding issues take place when they won't make
> trouble, and when they start being actually relevant.  Analogically
> with encoding reply to send out -- app works with Unicode text, when
> it starts being encoded in any way, it's being re-coded to transparent
> Latin-1 not to bother RFC-related code.

the idea is fine, but we end up doing the conversion for everything
the user sends us, whether it's needed or not.  one of my apps works
with 50MB+ pdf files; converting this data to/from various encodings
is getting costly (and it's completely useless). unless i'm mistaken
both string-to-octets and octets-to-string are non-destructive
functions.
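
A minimal sketch of the round trip under discussion, in Python rather than Lisp and purely for illustration. The helper names (octets_to_latin1, latin1_to_octets, recode) are hypothetical and are not the string-to-octets / octets-to-string functions mentioned above. It assumes only that Latin-1 maps each byte value 0-255 to the code point with the same number, which is what makes the carrier string lossless.

```python
# Sketch: Latin-1 as a transparent carrier for arbitrary octets.
# All function names here are made up for this example.

def octets_to_latin1(data: bytes) -> str:
    # Each byte 0..255 becomes the code point with the same value.
    return data.decode("latin-1")

def latin1_to_octets(text: str) -> bytes:
    # Inverse mapping; raises UnicodeEncodeError for code points > 255.
    return text.encode("latin-1")

def recode(carrier: str, charset: str) -> str:
    # Recover the real text once the expected charset is known.
    return latin1_to_octets(carrier).decode(charset)

# "zażółć" written with escapes so the source stays ASCII-only.
original_text = "za\u017c\u00f3\u0142\u0107"
raw = original_text.encode("utf-8")       # what arrives on the wire

carrier = octets_to_latin1(raw)           # protocol code sees a plain string
round_tripped = latin1_to_octets(carrier) # back to the original octets
decoded = recode(carrier, "utf-8")        # text in the charset we expect

# Every possible byte value survives the round trip unchanged.
all_bytes = bytes(range(256))
lossless = latin1_to_octets(octets_to_latin1(all_bytes)) == all_bytes
```

The round trip is lossless, but each step allocates a full copy of the payload. That copying is the cost in question for a 50MB+ upload that never needed to be treated as text.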

-- 
-Marco
Ring the bells that still can ring.
Forget the perfect offering.
There is a crack in everything.
That's how the light gets in.
	-Leonard Cohen


