[postmodern-devel] character encoding

Marijn Haverbeke marijnh at gmail.com
Thu Sep 11 18:45:43 UTC 2008


Hey Roman,

> I notice a non-ASCII char: the registered sign. The database is in SQL_ASCII
> encoding, but it contains non-ASCII chars.

Ah, I somehow missed the (R). This seems like rather awful behaviour
on the part of Postgres -- there's no (R) in ASCII, and sending it in
a Latin encoding over a UTF-8 encoded connection is just wrong. But yes,
it would be nice if Postmodern didn't choke on it. However, I have no
idea how to distinguish cases like this from genuine UTF-8 without
completely killing performance.
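
For illustration only, here is one common heuristic, sketched in Python
rather than Lisp. It is not what Postmodern does, and the function name
and the choice of Latin-1 as the fallback are assumptions. The idea is
to try strict UTF-8 first and only reinterpret the bytes when that
fails, so the extra cost is paid only on malformed input.

```python
def decode_with_fallback(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode as strict UTF-8; if that fails, reinterpret with `fallback`.

    Hypothetical helper for illustration, not part of any library.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone 0xAE (the registered sign in Latin-1) is a UTF-8
        # continuation byte with no lead byte, so strict decoding rejects it.
        return raw.decode(fallback)


utf8_value = decode_with_fallback(b"Acme\xc2\xae")   # well-formed UTF-8
latin_value = decode_with_fallback(b"Acme\xae")      # Latin-1 byte, falls back
```

The heuristic is not reliable in general: some Latin-1 byte sequences
happen to be valid UTF-8 (for example 0xC3 0xA9) and would silently be
read as UTF-8, which is part of why telling the two apart is hard.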

> So, it doesn't have anything to do with postmodern. However, maybe it would
> be useful to be able to set encoding mode per connection? Any other
> suggestions?

That would probably be nice. The current character-encoding approach
is rather ad-hoc -- babel didn't exist yet at the time, and
flexi-streams was so slow it wasn't even an option, but nowadays it
should be possible to use babel to decode any format the server is
likely to throw at us. Unfortunately, the current behaviour is not
wrong enough to distress me to the point of making it a priority to
fix it, and I have enough stuff on my hands. If anyone feels up to it,
I'd be happy to review and incorporate a patch, of course.
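
As a rough sketch of what a per-connection encoding setting could look
like, again in Python for illustration and not reflecting Postmodern's
or babel's actual API: the Connection class, its client_encoding
parameter, and decode_field are all hypothetical names. The decoder is
looked up once when the connection is created and reused for every
field.

```python
import codecs


class Connection:
    """Hypothetical stand-in for a database connection object."""

    def __init__(self, client_encoding: str = "utf-8"):
        # codecs.lookup raises LookupError for an unknown encoding name,
        # so a bad setting fails at connect time rather than mid-query.
        self._codec = codecs.lookup(client_encoding)

    def decode_field(self, raw: bytes) -> str:
        # CodecInfo.decode returns (text, bytes_consumed); strict by default.
        text, _consumed = self._codec.decode(raw)
        return text


latin_conn = Connection(client_encoding="latin-1")
utf8_conn = Connection()  # defaults to UTF-8
```

With this shape, a database that stores Latin-1 bytes under SQL_ASCII
would be opened with client_encoding="latin-1", while other connections
keep the UTF-8 default.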

Best,
Marijn


