[babel-devel] octets-to-string with UTF8 and Byte Order Marker
Luís Oliveira
luismbo at gmail.com
Wed May 11 07:32:33 UTC 2011
Hello,
Sorry for the late reply.
On Thu, Apr 21, 2011 at 10:36 PM, Rob Blackwell <rob.blackwell at aws.net> wrote:
> I'm still a little confused as to why the length is 4 and not 3 - shouldn’t the byte order mark have been discarded?
I'm not sure. I couldn't find any clear indication of how a leading
BOM should be handled for UTF-8. The BOM FAQ seems to indicate that it
should be converted to a ZERO WIDTH NON-BREAKING SPACE (U+FEFF), maybe.
Any comments? It would perhaps be interesting to check what
well-established libraries such as ICU do.
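
As one point of comparison, here is a minimal sketch of how Python's
standard codecs treat a leading UTF-8 BOM. This is not Babel or ICU, and
the sample bytes are an assumption made up for illustration. The plain
"utf-8" codec keeps the BOM as the character U+FEFF, while the separate
"utf-8-sig" codec drops it.

```python
import codecs

# Hypothetical sample input: the UTF-8 encoded BOM (EF BB BF) followed by "abc".
data = codecs.BOM_UTF8 + b"abc"

# The plain "utf-8" codec decodes the leading BOM as the character U+FEFF,
# so the result has 4 characters.
kept = data.decode("utf-8")

# The "utf-8-sig" codec skips a single leading BOM if one is present,
# so the result has 3 characters.
stripped = data.decode("utf-8-sig")

print(len(kept), len(stripped))
```

So at least one established implementation leaves the choice to the
caller rather than always discarding the BOM.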
Cheers,
--
Luís Oliveira
http://r42.eu/~luis/