[Bese-devel] Encodings?

Pascal Bourguignon pjb at informatimago.com
Sat Aug 6 00:53:59 UTC 2005


What provisions exist to manage encodings?

I'm observing the following behavior:

With Mozilla 1.7 and with Safari, I type random characters into a text
area (actually copy-pasted from the Emacs HELLO file).  The text I
receive contains these characters encoded as &#9999; sequences.

Content of attribute as received in clisp:

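| Japanese (&#26085;&#26412;&#35486;)	&#12371;&#12435;&#12395;&#12385;&#12399;, &#65402;&#65437;&#65414;&#65409;&#65418;
| Chinese (&#20013;&#25991;,&#26222;&#36890;&#35805;,&#27721;&#35821;)	&#20320;&#22909;
| Cantonese (&#31925;&#35486;,&#24291;&#26481;&#35441;)	&#26089;&#26216;, &#20320;&#22909;
| Korean (&#54620;&#44544;)	&#50504;&#45397;&#54616;&#49464;&#50836;, &#50504;&#45397;&#54616;&#49901;&#45768;&#44620;
| 
| Difference among chinese characters in GB, JIS, KSC, BIG5:
| GB	&#20803;&#27668;  &#24320;&#21457;
| JIS	&#20803;&#27671;  &#38283;&#30330;
| KSC	&#20803;&#27683;  &#38283;&#30332;
| BIG5	&#20803;&#27683;  &#38283;&#30332;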
 

But when I send back this text in the same textarea, the ampersands
get escaped:

Source of HTML received back:

<textarea cols="72" rows="4" name="EjJvepjqxS"
    >

Japanese (&amp;#26085;&amp;#26412;&amp;#35486;)	&amp;#12371;&amp;#12435;&amp;#12395;&amp;#12385;&amp;#12399;, &amp;#65402;&amp;#65437;&amp;#65414;&amp;#65409;&amp;#65418;
Chinese (&amp;#20013;&amp;#25991;,&amp;#26222;&amp;#36890;&amp;#35805;,&amp;#27721;&amp;#35821;)	&amp;#20320;&amp;#22909;
Cantonese (&amp;#31925;&amp;#35486;,&amp;#24291;&amp;#26481;&amp;#35441;)	&amp;#26089;&amp;#26216;, &amp;#20320;&amp;#22909;
Korean (&amp;#54620;&amp;#44544;)	&amp;#50504;&amp;#45397;&amp;#54616;&amp;#49464;&amp;#50836;, &amp;#50504;&amp;#45397;&amp;#54616;&amp;#49901;&amp;#45768;&amp;#44620;

Difference among chinese characters in GB, JIS, KSC, BIG5:
GB	&amp;#20803;&amp;#27668;  &amp;#24320;&amp;#21457;
JIS	&amp;#20803;&amp;#27671;  &amp;#38283;&amp;#30330;
KSC	&amp;#20803;&amp;#27683;  &amp;#38283;&amp;#30332;
BIG5	&amp;#20803;&amp;#27683;  &amp;#38283;&amp;#30332;

</textarea
  > 

So, it looks like either the text is not decoded on the way from the
browser to ucw, or is not encoded back from ucw to the browser.
Is this a bug or a feature?
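
To make the two possibilities concrete, here is a minimal sketch of the
round trip.  It is written in Python purely for illustration and says
nothing about how ucw actually handles form values; the variable names
and the sample string are my own assumptions.  It only uses the standard
html.escape and html.unescape functions to contrast echoing the received
text as is with decoding the character references first.

```python
import html

# Hypothetical form value, shaped like the text received from the browser:
# the characters arrive as numeric character references.
received = "Japanese (&#26085;&#26412;&#35486;)"

# Case 1: the value is escaped for output without being decoded first,
# so the ampersand of every reference gets escaped as well.
echoed_raw = html.escape(received, quote=False)

# Case 2: the references are decoded into real characters on input,
# and only then is the value escaped for output.
decoded = html.unescape(received)
echoed_decoded = html.escape(decoded, quote=False)

print(echoed_raw)
print(echoed_decoded)
```

In the first case the browser would display the literal reference text
instead of the characters, which matches what I see in the textarea.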

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Grace personified,
I leap into the window.
I meant to do that.


