[climacs-devel] Unicode support in Climacs

Daniel CAUNE dcaune at majormode.com
Sat Jan 14 05:17:37 UTC 2006


> JC Helary writes:
>  > tell you that in 2006, any text editor that does not support Unicode
>  > for at least the 10 biggest languages in the world is not going to go
>  > very far.
>  >
>  > And that has to be intuitive too. If possible with no need to set
>  > obscure parameters hidden in the GUI (a la Mule), with no need to
>  > install separate input systems (a la X11) etc.
> 
> There are two cases, one easy and one hard.  The easy one is when the
> keyboard and X11 are configured to send the right events.  Then, all
> Climacs has to do is insert the corresponding Unicode character.  The
> hard one is when you need to type characters that do not appear on
> your keyboard.  In my opinion, though, this should not be dealt with
> at the level of Climacs, but globally in CLIM, or perhaps the X11
> backend, or even by X11 itself (for consistency between
> applications).
> 

Yes, I think you are right.  For instance, such a keyboard abstraction, known as an Input Method Editor (IME), is integrated into operating systems such as Windows and MacOS X.  The translation from scan code tuples to Unicode characters is hidden from the GUI developer (though I am not sure this holds for the Windows developer!).

>  > Everything from input to display to file saving/opening has to be
>  > relatively smooth otherwise the application is useless.
>  >
>  > I read somewhere that the only Common Lisp that supports Unicode
>  > fully is CLISP, it is a pity the other are either not "compliant" or
>  > default on latin-1.
> 
> I think SBCL fully supports Unicode as well.
> 

SBCL supports Unicode characters, but I do not know yet whether SBCL is fully compliant with the Unicode standard (Java is not, for instance).  But is that a big deal?  Full compliance can be handled by Climacs itself for the moment.

>  > It is about time people understand that "text" in "text editor" is
>  > not anymore synonymous with "ascii"... :)
> 
> I agree.  It was for that reason Climacs was designed from the start
> so that the buffer can contain any Unicode character, and in fact, any
> CL object.
> 

My next comment might be a bit off topic.  Still, it could be interesting to store the identification of the language with each character, or, more specifically, an identification of the IME used to enter that character.  What would this be useful for?  Spell checking, automatic IME switching as the cursor moves around, text search, etc.  I have no clear idea whether such information should be managed by the character layer (buffer layer?) or by some higher text layer.  Just my two cents.
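
To make the idea a little more concrete, here is a minimal sketch in Python rather than Common Lisp.  It is not Climacs code: the names TaggedChar, language_runs and ime_at are hypothetical, and it assumes the simplest possible design, in which the tag lives on each buffer element instead of in a higher text layer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TaggedChar:
    """One buffer element: a character plus optional input metadata (hypothetical)."""
    char: str
    language: Optional[str] = None  # e.g. "fr" or "ja"; None when unknown
    ime: Optional[str] = None       # identifier of the input method used, if any


def language_runs(buffer: List[TaggedChar]) -> List[Tuple[Optional[str], str]]:
    """Group consecutive characters sharing a language tag into (language, text) runs.

    A spell checker could then pick one dictionary per run.
    """
    runs: List[Tuple[Optional[str], str]] = []
    for item in buffer:
        if runs and runs[-1][0] == item.language:
            # Same language as the previous character: extend the current run.
            runs[-1] = (item.language, runs[-1][1] + item.char)
        else:
            runs.append((item.language, item.char))
    return runs


def ime_at(buffer: List[TaggedChar], cursor: int) -> Optional[str]:
    """Return the IME recorded for the character at the cursor, or None.

    An editor could use this to switch input methods as the cursor moves.
    """
    if 0 <= cursor < len(buffer):
        return buffer[cursor].ime
    return None
```

Tagging every character is the most direct reading of the idea, but it costs memory per character; keeping the runs in a separate layer over an untagged buffer would be the obvious alternative, and I do not know which fits the Climacs buffer design better.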

--
Daniel CAUNE



