[Ecls-list] unicode troubles

Juan Jose Garcia-Ripoll juanjose.garciaripoll at gmail.com
Mon Jul 23 14:01:09 UTC 2012


On Mon, Jul 23, 2012 at 9:17 AM, Matthew Mondor <mm_lists at pulsar-zone.net> wrote:

> Examples using those sequence streams can be found in the
> utf-8-string-encode and utf-8-string-decode functions at
>
> http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/mmsoftware/cl/server/character.lisp?rev=1.3;content-type=text%2Fplain


This is a very neat example, and it is the main reason why sequence streams
were introduced. Just to inform the original poster, ECL defines a string as
an array of characters, each stored in either 8 or 24 bits. This is outlined
here:
http://ecls.sourceforge.net/new-manual/ch11.html#ansi.character-types
and here:
http://ecls.sourceforge.net/new-manual/ch14.html#ansi.strings.types

ECL does not use UTF-8 encoding internally because doing so would greatly
complicate all the other routines, such as string accessors and string
operations, which were designed with an array of characters in mind, not a
collection of characters that must be accessed sequentially (or randomly,
but at O(n) cost, since each character occupies a variable number of bytes).
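
To make that cost concrete, here is a minimal sketch in Python. It is not
ECL code, and the function names are made up for illustration. It contrasts
fetching the i-th character from a UTF-8 byte buffer, which has to walk over
every preceding character, with fetching it from a fixed-width array of code
points, which is a single index operation.

```python
from array import array


def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with byte `lead`."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a valid UTF-8 lead byte: %#x" % lead)


def utf8_char_at(buf: bytes, index: int) -> str:
    """Return character number `index` of the UTF-8 buffer `buf`.

    Characters take 1 to 4 bytes each, so the only way to find the
    byte offset is to skip over all earlier characters: O(index).
    """
    if index < 0:
        raise IndexError("negative index")
    pos = 0
    for _ in range(index):
        pos += utf8_seq_len(buf[pos])
    width = utf8_seq_len(buf[pos])
    return buf[pos:pos + width].decode("utf-8")


def to_fixed_width(text: str) -> array:
    """Store one code point per array element (at least 32 bits each)."""
    return array("L", (ord(ch) for ch in text))


def fixed_width_char_at(codes: array, index: int) -> str:
    """Return character number `index` with a single array access: O(1)."""
    return chr(codes[index])


if __name__ == "__main__":
    text = "Física λ €"
    encoded = text.encode("utf-8")
    codes = to_fixed_width(text)
    for i in range(len(text)):
        # Both representations agree; only the access cost differs.
        assert utf8_char_at(encoded, i) == fixed_width_char_at(codes, i)
```

The fixed-width form uses more memory for plain ASCII text, but every
accessor written for ordinary arrays keeps working unchanged, which is the
trade-off described above.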

Moving to UTF-8 would mean seriously revising all of ECL, and there are
other priorities right now. But as Matthew showed, it is quite feasible to
convert to and from UTF-8 using ECL's own routines.

Juanjo

-- 
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com