[Ecls-list] Project status and changes (directions? help?)
Pascal J. Bourguignon
pjb at informatimago.com
Mon Oct 21 10:24:50 UTC 2013
Matthew Mondor <mm_lists at pulsar-zone.net> writes:
> I'm unsure if it's because we are too few on this list, if ECL users
> are altogether too few (I know the CL community itself is not large),
> or if it's because most ECL users don't consider themselves qualified,
> but I do remember Juan asking a few times on this list for help, and
> recently he reminded us that no one actually joined the project yet.
> It's also possible that at such requests we had no idea in which
> direction help was needed.
> I'm reluctant to ask to officially maintain ECL for the following
> reasons:
> - Time constraints, as always
I'm in a similar situation. While I would be delighted to maintain ECL
(and improve it, notably on iOS and Android), the big problem is time
(i.e., money). The other points could be solved on the job.
> - I think that eventual transparent support for "UTF8-B" with streams
> would be a good idea. In this mode invalid UTF-8 sequences would be
> read and mapped automatically to characters within the UTF-16
> surrogate range, to preserve their original octets. In output mode,
> those would be output as octets. This would provide lower overhead
> than having to use signals to detect invalid sequences and restarts
> to convert them, and would allow preserving the original "mixed
> encoding" data. Examples where mixed UTF-8 and LATIN-* data are
> common are filenames in some file systems, IRC lines, etc. Currently
> one can gracefully deal with these when treated as bytes or 8-bit
> characters, unless one also needs to have an extended character set
> interface, in which case some conversion must be done anyway.
> If I remember correctly, a limitation in the current system prevents
> implementing this easily without touching the stream or
> encoding/decoding system much, but I forgot the details (I have notes
> somewhere on this though).
The HTML experiment would indicate that this is probably not a good
idea. Recovering the meaning of invalid byte sequences and converting
them to a normalized form can be the job of a tool, but it should not be
done automatically by general applications.
When reading UTF-8 or other Unicode streams, invalid byte sequences can
signal errors, be substituted by a given character, or be encoded into
application-reserved code points so that the invalid byte sequence can
be transmitted transparently. Cf. the CLISP :INPUT-ERROR-ACTION
parameter of ext:make-encoding (CLISP encodings are external-format
values).
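
To make the three options concrete, here is a minimal sketch in Python,
used only for illustration; it is neither ECL nor CLISP code. The names
decode_utf8 and ACTIONS are hypothetical. Python's built-in
"surrogateescape" error handler maps each undecodable byte to a lone
surrogate in U+DC80..U+DCFF, which is close to the "UTF8-B" idea quoted
above, so it stands in here for the "reserved code points" option.

```python
# Hypothetical mapping from the three policies discussed above to
# Python's standard codec error handlers.
ACTIONS = {
    "error": "strict",             # signal an error
    "substitute": "replace",       # substitute U+FFFD for each bad byte
    "escape": "surrogateescape",   # keep bad bytes as lone surrogates
}


def decode_utf8(data: bytes, action: str = "error") -> str:
    """Decode UTF-8 data, handling invalid sequences per `action`."""
    return data.decode("utf-8", errors=ACTIONS[action])


# Mixed input: a Latin-1 e-acute (0xE9) followed by valid UTF-8 text.
mixed = b"caf\xe9 na\xc3\xafve"

try:
    decode_utf8(mixed, "error")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

substituted = decode_utf8(mixed, "substitute")
escaped = decode_utf8(mixed, "escape")

# Only the escaping policy lets the original octets be written back out.
round_trip = escaped.encode("utf-8", errors="surrogateescape")
```

With the escaping policy the invalid octet survives a read/write cycle
unchanged, whereas substitution loses it and the strict policy refuses
the input. Which of these a general application should default to is
exactly the design question debated above.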