[Ecls-list] Invalid octet sequence -> SIMPLE-ERROR

Matthew Mondor mm_lists at pulsar-zone.net
Sun Aug 2 20:55:42 UTC 2009


On Wed, 29 Jul 2009 21:41:08 -0400
Matthew Mondor <mm_lists at pulsar-zone.net> wrote:

> On Wed, 29 Jul 2009 15:29:17 -0400
> Matthew Mondor <mm_lists at pulsar-zone.net> wrote:
> 
> > Stream #<io stream FD-STREAM> with external format (UTF-8 LF) contains
> > an invalid octet sequence.
> >    [Condition of type SIMPLE-ERROR]
> 
> This is not directly related but I also noticed read(2)/write(2)
> syscalls being made for every byte when using a stream over a socket
> despite passing :buffering :line to socket-make-stream.  I've not
> looked at the internal code for this yet.

After implementing custom-read-line I realized that invalid UTF-8
sequences are handled differently by every implementation (e.g. ECL
signals SIMPLE-ERROR after consuming the illegal sequence, whereas SBCL
signals SB-INT:STREAM-DECODING-ERROR without consuming it and needs an
SB-INT:ATTEMPT-RESYNC restart invocation).  I will probably look at
using custom buffering with babel, or flexi-streams with
encoding-agnostic socket-connected streams.

However, since ECL does support Unicode, it might be good to be able to
access the invalid bytes as SBCL allows (there they aren't consumed
unless the resync restart is invoked).  This would allow one to, for
instance, interpret these bytes as latin-1 and convert them to the
internal Unicode format...
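
As a rough illustration of that latin-1 fallback, here is a minimal
sketch in portable Common Lisp that works on an octet vector rather
than on a stream.  The function names are hypothetical and this is not
what ECL or SBCL do internally; it also assumes an implementation whose
characters cover Unicode, so that CODE-CHAR accepts the decoded values.

```lisp
;; Hypothetical helper (not part of ECL or SBCL): try to decode one
;; UTF-8 sequence starting at index I of OCTETS.
(defun utf-8-sequence-at (octets i)
  "Return the code point and octet count of a valid UTF-8 sequence at I,
or NIL if the octets there do not form valid UTF-8."
  (let* ((n (length octets))
         (b (aref octets i))
         ;; Number of continuation octets implied by the lead octet.
         (extra (cond ((< b #x80) 0)
                      ((<= #xC2 b #xDF) 1)
                      ((<= #xE0 b #xEF) 2)
                      ((<= #xF0 b #xF4) 3))))
    (when (and extra (< (+ i extra) n))
      (let ((code (case extra
                    (0 b)
                    (1 (logand b #x1F))
                    (2 (logand b #x0F))
                    (3 (logand b #x07)))))
        (loop for k from 1 to extra
              do (let ((cb (aref octets (+ i k))))
                   ;; Continuation octets must look like 10xxxxxx.
                   (unless (= (logand cb #xC0) #x80)
                     (return-from utf-8-sequence-at nil))
                   (setf code (logior (ash code 6) (logand cb #x3F)))))
        ;; Reject overlong forms, surrogates and out-of-range values.
        (when (and (>= code (aref #(0 #x80 #x800 #x10000) extra))
                   (not (<= #xD800 code #xDFFF))
                   (<= code #x10FFFF))
          (values code (1+ extra)))))))

;; Hypothetical entry point: decode OCTETS as UTF-8, reinterpreting each
;; octet that is not part of a valid sequence as one latin-1 character.
(defun decode-utf-8-latin-1-fallback (octets)
  (with-output-to-string (out)
    (let ((i 0)
          (n (length octets)))
      (loop while (< i n)
            do (multiple-value-bind (code len) (utf-8-sequence-at octets i)
                 (cond (code
                        (write-char (code-char code) out)
                        (incf i len))
                       (t
                        ;; Invalid here: keep the octet, read as latin-1.
                        (write-char (code-char (aref octets i)) out)
                        (incf i))))))))
```

For example, a lone #xE9 (latin-1 e-acute) followed by the valid UTF-8
encoding of the same character decodes to the same character twice:

```lisp
(map 'list #'char-code
     (decode-utf-8-latin-1-fallback #(72 233 195 169)))
;; => (72 233 233)
```

Doing the same on a stream would need the offending octets to be
reachable from the signalled condition, which is the point made above.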

Thanks,
-- 
Matt



