[Ecls-list] Character encoding/decoding utilities

Matthew Mondor mm_lists at pulsar-zone.net
Fri Aug 26 10:27:49 UTC 2011


On Fri, 26 Aug 2011 05:52:07 -0400
Matthew Mondor <mm_lists at pulsar-zone.net> wrote:

> In other words, if I understand, something similar would become
> possible?
> 
> ;;; Write a unicode character string to an UTF-8 encoded bytes vector
> (let ((v (make-array 16 ; Expect implementation to adjust ^2 or *2 as needed
>                      :element-type 'byte
>                      :adjustable t
>                      :fill-pointer 0)))
>   (with-open-stream (os (make-sequence-output-stream
>                          v :external-format '(:UTF-8 :LF)))
>     (format os "some unicode string~%")
>     v)) ; Contains the UTF-8 encoded bytes
> 
> ;;; Read a unicode character string from an UTF-8 encoded bytes vector
> (let ((v <vector of bytes to read/decode>))
>   (with-open-stream (is (make-sequence-input-stream
>                          v :external-format '(:UTF-8 :LF)))
>     (read-line is))) ; UCS-4 characters, may generate decoding exceptions

Oops, I meant '(unsigned-byte 8) above rather than 'byte.

I also noticed the existing READ-SEQUENCE/WRITE-SEQUENCE functions,
which will also give a coercion error from character to byte, but
wondered whether they could be another possible route with some
modifications.  The standard says that an implementation might
signal an error if the types don't match, which wouldn't rule out
encoding/decoding happening, but its wording also seems to imply
that one output element is expected per input element.  Also,
as-is, this wouldn't allow specifying the wanted external format.
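
To illustrate why the one-element-per-element expectation matters, here
is a minimal portable sketch of UTF-8 encoding into a byte vector.  The
function name UTF-8-ENCODE-INTO is hypothetical (it is not part of ECL
or the standard), and the sketch assumes CHAR-CODE returns Unicode code
points and does not reject surrogates:

```lisp
;;; Hypothetical helper for illustration only; not an ECL or ANSI CL API.
(defun utf-8-encode-into (string vector)
  "Append the UTF-8 encoding of STRING to VECTOR and return VECTOR.
VECTOR must be adjustable, have a fill pointer, and hold (unsigned-byte 8)."
  (flet ((out (byte) (vector-push-extend byte vector)))
    (loop for ch across string
          for code = (char-code ch) ; assumed to be a Unicode code point
          do (cond ((< code #x80)   ; 1 byte: plain ASCII
                    (out code))
                   ((< code #x800)  ; 2 bytes
                    (out (logior #xC0 (ash code -6)))
                    (out (logior #x80 (logand code #x3F))))
                   ((< code #x10000) ; 3 bytes
                    (out (logior #xE0 (ash code -12)))
                    (out (logior #x80 (logand (ash code -6) #x3F)))
                    (out (logior #x80 (logand code #x3F))))
                   (t               ; 4 bytes
                    (out (logior #xF0 (ash code -18)))
                    (out (logior #x80 (logand (ash code -12) #x3F)))
                    (out (logior #x80 (logand (ash code -6) #x3F)))
                    (out (logior #x80 (logand code #x3F)))))))
  vector)

;;; Usage: one input character, two output bytes.
(let ((v (make-array 16 :element-type '(unsigned-byte 8)
                        :adjustable t
                        :fill-pointer 0)))
  (utf-8-encode-into (string (code-char #xE9)) v)
  v) ; V now holds #xC3 #xA9
```

A single character can expand to as many as four bytes here, which is
why a WRITE-SEQUENCE that maps each input element to exactly one output
element would not fit without changes.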
-- 
Matt



