[babel-devel] question about #\Nul char and Unicode

Wed Aug 4 14:07:18 UTC 2010

Hi,

I'm stuck on a problem: I'm using CL-ZMQ, which in turn uses CFFI, which in
turn uses BABEL for tasks such as FOREIGN-STRING-TO-LISP conversion.
There seems to be a problem with 0 (#\Nul) characters in such strings,
as can be seen below:

Illegal :UTF-8 character starting at position 328.
   [Condition of type BABEL-ENCODINGS:INVALID-UTF8-CONTINUATION-BYTE]

Restarts:
...

Backtrace:
  0: ((LAMBDA (BABEL-ENCODINGS::SRC BABEL-ENCODINGS::START
BABEL-ENCODINGS::END BABEL-ENCODINGS::DEST BABEL-ENCODINGS::D-START)) ..)
  1: (CFFI:FOREIGN-STRING-TO-LISP #.(SB-SYS:INT-SAP #X0808E13C))[:EXTERNAL]
...

The translated string in the current example is this:
#(#\5 #\4 #\c #\6 #\7 #\5 #\5 #\b #\- #\9 #\6 #\2 #\8 #\- #\4 #\0 #\a #\4
#\- #\9 #\a #\2 #\d #\- #\c #\c #\8 #\2 #\a #\8 #\1 #\6 #\3 #\4 #\5 #\e #\
#\1 #\8 #\  #\/ #\h #\a #\n #\d #\l #\e #\r #\t #\e #\s #\t #\  #\2 #\6 #\0
#\Space #\{ #\" #\P #\A #\T #\H #\" #\Space #\" #\/ #\h #\a #\n #\d #\l #\e
#\r #\t #\e #\s #\t #\" #\, #\" #\M #\E #\T #\H #\O #\D #\" #\Space #\" #\G
#\E #\T #\" #\, #\" #\V #\E #\R #\S #\I #\O #\N #\" #\Space #\" #\H #\T #\T
#\P #\/ #\1 #\. #\1 #\" #\, #\" #\U #\R #\I #\" #\Space #\" #\/ #\h #\a #\n
#\d #\l #\e #\r #\t #\e #\s #\t #\" #\, #\" #\P #\A #\T #\T #\E #\R #\N #\"
#\Space #\" #\/ #\h #\a #\n #\d #\l #\e #\r #\t #\e #\s #\t #\" #\, #\" #\A
#\c #\c #\e #\p #\t #\" #\Space #\" #\* #\/ #\* #\" #\, #\" #\H #\o #\s #\t
#\" #\Space #\" #\l #\o #\c #\a #\l #\h #\o #\s #\t #\Space #\6 #\7 #\6 #\7
#\" #\, #\" #\U #\s #\e #\r #\- #\A #\g #\e #\n #\t #\" #\Space #\" #\c #\u
#\r #\l #\/ #\7 #\. #\2 #\0 #\. #\0 #\  #\( #\i #\4 #\8 #\6 #\- #\p #\c #\-
#\l #\i #\n #\u #\x #\- #\g #\n #\u #\) #\  #\l #\i #\b #\c #\u #\r #\l #\/
#\7 #\. #\2 #\0 #\. #\0 #\  #\O #\p #\e #\n #\S #\S #\L #\/ #\0 #\. #\9 #\.
#\8 #\n #\  #\z #\l #\i #\b #\/ #\1 #\. #\2 #\. #\3 #\. #\4 #\  #\l #\i #\b
#\i #\d #\n #\/ #\1 #\. #\1 #\5 #\  #\l #\i #\b #\s #\s #\h #\2 #\/ #\1 #\.
#\2 #\. #\4 #\" #\} #\, #\0 #\Space #\, #\n #\S #\S #\L #\/ #\0 #\. #\Nul
#\Nul)

Maybe someone here can explain why these 0 characters are not recognized as
proper UTF-8 ones?
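
To help narrow this down, here is a minimal sketch of a check that can be run
in isolation, without CL-ZMQ or CFFI. It assumes Babel is already loaded; the
helper OCTETS and the byte values are made up for illustration and are not
taken from the failing message. The first form decodes a buffer containing a
zero octet; the second feeds the decoder a lead byte followed by a byte that
is not a continuation byte, which is the kind of input the condition name
above suggests.

```lisp
;; Assumes Babel is already loaded. OCTETS is a hypothetical helper and
;; the byte values below are illustrative only.
(defun octets (&rest bytes)
  "Build an (unsigned-byte 8) vector from BYTES."
  (make-array (length bytes)
              :element-type '(unsigned-byte 8)
              :initial-contents bytes))

;; A zero octet is a valid one-byte UTF-8 sequence, so this should
;; decode to a three-character string with #\Nul in the middle.
(babel:octets-to-string (octets 97 0 98) :encoding :utf-8)

;; #xC3 announces a two-byte sequence, but #x28 is not a continuation
;; byte, so this input is malformed UTF-8. The handler just reports
;; whatever condition the decoder signals.
(handler-case
    (babel:octets-to-string (octets #xC3 #x28) :encoding :utf-8)
  (error (c)
    (format t "decoding failed: ~a~%" c)))
```

If the first form decodes cleanly, the zero octets themselves are probably not
the cause, and it may be worth checking whether the bytes being decoded extend
past the end of the actual message data.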

Thanks!
Vsevolod