[cxml-devel] suggestion and bug report
Francis Leboutte
f.leboutte at algo.be
Mon Jul 16 07:24:21 UTC 2007
Hello,
Below are a small suggestion and a bug report.
Regards,
Francis
1.
In rod-to-utf8-string (current definition below), the
element-type should be base-char rather than character.
Some implementations (LW, ACL, and maybe others) use
base-char for strings of 8-bit characters, so the
result would take less memory there.
(defun rod-to-utf8-string (rod)
  (let ((out (make-buffer :element-type 'character)))
    (runes-to-utf8/adjustable-string out rod (length rod))
    out))
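
To illustrate the suggestion, here is a minimal self-contained sketch. It does not use cxml's internals; the names octet-char-type and encode-utf8-to-string are hypothetical. One caveat worth checking: the range of base-char is implementation-dependent, and in some implementations it may not cover all codes from 0 to 255, so the sketch falls back to character in that case.

```lisp
;; Hypothetical helper, not part of cxml: pick the narrowest element type
;; that can hold every "octet character" (codes 0 to 255).
(defun octet-char-type ()
  (if (typep (code-char 255) 'base-char)
      'base-char
      'character))

;; Hypothetical stand-in for rod-to-utf8-string: encode STRING as UTF-8
;; and return an adjustable string holding one character per octet.
;; Sketch only: surrogates and invalid code points are not handled.
(defun encode-utf8-to-string (string)
  (let ((out (make-array (length string)
                         :element-type (octet-char-type)
                         :adjustable t
                         :fill-pointer 0)))
    (flet ((emit (octet)
             (vector-push-extend (code-char octet) out)))
      (loop for ch across string
            for code = (char-code ch)
            do (cond ((< code #x80)
                      (emit code))
                     ((< code #x800)
                      (emit (logior #xC0 (ash code -6)))
                      (emit (logior #x80 (logand code #x3F))))
                     ((< code #x10000)
                      (emit (logior #xE0 (ash code -12)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F))))
                     (t
                      (emit (logior #xF0 (ash code -18)))
                      (emit (logior #x80 (logand (ash code -12) #x3F)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F)))))))
    out))
```

Where base-char is wide enough, the returned string uses the compact representation; elsewhere the behaviour is the same as with character.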
2.
While experimenting with the functions
cxml:utf8-string-to-rod and
cxml:rod-to-utf8-string, I found that the string
returned by cxml:utf8-string-to-rod doesn't
include the last input character when that character is not an ASCII character.
On LWW 5.0.2, an example with the Greek small letter mu (U+03BC), used here as the micro sign:
USER 8 > (cxml:rod-to-utf8-string (format nil "abc~C" (code-char #x03BC)))
"abcμ"
USER 9 > (cxml:utf8-string-to-rod
           (cxml:rod-to-utf8-string (format nil "abc~C." (code-char #x03BC))))
"abcµ."
USER 10 > (cxml:utf8-string-to-rod
            (cxml:rod-to-utf8-string (format nil "abc~C" (code-char #x03BC))))
"abc"
The problem seems to come from the method

  (defmethod runes-encoding:decode-sequence ((encoding (eql :utf-8)) ...) ...)

where the tests

  (< (%+ rptr ...) in-end)

should probably be

  (<= (%+ rptr ...) in-end)

instead.
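
The effect of the two comparisons can be reproduced with a small stand-alone decoder. This is only a sketch, not the cxml code: decode-utf8-octets and its fits-p argument are hypothetical, the end index is assumed to be exclusive (as in-end appears to be), and continuation bytes are not validated.

```lisp
;; Hypothetical sketch, not the cxml method: decode a vector of UTF-8
;; octets into a list of code points. FITS-P is called with the index
;; just past a multi-byte sequence and the exclusive end index; the
;; sequence is decoded only if FITS-P returns true. Decoding stops at
;; the first sequence that does not fit or has an unsupported lead byte.
(defun decode-utf8-octets (octets &optional (fits-p #'<=))
  (let ((end (length octets))
        (pos 0)
        (code-points '()))
    (loop while (< pos end)
          do (let* ((lead (aref octets pos))
                    (len (cond ((< lead #x80) 1)
                               ((= (logand lead #xE0) #xC0) 2)
                               ((= (logand lead #xF0) #xE0) 3)
                               ((= (logand lead #xF8) #xF0) 4)
                               (t 0))))
               (when (or (zerop len)
                         (and (> len 1)
                              (not (funcall fits-p (+ pos len) end))))
                 (return))
               ;; Keep the payload bits of the lead byte, then append
               ;; six bits from each continuation byte.
               (let ((code (if (= len 1)
                               lead
                               (logand lead (ash #xFF (- (1+ len)))))))
                 (loop for i from 1 below len
                       do (setf code
                                (logior (ash code 6)
                                        (logand (aref octets (+ pos i)) #x3F))))
                 (push code code-points)
                 (incf pos len))))
    (nreverse code-points)))

;; "abc" followed by U+03BC (octets CE BC) at the very end of the input:
(decode-utf8-octets #(97 98 99 #xCE #xBC) #'<=)  ; => (97 98 99 956)
(decode-utf8-octets #(97 98 99 #xCE #xBC) #'<)   ; => (97 98 99)
```

With #'< the final two-octet sequence ends exactly at the end index and is treated as if it did not fit, which matches the truncated "abc" result above; when another character follows, both comparisons give the same result.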
--
Francis Leboutte
Algorithme, Rue de la Charrette 141, 4130 Tilff (Esneux), Belgique
Service en informatique
f.leboutte at algo.be
www.algo.be
+32-(0)4.388.3919