[cxml-devel] suggestion and bug report

Francis Leboutte f.leboutte at algo.be
Mon Jul 16 07:24:21 UTC 2007


Hello,

Below, a small suggestion and a bug report.

Regards,

Francis

1.
In rod-to-utf8-string, the element-type should be
base-char rather than character. In some
implementations base-char is the type used for
strings whose characters have an 8-bit encoding, so
the result would take less memory there (LW, ACL
and maybe others). The current definition is:

(defun rod-to-utf8-string (rod)
   (let ((out (make-buffer :element-type 'character)))
     (runes-to-utf8/adjustable-string out rod (length rod))
     out))
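
A minimal, self-contained sketch of the idea, assuming that the buffer is an adjustable string with a fill pointer and that every UTF-8 octet is stored as a character with code 0..255. Both helper names below are hypothetical and are not part of cxml. Since the range of base-char is implementation-dependent, the sketch falls back to character when base-char cannot hold code 255:

```lisp
;; Hypothetical helpers for illustration only; not part of cxml.

(defun utf8-string-element-type ()
  "Return BASE-CHAR if it can hold every octet value (0..255) as a
character in this implementation, otherwise CHARACTER."
  (if (typep (code-char 255) 'base-char)
      'base-char
      'character))

(defun make-utf8-string-buffer ()
  "Make an empty adjustable string with a fill pointer, using the
narrowest element type chosen above."
  (make-array 16
              :element-type (utf8-string-element-type)
              :adjustable t
              :fill-pointer 0))
```

With such a buffer, (vector-push-extend (code-char 206) (make-utf8-string-buffer)) works in any implementation, and implementations with an 8-bit base-char get the smaller representation.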

2.
While experimenting with the functions
cxml:utf8-string-to-rod and
cxml:rod-to-utf8-string, I found that the string
returned by cxml:utf8-string-to-rod doesn't
include the last input character when that character is not an ASCII character.

On LWw 5.0.2, an example with the Greek small letter mu (U+03BC):

USER 8 > (cxml:rod-to-utf8-string (format nil "abc~C" (code-char #x03BC)))
"abcμ"

USER 9 > (cxml:utf8-string-to-rod
           (cxml:rod-to-utf8-string (format nil "abc~C." (code-char #x03BC))))
"abcμ."

USER 10 > (cxml:utf8-string-to-rod
           (cxml:rod-to-utf8-string (format nil "abc~C" (code-char #x03BC))))
"abc"

The problem seems to come from the method
(defmethod runes-encoding:decode-sequence ((encoding (eql :utf-8)) ...) ...)
where the tests
(< (%+ rptr ...) in-end)
should probably be
(<= (%+ rptr ...) in-end)
instead.
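
The effect of the strict comparison can be reproduced with a small stand-alone decoder. This is only an illustration under stated assumptions, not the cxml method: the function name and the :inclusive-bound keyword are hypothetical, the end index is assumed to be exclusive, and only 1- and 2-byte sequences are handled.

```lisp
;; Illustration only; not the code of runes-encoding:decode-sequence.
(defun decode-utf8-octets (octets &key (inclusive-bound t))
  "Decode 1- and 2-byte UTF-8 sequences from the vector OCTETS.
END is an exclusive index.  With INCLUSIVE-BOUND true the bound test is
(<= (+ pos 2) end); with it false the test is the strict (< ...)."
  (let ((end (length octets))
        (pos 0)
        (out (make-string-output-stream)))
    (loop while (< pos end)
          do (let ((b0 (aref octets pos)))
               (cond ((< b0 #x80)
                      ;; One-byte (ASCII) sequence.
                      (write-char (code-char b0) out)
                      (incf pos))
                     ((and (<= #xC0 b0 #xDF)
                           ;; Is the continuation octet inside the input?
                           (if inclusive-bound
                               (<= (+ pos 2) end)
                               (< (+ pos 2) end)))
                      (write-char
                       (code-char (logior (ash (logand b0 #x1F) 6)
                                          (logand (aref octets (1+ pos)) #x3F)))
                       out)
                      (incf pos 2))
                     ;; Incomplete or unsupported sequence: stop decoding.
                     (t (return)))))
    (get-output-stream-string out)))
```

U+03BC is encoded as the octets #xCE #xBC. With the strict test, (decode-utf8-octets #(97 98 99 #xCE #xBC) :inclusive-bound nil) returns "abc", because (< 5 5) is false for the final two-byte sequence. With the default inclusive test the same call returns "abcμ". When an ASCII character follows, as in #(97 98 99 #xCE #xBC 46), both variants return "abcμ.".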


--
Francis Leboutte
Algorithme, Rue de la Charrette 141, 4130 Tilff (Esneux), Belgique
Service en informatique
   f.leboutte at algo.be
   www.algo.be
   +32-(0)4.388.3919




