[elephant-devel] new elephant and unicode troubles
Henrik Hjelte
henrik at evahjelte.com
Sun Feb 25 09:43:13 UTC 2007
Hi Ties! Nothing beats a Sunday morning bughunt!
Found the solution for you, see below.
Cheers,
Henrik Hjelte
Add this testcase to testserializer.lisp:
(Sorry the code indentation looks funny when attached to an email)
(deftest hard-strings
    (are-not-null
     (in-out-equal (format nil "Mot~arhead is a hard rock band."
                           (code-char 246)))
     (in-out-equal (format nil "M~atley cr~ae is a hard string and was a hard rock band."
                           (code-char 246)
                           (code-char 252))))
  t t)
I had to change serialize-string in unicode2.lisp to look like the
version below. Apparently the term utf8 in this file has nothing at all
to do with UTF-8; rather, it means a string of ASCII chars. So
serialize-to-utf8 returns nil when it finds a code > 127. It should then
continue trying with two-byte char strings, which the existing CVS
version did not do.
(defun serialize-string (string bstream)
  "Try to write each format type and bail if code is too big"
  (or (serialize-to-utf8 string bstream)
      (serialize-to-utf16le string bstream)
      (serialize-to-utf32le string bstream)))
Old buggy version:
;;(defun serialize-string (string bstream)
;;  "Try to write each format type and bail if code is too big"
;;  (declare (type buffer-stream bstream)
;;           (type string string))
;;  (cond ((and (not (string= "" string)) (< (char-code (char string 0)) #x7F))
;;         (serialize-to-utf8 string bstream))
;;        ;; Accelerate the common case where a character set is not Latin-1
;;        ((and (not (string= "" string)) (< (char-code (char string 0)) #xFFFF))
;;         (serialize-to-utf16le string bstream))
;;        ;; Actually code pages > 0 are rare; so we can pay an extra cost
;;        (t (or (serialize-to-utf8 string bstream)
;;               (serialize-to-utf16le string bstream)
;;               (serialize-to-utf32le string bstream)))))
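
Why the old version fails on a string such as "järv": as far as the
code shows, its cond looks only at the first character. A string that
starts with a plain ASCII character is routed to serialize-to-utf8,
which returns nil at the first code above 127, and that branch has no
fallback. Here is a minimal, self-contained sketch of that dispatch
logic. The names fits-p, try-ascii, try-16bit, try-32bit, old-dispatch
and new-dispatch are hypothetical stand-ins that only report which
format would be chosen; they are not Elephant's serializers and write
nothing to a buffer-stream.

```lisp
;; Hypothetical stand-ins (not Elephant code): each returns a keyword
;; naming the format on success, or NIL if some character does not fit.
(defun fits-p (string limit)
  "True when every character code in STRING is below LIMIT."
  (every (lambda (ch) (< (char-code ch) limit)) string))

(defun try-ascii (string) (and (fits-p string #x80) :ascii))
(defun try-16bit (string) (and (fits-p string #x10000) :16bit))
(defun try-32bit (string) (declare (ignore string)) :32bit)

(defun old-dispatch (string)
  "Mirrors the old COND: picks a branch from the first character only."
  (cond ((and (not (string= "" string))
              (< (char-code (char string 0)) #x7F))
         (try-ascii string))            ; NIL here is returned as-is
        ((and (not (string= "" string))
              (< (char-code (char string 0)) #xFFFF))
         (try-16bit string))
        (t (or (try-ascii string)
               (try-16bit string)
               (try-32bit string)))))

(defun new-dispatch (string)
  "Mirrors the fix: always fall through to the next wider format."
  (or (try-ascii string)
      (try-16bit string)
      (try-32bit string)))

;; "M" is ASCII, but the second-to-third character (code 246) is not.
(defparameter *hard* (format nil "Mot~arhead" (code-char 246)))

(old-dispatch *hard*)   ; => NIL
(new-dispatch *hard*)   ; => :16BIT
```

In this sketch the old dispatch returns nil for the hard string, which
would correspond to nothing usable being written, while the new one
falls through to the two-byte format.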
On Sun, 2007-02-25 at 00:50 +0100, Ties Stuij wrote:
> with the cvs elephant on sbcl on linux with bdb, with all tests
> passed, the following code:
>
> (defclass crocodile ()
> ((belly :accessor belly-of :initform "järv"))
> (:metaclass persistent-metaclass))
>
> (defparameter *ben* (make-instance 'crocodile))
>
> (belly-of *ben*)
>
> gives:
>
> deserialize of object tagged with 188 failed
>
> as an error, which comes from %deserialize, from deserialize in
> serialize2.lisp. A string with 'safe' characters though is properly
> recognized as utf-8. The 188 can also be 132 or another value. The 6.1
> checkout renders the same result but i must say i did like the error
> message 'deserialize fubar!' more. Ideas?
>
> Greets,
> Ties