[elephant-devel] UTF serializer/deserializer patch

Hiroyuki Komatsu kom at narihara-lab.jp
Wed Aug 5 03:21:06 UTC 2009

From: "Leslie P. Polzer" <sky at viridian-project.de>
Date: Tue, 4 Aug 2009 08:58:03 +0200 (CEST)

>   Could you also add new tests that show the problem?

The listing below is test code that uses the attached UTF-8 encoded file.
The file consists of six lines, each in the following format:
     #\<a-char>:UTF-(8|16|32) char-code:<hex-value of (char-code #\<a-char>)>
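
As a rough sketch of how one line of this format could be sanity-checked
before it is loaded into the store: the helper below is hypothetical (it is
not part of Elephant or of the patch) and assumes that each line begins with
the character itself, as in the attached file, and that the hex value is the
last field on the line.

```lisp
;; Hypothetical helper, not part of Elephant or of the patch.
;; Assumes LINE starts with the character itself and ends with the
;; hexadecimal value that follows "char-code:".
(defun line-char-code-matches-p (line)
  "True if the hex value after \"char-code:\" equals the code of LINE's first character."
  (let* ((tag "char-code:")
         (pos (search tag line)))
    (and pos
         (plusp (length line))
         (= (char-code (char line 0))
            (parse-integer line :start (+ pos (length tag)) :radix 16)))))
```

For example, (line-char-code-matches-p "a:UTF-8 char-code:61") should hold,
since #x61 is the code of #\a.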

-------------------- >8 -- >8 --------------------
(require :elephant)

(use-package :elephant)

(defpclass c ()
  ((l :initarg :l :accessor l :index t)))

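(defun test (path)
  ;; Open (creating the directory if necessary) a BDB store under /var/tmp/test-db/.
  (with-open-store (`(:bdb ,(ensure-directories-exist "/var/tmp/test-db/")))
    (print 'x)                          ; progress marker: store opened
    (with-open-file (f path :external-format :utf-8)
      (print 'y)                        ; progress marker: file opened
      ;; Store one indexed instance per line of the file.
      (loop for line = (read-line f nil nil)
            while line
            do (print (make-instance 'c :l line)))
      ;; Read the slot values back in index order and compare with STRING< order.
      (let* ((un-sorted (mapcar #'l (get-instances-by-range 'c 'l nil nil)))
             (sorted (sort (copy-list un-sorted) #'string<)))
        (if (equal un-sorted sorted)
            (print "pass")
            (print "error"))))))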

(test #p"path-to-attached-file")
-------------------- >8 -- >8 --------------------

>  Is the change compatible with existing incorrectly sorted
>  databases?

In my experience, it works correctly.

Note, however, that GET-INSTANCES-BY-RANGE does not work correctly
with incorrectly sorted data.
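
For reference, the pass condition in the test above amounts to the index
returning values in STRING< order, which for ordinary characters follows
their character codes. Here is a minimal sketch of that check without a
store. The names SORTED-BY-STRING<-P, *SAMPLE-CODES* and SAMPLE-STRINGS are
made up for illustration, and SAMPLE-STRINGS assumes a Lisp whose
CHAR-CODE-LIMIT is above #x2A437 (for example a Unicode-enabled SBCL).

```lisp
;; Hypothetical helper, not part of Elephant: true if STRINGS is already
;; in non-descending STRING< order.
(defun sorted-by-string<-p (strings)
  (loop for (a b) on strings
        while b
        never (string< b a)))

;; Code points of the first character of each line in the attached file,
;; written as numbers so that this source stays ASCII-only.
(defparameter *sample-codes* '(#x61 #x78 #x3042 #x6F22 #x2A38C #x2A437))

;; Assumes CHAR-CODE-LIMIT is larger than #x2A437.
(defun sample-strings ()
  (mapcar (lambda (code) (string (code-char code))) *sample-codes*))
```

On such an implementation, (sorted-by-string<-p (sample-strings)) is expected
to be true, and the same call on the reversed list is expected to be false.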
-------------- next part --------------
a:UTF-8 char-code:61
x:UTF-8 char-code:78
あ:UTF-16 char-code:3042
漢:UTF-16 char-code:6F22
𪎌:UTF-32 char-code:2A38C
𪐷:UTF-32 char-code:2A437
