[cxml-devel] Clozure Unicode vs. cxml-rng

Ben Hyde bhyde at pobox.com
Wed Apr 15 19:00:48 UTC 2009


In Clozure CL, code-char returns NIL for two ranges of code points:

(loop with in = t
      for i below 1114110
      as c = (code-char i)
      do (cond ((and in (not c))
                (setf in nil)
                (format t "~&gap: #x~X" i))
               ((and (not in) c)
                (setf in t)
                (format t " .. #x~X" (1- i)))))
gap: #xD800 .. #xDFFF
gap: #xFFFE .. #xFFFF
NIL

Searching for phrases such as "Range: D800–DBFF. The High Surrogate Area
does not contain any character" and "the value FFFE is guaranteed not
to be a Unicode character at all" sheds some more light on this: the
first range is the UTF-16 surrogate area, and FFFE and FFFF are Unicode
noncharacters. It seems related to this discussion:
<http://thread.gmane.org/gmane.lisp.babel.devel/15>
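
As a rough sketch (not code from cxml-rng or CCL; the name
gap-code-point-p is made up for illustration), the two gaps reported
above can be written down as a predicate and compared against what
code-char does on a given implementation. The ranges are simply the
ones printed by the loop, so other implementations may well differ.

```lisp
;; Hypothetical helper: true for the code points where the loop above
;; saw CODE-CHAR return NIL under Clozure CL.
(defun gap-code-point-p (code)
  "Return true if CODE is a surrogate (D800-DFFF) or one of FFFE/FFFF."
  (or (<= #xD800 code #xDFFF)
      (<= #xFFFE code #xFFFF)))

;; Hypothetical check: collect the code points below LIMIT where this
;; implementation's CODE-CHAR disagrees with the predicate.  The result
;; depends on the Lisp being used.
(defun code-char-mismatches (&optional (limit #x10000))
  (loop for i below (min limit char-code-limit)
        unless (eq (null (code-char i)) (gap-code-point-p i))
          collect i))
```

On an implementation that behaves as described above,
(code-char-mismatches) would be expected to come back empty.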

These gaps cause the file unicode.lisp in cxml-rng to signal an error
when compiling.  The following lets it compile, but I doubt this is how
anybody who knew what this code was doing would recommend doing it.
It reorders the system's components so that the range functions from
clex are available, then modifies massage-ranges to cut these gaps out
of any ranges that pass through it.
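
To illustrate the idea of cutting gaps out of ranges, here is a
minimal stand-alone sketch.  It is not cxml-clex::ranges- and makes no
claim about how that function treats its bounds; subtract-gap and
subtract-gaps are hypothetical names, and the sketch assumes each range
is a (LOW HIGH) list with both bounds inclusive.

```lisp
;; Hypothetical helpers, NOT the clex implementation.
;; Assumption: a range is a list (LOW HIGH), both bounds inclusive.
(defun subtract-gap (range gap)
  "Return the parts of RANGE that lie outside GAP, as a list of ranges."
  (destructuring-bind (lo hi) range
    (destructuring-bind (glo ghi) gap
      (if (or (< hi glo) (> lo ghi))
          ;; No overlap: keep the range as it is.
          (list (list lo hi))
          ;; Overlap: keep whatever sticks out below and above the gap.
          (append
           (when (< lo glo) (list (list lo (1- glo))))
           (when (> hi ghi) (list (list (1+ ghi) hi))))))))

(defun subtract-gaps (ranges gaps)
  "Remove every gap in GAPS from every range in RANGES."
  (dolist (gap gaps ranges)
    (setf ranges
          (mapcan (lambda (range) (subtract-gap range gap)) ranges))))

;; Example: the full code point range minus the two gaps seen above.
;; (subtract-gaps '((#x0 #x10FFFF)) '((#xD800 #xDFFF) (#xFFFE #xFFFF)))
;; => ((0 55295) (57344 65533) (65536 1114111))
;; i.e. #x0..#xD7FF, #xE000..#xFFFD and #x10000..#x10FFFF
```

A range that falls entirely inside a gap simply disappears from the
result, which is the behaviour massage-ranges would need before it
calls code-char on the bounds.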

But my interest is limited to getting the file to compile so that my
current project, which doesn't need cxml-rng, can get back on track as
I experiment with CCL.  So I'll admit I haven't tested this patch at
all.


  - ben

bash-3.2$ git diff
diff --git a/cxml-rng.asd b/cxml-rng.asd
index e64adff..64582ca 100644
--- a/cxml-rng.asd
+++ b/cxml-rng.asd
@@ -17,12 +17,12 @@
      :components
      ((:file "package")
       (:file "floats")
+     (:file "clex")
       (:file "unicode")
       (:file "nppcre")
       (:file "types")
       (:file "parse")
       (:file "validate")
       (:file "test")
-     (:file "clex")
       (:file "compact"))
      :depends-on (:cxml :cl-ppcre :yacc :parse-number :cl-base64))
diff --git a/unicode.lisp b/unicode.lisp
index 42b686a..c5ea17f 100644
--- a/unicode.lisp
+++ b/unicode.lisp
@@ -57,6 +57,8 @@
      `(defranges ,name ',ranges)))

  (defun massage-ranges (l)
+  #+ccl (setf l (cxml-clex::ranges- l '((#xd800 #xDFFF)
+                                        (#xFFFE #x10000))))
    (mapcan (lambda (x)
  	    (let ((a (code-char (car x)))
  		  (b (code-char (cadr x))))
bash-3.2$




