[slime-devel] CMUCL unicode strings breaks slime

Raymond Toy toy.raymond at gmail.com
Sat Oct 9 16:59:16 UTC 2010


On 10/6/10 6:59 PM, Raymond Toy wrote:
> On 10/6/10 11:51 AM, Helmut Eller wrote:
>> * Raymond Toy [2010-10-06 12:27] writes:
>>
>>>> For now option 2) is probably the simplest. 
>>>
>>> Ok.  Can you give some hints on where to start looking at this?
>>
>> read-message and write-message in swank-rpc.lisp.
> 
> The following change seems to work.  I can now create a string with
> characters outside the bmp and slime doesn't die.
> 
> CL-USER> (map 'string #'code-char '(55296 56320 81 82 83))
> "𐀀QRS"
> 
> I don't know what the "proper" way to integrate this would be, though.
> I'll need help with that.

Here is an updated change.  It uses definterface/defimplementation to
add a codepoint-length backend function, which write-message now calls
instead of LENGTH.

Is this the right way to do this?  I tested this with cmucl and ccl, and
slime does the right thing with cmucl without breaking ccl.

If this is not the right way, please let me know.  Otherwise, I'll check
this in soon.
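
For anyone following along, here is a minimal portable sketch of the
counting the cmucl implementation has to do.  It is not part of the
patch: the names count-utf16-codepoints and length-header are
hypothetical, and it takes a list of UTF-16 code units as integers
rather than a cmucl string, so it runs in any Lisp.  Like the patch, it
assumes the input is valid UTF-16.

```lisp
;; Hypothetical helper, not part of the patch: count the code points in
;; a sequence of UTF-16 code units given as integers.  Assumes the input
;; is valid UTF-16 (every leading surrogate is followed by a trailing one).
(defun count-utf16-codepoints (units)
  (let ((count 0)
        (i 0)
        (len (length units)))
    (loop while (< i len)
          do (let ((u (elt units i)))
               ;; A leading (high) surrogate, #xD800..#xDBFF, starts a
               ;; two-unit pair, so step over the trailing surrogate too.
               (incf i (if (<= #xD800 u #xDBFF) 2 1))
               (incf count)))
    count))

;; Hypothetical helper: format a length as the 6-digit hex header, using
;; the same FORMAT directive that write-message uses.
(defun length-header (n)
  (let ((*print-pretty* nil))
    (format nil "~6,'0x" n)))
```

With the code units from the test string above, '(55296 56320 81 82 83),
count-utf16-codepoints returns 4 where LENGTH on the unit list gives 5,
so the header becomes "000004" instead of "000005".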

Ray

Index: ChangeLog
===================================================================
RCS file: /project/slime/cvsroot/slime/ChangeLog,v
retrieving revision 1.2142
diff -u -r1.2142 ChangeLog
--- ChangeLog	20 Sep 2010 16:09:13 -0000	1.2142
+++ ChangeLog	9 Oct 2010 16:54:52 -0000
@@ -1,3 +1,16 @@
+2010-10-09  Raymond Toy  <toy.raymond at gmail.com>
+
+	* swank-cmucl.lisp (codepoint-length): Implement codepoint-length
+	to return the number of codepoints in cmucl's utf-16 strings.
+
+	* swank-backend.lisp (:swank-backend): Export codepoint-length.
+	(codepoint-length): definterface codepoint-length.  Default is to
+	use LENGTH.
+
+	* swank-rpc.lisp (write-message): Call
+	swank-backend:codepoint-length to get the correct length for
+	emacs.
+
 2010-09-20  Stas Boukarev  <stassats at gmail.com>

 	* swank-cmucl.lisp (character-completion-set): Implement. Requires
Index: swank-rpc.lisp
===================================================================
RCS file: /project/slime/cvsroot/slime/swank-rpc.lisp,v
retrieving revision 1.6
diff -u -r1.6 swank-rpc.lisp
--- swank-rpc.lisp	14 Apr 2010 17:51:30 -0000	1.6
+++ swank-rpc.lisp	9 Oct 2010 16:48:13 -0000
@@ -92,7 +92,7 @@

 (defun write-message (message package stream)
   (let* ((string (prin1-to-string-for-emacs message package))
-         (length (length string)))
+         (length (swank-backend:codepoint-length string)))
     (let ((*print-pretty* nil))
       (format stream "~6,'0x" length))
     (write-string string stream)
Index: swank-backend.lisp
===================================================================
RCS file: /project/slime/cvsroot/slime/swank-backend.lisp,v
retrieving revision 1.201
diff -u -r1.201 swank-backend.lisp
--- swank-backend.lisp	18 Sep 2010 09:34:05 -0000	1.201
+++ swank-backend.lisp	9 Oct 2010 16:49:04 -0000
@@ -43,7 +43,8 @@
            #:emacs-inspect
            #:label-value-line
            #:label-value-line*
-           #:with-symbol))
+           #:with-symbol)
+  (:export #:codepoint-length))

 (defpackage :swank-mop
   (:use)
@@ -1317,3 +1318,12 @@
   "Request saving a heap image to the file FILENAME.
 RESTART-FUNCTION, if non-nil, should be called when the image is loaded.
 COMPLETION-FUNCTION, if non-nil, should be called after saving the image.")
+
+;;; Codepoint length
+
+(definterface codepoint-length (string)
+  "Return the number of codepoints in the string.  With some Lisps
+  like cmucl, LENGTH returns the number of UTF-16 code units, but
+  other Lisps return the number of codepoints. The slime protocol
+  wants string lengths in terms of codepoints."
+  (length string))
Index: swank-cmucl.lisp
===================================================================
RCS file: /project/slime/cvsroot/slime/swank-cmucl.lisp,v
retrieving revision 1.231
diff -u -r1.231 swank-cmucl.lisp
--- swank-cmucl.lisp	20 Sep 2010 16:09:13 -0000	1.231
+++ swank-cmucl.lisp	9 Oct 2010 16:50:49 -0000
@@ -2576,3 +2576,16 @@
     (loop for n in names
        when (funcall matchp prefix n)
        collect n)))
+
+(defimplementation codepoint-length (string)
+  "Return the number of code points in the string.  The string MUST be
+  a valid UTF-16 string."
+  (do ((len (length string))
+       (index 0 (1+ index))
+       (count 0 (1+ count)))
+      ((>= index len)
+       count)
+    (multiple-value-bind (codepoint wide)
+	(lisp:codepoint string index)
+      (declare (ignore codepoint))
+      (when wide (incf index)))))




