[slime-devel] CMUCL unicode strings breaks slime

Helmut Eller heller at common-lisp.net
Fri Oct 1 15:45:13 UTC 2010


* Raymond Toy [2010-10-01 15:18] writes:

> CMUCL doesn't currently have a codePointCount function, but that's easy
> enough to add if slime wants it.  Here's one:
>
> (defun codepoint-count (string)
>   "Return the number of code points in the string.  The string MUST be
>   a valid UTF-16 string."
>   (do ((len (length string))
>        (index 0 (1+ index))
>        (count 0 (1+ count)))
>       ((>= index len)
>        count)
>     (multiple-value-bind (codepoint wide)
>         (lisp:codepoint string index)
>       (declare (ignore codepoint))
>       (when wide (incf index)))))

I hope this is faster than it looks :-).

What does read-sequence do if the input stream contains surrogate pairs?
Swank uses code like 

  (let* ((buffer (make-string length))
         (count (read-sequence buffer stream)))
    buffer)

where length is the number of code points as computed by Emacs.
If read-sequence also works on code units, then we can't send surrogate
pairs from Emacs -> Lisp.
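
To make the mismatch concrete, here is a minimal, portable sketch.  It
works on plain integers rather than on CMUCL strings, and both function
names are hypothetical (they are not part of CMUCL or Swank).  It only
shows that the code point count Emacs would send can be smaller than
the number of UTF-16 code units a buffer would need.

```lisp
;; Hypothetical helpers for illustration; not part of CMUCL or Swank.

(defun utf16-code-units (code-points)
  "Encode CODE-POINTS (a list of integers) as a list of UTF-16 code units.
Code points above #xFFFF become a high/low surrogate pair."
  (loop for cp in code-points
        append (if (> cp #xFFFF)
                   (let ((v (- cp #x10000)))
                     (list (+ #xD800 (ash v -10))         ; high surrogate
                           (+ #xDC00 (logand v #x3FF))))  ; low surrogate
                   (list cp))))

(defun utf16-code-unit-count (code-points)
  "Return the number of UTF-16 code units needed to encode CODE-POINTS."
  (length (utf16-code-units code-points)))
```

For the three code points (#x61 #x1D11E #x62) Emacs would report a
length of 3, while the UTF-16 encoding has 4 code units, so a buffer
made with (make-string 3) would be one element short if read-sequence
fills it with code units.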

Helmut




