[slime-devel] New wire format
Helmut Eller
heller at common-lisp.net
Mon Nov 7 08:11:50 UTC 2011
* Hugo Duncan [2011-11-07 04:04] writes:
> On Sun, 06 Nov 2011 12:13:07 -0500, Helmut Eller
> <heller at common-lisp.net> wrote:
>
>> Counting characters was problematic, especially with Lisps that use
>> UTF16 internally (Allegro, CMUCL, JVM based Lisps). Emacs counts the
>> length of strings in Unicode code points, while in UTF16 a single code
>> point may occupy either 1 or 2 indexes (code units) and so CL:LENGTH may
>> return something different from what Emacs expects. For the same reason we
>> can't use READ-SEQUENCE to read a specified number of code points.
>>
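To make the counting mismatch concrete, here is a small Python sketch (not SLIME code; the sample string is an arbitrary choice) that measures one string three ways: in code points, in UTF-16 code units, and in UTF-8 bytes.

```python
# Illustration only: one string, three different "lengths".
# The sample is 'a' followed by U+1D11E, a character outside the BMP.
text = "a\U0001D11E"

# Python's len() counts code points, which is how Emacs counts.
code_points = len(text)

# A Lisp that stores strings as UTF-16 sees code units instead;
# U+1D11E needs a surrogate pair, so it takes two of them.
utf16_units = len(text.encode("utf-16-le")) // 2

# A header that counts bytes of the encoded payload sees this number.
utf8_bytes = len(text.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)  # prints: 2 3 5
```

Only the byte count is unambiguous on both ends of the socket, since both sides see the same byte stream.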
>> The new format looks like this:
>>
>> | byte0 | 3 bytes length |
>> | ... payload ... |
>>
>> The 3-byte length header specifies the length of the payload in bytes.
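The framing described above could be sketched in Python roughly as follows. This is not the swank implementation. The function names are made up for illustration, the meaning of byte0 is not stated in this thread and is treated as an opaque value, and the byte order is a parameter because the thread only hints that it is not network byte order (the little-endian default is an assumption).

```python
import io

def encode_frame(sexp_text, byte0=0, byteorder="little"):
    """Build one frame: byte0, a 3-byte payload length, then the payload.

    byte0 is passed through untouched; byteorder="little" is an assumption.
    """
    payload = sexp_text.encode("utf-8")
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit in a 3-byte length")
    return bytes([byte0]) + len(payload).to_bytes(3, byteorder) + payload

def read_exactly(stream, n):
    """Read exactly n bytes, looping because read() may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def decode_frame(stream, byteorder="little"):
    """Read one frame from a binary stream; return (byte0, s-exp text)."""
    header = read_exactly(stream, 4)
    length = int.from_bytes(header[1:4], byteorder)
    payload = read_exactly(stream, length)
    return header[0], payload.decode("utf-8")

# Round trip with a non-ASCII payload (the message itself is made up).
frame = encode_frame('(:ping "λ")')
tag, text = decode_frame(io.BytesIO(frame))
```

The reader never has to count characters: it reads a fixed 4-byte header, then exactly as many bytes as the header announces, and only then decodes the payload as UTF-8.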
>
> Is there a reason to start using a binary encoding of the message
> length?
No deep reason. We actually used binary encoding before we used
hex-strings. That worked fine with latin-1 but not with utf-8. I guess
it's just instinct; now that we explicitly work on a byte stream it's
even more natural. Should probably have used network byte order.
> This makes the messages less easy to inspect, and less easy
> to write integration tests for.
Only marginally. Shifting 3 bytes together is not exactly rocket science.
>> The payload is an s-exp encoded as UTF8 text.
>
> Normalising on utf-8 and counting bytes sounds like it would solve the
> original issue without changing to a binary encoding of the message
> length.
Right. It would not be backward compatible, tho.
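For comparison, the alternative discussed above (keep a textual hex-string length but count UTF-8 bytes instead of characters) might look like the following Python sketch. The function names are hypothetical, and the header width of 6 hex digits is an assumption, since the width is not stated in this thread.

```python
def encode_hex_frame(sexp_text, width=6):
    """Prefix the UTF-8 payload with its byte length as fixed-width hex.

    width=6 is an assumption made for this sketch.
    """
    payload = sexp_text.encode("utf-8")
    header = format(len(payload), "0{}x".format(width)).encode("ascii")
    if len(header) != width:
        raise ValueError("payload too long for the header width")
    return header + payload

def decode_hex_frame(data, width=6):
    """Parse one frame from a bytes object and return the s-exp text."""
    length = int(data[:width].decode("ascii"), 16)
    payload = data[width:width + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return payload.decode("utf-8")

# The made-up message below is 9 characters long but 10 bytes in UTF-8.
frame = encode_hex_frame('(:ok "é")')
print(frame[:6])  # prints: b'00000a'
```

Such a frame stays readable in a packet dump, but the header now holds a byte count (10 here) where a character count would be 9, so a peer that still counts characters would mis-frame any message containing non-ASCII text.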
Helmut