From avodonosov at yandex.ru  Sun Sep  4 23:18:56 2011
From: avodonosov at yandex.ru (Anton Vodonosov)
Date: Mon, 05 Sep 2011 03:18:56 +0400
Subject: [trivial-utf-8-devel] "Overlong byte sequence found" when
	encoding/decoding #\U+0800
Message-ID: <578881315178336@web30.yandex.ru>

Hello.

When I encode the character #\U+0800 to UTF-8 and then decode it back,
an "Overlong byte sequence found" condition is signaled.

Here is the self-contained test case:

(let* ((str (string (code-char #x0800)))
       (utf-8-bytes (trivial-utf-8:string-to-utf-8-bytes str)))
  ;; Signals "Overlong byte sequence found".
  (trivial-utf-8:utf-8-bytes-to-string utf-8-bytes))

I am using trivial-utf-8-20101006-darcs from the recent Quicklisp.

Best regards,
- Anton


From marijnh at gmail.com  Wed Sep  7 08:46:30 2011
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Wed, 7 Sep 2011 10:46:30 +0200
Subject: [trivial-utf-8-devel] "Overlong byte sequence found" when
 encoding/decoding #\U+0800
In-Reply-To: <578881315178336@web30.yandex.ru>
References: <578881315178336@web30.yandex.ru>
Message-ID: <CAJnHWXuvHGPS8Lcu8jLwVF4CxqYWH7DwA41V=qf_rAYPT8GheQ@mail.gmail.com>

Hello Anton,

There was a > used where >= was needed (#x800 is the smallest code
point that needs 3 bytes). I've pushed a fix to the darcs repository.

Best,
Marijn
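
A minimal sketch of the boundary check discussed in this thread, for
readers who want to see why > versus >= matters. It is not the
trivial-utf-8 source: the function names below are hypothetical and
only model the comparison. A decoded code point is overlong when it is
smaller than the smallest code point that needs that many bytes, so a
code point equal to that minimum (such as #x800 in 3 bytes, encoded as
E0 A0 80) has to be accepted.

```lisp
;; Hypothetical helpers for illustration; not part of trivial-utf-8.

(defun min-code-point-for-length (n-bytes)
  "Smallest code point that needs N-BYTES bytes in UTF-8."
  (ecase n-bytes
    (1 #x0)
    (2 #x80)
    (3 #x800)
    (4 #x10000)))

(defun well-formed-length-p (code-point n-bytes)
  "True unless CODE-POINT would have fit in fewer than N-BYTES bytes."
  ;; >= accepts the boundary value itself, e.g. #x800 in 3 bytes.
  (>= code-point (min-code-point-for-length n-bytes)))

(defun buggy-well-formed-length-p (code-point n-bytes)
  "Same check written with >, which wrongly rejects the boundary value."
  (> code-point (min-code-point-for-length n-bytes)))

;; #x800 in 3 bytes is fine with >=, but rejected with >.
(list (well-formed-length-p #x800 3)
      (buggy-well-formed-length-p #x800 3)
      ;; #x7FF fits in 2 bytes, so a 3-byte encoding of it is overlong.
      (well-formed-length-p #x7FF 3))
```

With the strict comparison, only the exact boundary code point is
misclassified, which would explain why the problem shows up for
#\U+0800 and not for its neighbours.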