[trivial-utf-8-devel] "Overlong byte sequence found" when encoding/decoding #\U+0800

Anton Vodonosov avodonosov at yandex.ru
Sun Sep 4 23:18:56 UTC 2011


Hello.

When I encode the character #\U+0800 to UTF-8 and then decode it back,
an "Overlong byte sequence found" condition is signaled.

Here is a self-contained test case:

(let* ((str (string (code-char #x0800)))
       (utf-8-bytes (trivial-utf-8:string-to-utf-8-bytes str)))
  ;; The next form signals "Overlong byte sequence found".
  (trivial-utf-8:utf-8-bytes-to-string utf-8-bytes))

I am using trivial-utf-8-20101006-darcs from a recent Quicklisp distribution.
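
For reference, U+0800 is the smallest code point that needs three bytes
in UTF-8, so its encoding #xE0 #xA0 #x80 is already the shortest form
and should not count as overlong. Below is a minimal sketch of the
three-byte encoding done by hand, assuming only the standard UTF-8 bit
layout. The function encode-three-byte is a hypothetical helper written
for this message; it is not part of trivial-utf-8.

```lisp
;; Hypothetical helper, not part of trivial-utf-8.
;; Encodes one code point from the three-byte range U+0800..U+FFFF
;; using the layout 1110xxxx 10xxxxxx 10xxxxxx.
;; It does not reject surrogates (U+D800..U+DFFF).
(defun encode-three-byte (code)
  (assert (<= #x0800 code #xFFFF))
  (list (logior #xE0 (ldb (byte 4 12) code))   ; top 4 bits
        (logior #x80 (ldb (byte 6 6) code))    ; middle 6 bits
        (logior #x80 (ldb (byte 6 0) code))))  ; low 6 bits

(encode-three-byte #x0800) ; => (224 160 128), i.e. #xE0 #xA0 #x80
```

If string-to-utf-8-bytes returns these same three bytes for the test
case above, the encoder is fine and the problem would seem to be in the
decoder's overlong check, though I have not verified that.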

Best regards,
- Anton



