[drakma-devel] drakma and non-ASCII content

Fri Mar 9 15:24:44 UTC 2012

A co-worker of mine had some problems today using Drakma to POST a STRING
containing non-ASCII characters encoded as UTF-8.  He was doing something
equivalent to:

(http-request "http://zappa.com/favicon.ico"
              :method :post
              :content (concatenate 'string
                                    "hello" (string #\white_square) "world")
              :external-format-out :utf-8)

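which ends up setting the Content-Length header value to 11, which is the
LENGTH in characters of the string being sent, rather than 13, which is its
length in octets once encoded as UTF-8 (the white square, U+25A1, takes three
octets).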
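
To make the mismatch concrete, here is a minimal sketch that counts the octets
a string would occupy as UTF-8.  The names UTF-8-OCTET-LENGTH and *CONTENT* are
made up for illustration and are not part of Drakma, and the code assumes an
implementation where CHAR-CODE returns Unicode code points (SBCL and Clozure CL
do, for example).

```lisp
;; Hypothetical helper, not part of Drakma: count the octets STRING would
;; occupy once encoded as UTF-8.  Assumes CHAR-CODE yields Unicode code points.
(defun utf-8-octet-length (string)
  "Return the number of octets STRING occupies when encoded as UTF-8."
  (loop for char across string
        for code = (char-code char)
        sum (cond ((< code #x80) 1)       ; ASCII
                  ((< code #x800) 2)
                  ((< code #x10000) 3)    ; includes U+25A1 WHITE SQUARE
                  (t 4))))

;; Same string as in the request above, built portably via CODE-CHAR.
(defparameter *content*
  (concatenate 'string "hello" (string (code-char #x25A1)) "world"))

(length *content*)              ; => 11, the value sent as Content-Length
(utf-8-octet-length *content*)  ; => 13, the number of octets actually sent
```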

The documentation says that "if content is a sequence, Drakma will use LENGTH
to determine its length and will use the result for the 'Content-Length' header
sent to the server," so Drakma is working as documented.

Also, I think I understand why this behavior is the default.  You don't want to
scan a content string in order to determine what its length will be in octets
once it has been encoded.  My co-worker could have worked around this by holding
his UTF-8 encoded data in a vector with element type (unsigned-byte 8).
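
As a rough sketch of that workaround, the string can be encoded by hand before
it is given to Drakma.  UTF-8-ENCODE below is a hypothetical helper written for
this message, not a Drakma function; it assumes CHAR-CODE returns Unicode code
points and does nothing special about lone surrogates.  In practice a library
routine such as FLEXI-STREAMS:STRING-TO-OCTETS with :EXTERNAL-FORMAT :UTF-8
would probably be the better choice.

```lisp
;; Hypothetical helper, not part of Drakma: encode STRING as UTF-8 octets.
(defun utf-8-encode (string)
  "Return a fresh (unsigned-byte 8) vector holding STRING encoded as UTF-8."
  (let ((octets (make-array (length string)
                            :element-type '(unsigned-byte 8)
                            :adjustable t :fill-pointer 0)))
    (flet ((emit (octet) (vector-push-extend octet octets)))
      (loop for char across string
            for code = (char-code char)
            do (cond ((< code #x80)
                      (emit code))
                     ((< code #x800)
                      (emit (logior #xC0 (ash code -6)))
                      (emit (logior #x80 (logand code #x3F))))
                     ((< code #x10000)
                      (emit (logior #xE0 (ash code -12)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F))))
                     (t
                      (emit (logior #xF0 (ash code -18)))
                      (emit (logior #x80 (logand (ash code -12) #x3F)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F)))))))
    ;; Copy into a simple vector whose LENGTH is exactly the octet count.
    (coerce octets '(simple-array (unsigned-byte 8) (*)))))

(defparameter *octets*
  (utf-8-encode
   (concatenate 'string "hello" (string (code-char #x25A1)) "world")))

(length *octets*)  ; => 13
```

Passing a vector like *OCTETS* as :content in place of the string means LENGTH,
and with it the Content-Length header, counts octets rather than characters.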

However, I think the current Drakma behavior may be a mistake.  People who want
high performance are probably manipulating encoded strings as vectors of
(unsigned-byte 8).  It's the casual users, those sending strings, who are the
ones most likely to be bitten by the default behavior, but only when their
strings contain non-ASCII characters.

bob