[cffi-devel] a thought on string encodings
James Bielman
jamesjb at jamesjb.com
Mon Jan 2 11:47:09 UTC 2006
On Thu, 2005-12-22 at 18:50 +0100, Hoehle, Joerg-Cyril wrote:
> I hope encoding stuff will be the next great addition to CFFI. Here's some vague idea I once had.
> I got the impression that there are (at least) two types of functions:
> - one where the conversion depends on the dynamic calling context
> - another where the conversion is fixed, i.e. depends on the function only, not on the caller (but possibly on the library).
>
> Given CFFI's post transformers, I suspect that there's an opportunity to model both kinds of functions, i.e.
> - some where defcfun expands to defaults of custom:*foreign-encoding* (in CLISP speak)
> - some where the wrappers within defcfun impose a given encoding, e.g. ASCII, ISO-8859-1, UTF-8 or UTF-16.
Hi Jörg,
I've started thinking about this. To demonstrate the new type
translator interface, I'm working on, to begin with, a UTF8-STRING type
that converts Lisp strings to and from UTF-8 on Unicode Lisps.
I want to implement this efficiently in CLISP, so I want to be sure I
use optimized C primitives as much as possible. I think I have a fairly
efficient method for conversion to a foreign string:
#+clisp
(defmethod translate-to-foreign ((s string) (name (eql 'utf8-string)))
  (ffi:with-foreign-string (ptr chars bytes s :encoding charset:utf-8)
    (declare (ignore chars))
    (let ((buf (foreign-alloc :unsigned-char :count bytes)))
      (memcpy buf ptr bytes)
      (values buf t))))
(where memcpy just calls the C function of the same name)
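For reference, a binding like that can be sketched with DEFCFUN roughly
as follows. Using :UNSIGNED-LONG for the size_t argument is an assumption
that holds on common platforms but is not guaranteed everywhere, and the
parameter names are only illustrative.

```lisp
;; Sketch of a MEMCPY binding via CFFI's DEFCFUN.
;; Assumption: size_t has the same width as unsigned long on this platform.
(cffi:defcfun ("memcpy" memcpy) :pointer
  (dest :pointer)        ; destination buffer
  (src :pointer)         ; source buffer
  (n :unsigned-long))    ; number of bytes to copy
```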
I didn't see any interface in CLISP to convert a Lisp string to a
pointer that didn't stack-allocate, but this should still be pretty
fast. (Does the CLISP FFI provide something like memcpy?)
However, I haven't been able to find an inverse for
FFI:WITH-FOREIGN-STRING. I'd like to be able to convert a pointer back
to a Lisp string without looping in bytecode to create a vector of
octets from the pointer.
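To make the slow path concrete, here is a minimal sketch of the
byte-by-byte fallback I'd like to avoid. FOREIGN-UTF8-TO-STRING and its
NBYTES argument are hypothetical names, and the sketch assumes the caller
already knows the byte length of the foreign data.

```lisp
;; Slow fallback: copy octets one at a time in Lisp, then decode.
;; FOREIGN-UTF8-TO-STRING is a hypothetical helper, not part of CFFI or CLISP.
;; Assumption: NBYTES is the length in bytes, not counting any trailing NUL.
#+clisp
(defun foreign-utf8-to-string (ptr nbytes)
  (let ((octets (make-array nbytes :element-type '(unsigned-byte 8))))
    ;; This loop is the part that runs in bytecode.
    (dotimes (i nbytes)
      (setf (aref octets i) (cffi:mem-aref ptr :unsigned-char i)))
    ;; Decode the octet vector as UTF-8 into a Lisp string.
    (ext:convert-string-from-bytes octets charset:utf-8)))
```

The decoding step itself is a C primitive; the cost is in filling the
octet vector one element at a time.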
So, I think I need that block interface we've talked about. I tried a
whole bunch of combinations of FFI:MEMORY-AS with FFI:C-ARRAY-PTR types
and got nothing but segfaults. Is there something I can use to convert
the pointer either to a vector of octets (which I can pass to
EXT:CONVERT-STRING-FROM-BYTES) or to a Lisp string directly?
Thanks,
James