[cl-rdbms-devel] hu.dwim.rdbms.oracle-utf-problems-8

Attila Lendvai attila.lendvai at gmail.com
Wed Nov 3 23:26:47 UTC 2010


>>in short: you need to convert between lisp strings and byte arrays,
>>sending/receiving byte arrays to/from the database. the encoding
>>needed for OCI, oracle's C interface, is utf-16.
>>
>
> Is getting strings in utf-16 at this point a feature or a bug?


you don't get utf-16 strings. you get lisp strings (which consist of
characters, and whose encoding is an implementation detail), which have
been constructed in a bogus way. so, yes, it's a bug (or
misconfiguration).

but setting up oracle is still a pita, my efforts have failed so far.
will play with it a bit more tomorrow.


> Asked another way: Am I supposed to convert utf-16 strings to lisp strings
> or should that already be in there?


the byte array -> lisp strings conversion should have been done
properly. (it's been done in a bogus way).


> On the other hand, this being a feature would makes sense when thinking
> about where the data is supposed to be seen,
> the major web browsers. Assuming the major web browsers support utf-16.


again, the API of hu.dwim.rdbms should return lisp strings, which have
no encoding (besides the implementation detail you can't see without
looking at the sources of your lisp vm).


> But then, how do I go about manipulation utf-16 string data with sbcl.
> Do I have to take a harder look at babel or flexi-streams?


hu.dwim.rdbms should have decoded utf-16 into lisp strings properly.
you, as a user of it, have nothing to do. as a developer, it's a bug
and/or misconfiguration to be found.


>>utf-8 is nowhere in the picture (if not the encoding emacs/slime uses
>>to communicate with the cl process).
>>
>
> Isn't sbcl able to use utf-8 to represent its lisp strings?


it is able to *export* its strings into byte arrays using various
encodings. but it's a very different thing from using utf-8 to
represent lisp strings.
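
a minimal sketch of that difference, assuming sbcl: it uses sbcl's own
sb-ext:string-to-octets (not hu.dwim.rdbms, and not babel), and the
function name encode-both-ways is made up for illustration. the string
is measured in characters; an encoding only shows up once the string is
exported to a byte vector.

```lisp
;; sbcl-specific sketch; sb-ext:string-to-octets is an sbcl built-in.
;; encode-both-ways is a hypothetical helper, not part of any library.
(defun encode-both-ways (string)
  "Return STRING's length in characters, its utf-8 octets and its utf-16le octets."
  (list (length string)
        (sb-ext:string-to-octets string :external-format :utf-8)
        (sb-ext:string-to-octets string :external-format :utf-16le)))

;; same lisp string, two different byte vectors:
;; 3 characters, 3 octets as utf-8, 6 octets as utf-16le
(encode-both-ways "foo")
```

the string itself is the same object in both cases; only the exported
byte vectors differ.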


> Babel isn't able to convert between utf-16 and utf-8, yet? Confusion reigns
> ...
> ORACLE> (let ((octet-array (make-array 6 :element-type '(unsigned-byte 8)
>                        :initial-contents (vector #X66 #X00 #X6F #X00 #X6F #X00))))
>       (babel:octets-to-string octet-array :encoding :utf-8))
> "f@^o@^o^@"
>


babel can do these, and only these conversions:

byte vector -> lisp string
lisp string -> byte vector

in the above you feed in a utf-16 byte vector and tell babel to
decode it as utf-8.
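
a sketch of the two-step route, assuming sbcl and using its built-in
sb-ext:octets-to-string / sb-ext:string-to-octets in place of babel. the
names *utf-16le-octets*, decode-utf-16le and utf-16le->utf-8 are made up
for illustration. the byte vector is the one from the quoted example,
which is little-endian utf-16 (#x66 #x00 is "f").

```lisp
;; sbcl-only sketch; all names defined here are hypothetical helpers.
(defparameter *utf-16le-octets*
  (make-array 6 :element-type '(unsigned-byte 8)
                :initial-contents '(#x66 #x00 #x6F #x00 #x6F #x00)))

(defun decode-utf-16le (octets)
  "byte vector -> lisp string, reading OCTETS as little-endian utf-16."
  (sb-ext:octets-to-string octets :external-format :utf-16le))

(defun utf-16le->utf-8 (octets)
  "transcode in two steps: decode to a lisp string, then encode as utf-8."
  (sb-ext:string-to-octets (decode-utf-16le octets) :external-format :utf-8))

(decode-utf-16le *utf-16le-octets*)   ; a 3-character lisp string
(utf-16le->utf-8 *utf-16le-octets*)   ; a 3-octet utf-8 byte vector
```

decoding the same octets as utf-8 instead yields six characters, every
second one a NUL, which is what the quoted output shows.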

-- 
 attila



