[cffi-devel] Road to 0.9.3 (and encoding issues)
Luis Oliveira
luismbo at gmail.com
Tue Apr 17 00:10:33 UTC 2007
Hello,
The cffi-newtypes branch[1] is getting huge (~30 patches), and since it
seems nobody has complained much about the new type system, I'd like to
push it into the main branch along with the encoding support and other
minor features and bugfixes that have accumulated over the last ~2 months.
Any objections?
Then, after some remaining issues are fixed, I'd like to release 0.9.3.
So here's my TODO list for 0.9.3:
- <http://article.gmane.org/gmane.lisp.cffi.devel/1033> Figure out
whether any of those issues are critical. The only change I've made
to the type system since I wrote that message was a new :class option
to DEFSTRUCT;
- iron out some issues with the new encoding support (see below, this
might take a while);
- finish the documentation of the new features;
- integrate cffi-grovel into the CFFI tree and document it.
There are some issues with the current encoding support though:
- James' original code had a CFFI-SYS:DEFAULT-ENCODING function which
would use some implementation-specific way of determining what the
default encoding should be. Every implementation had a different
way of determining that, so I thought that simply picking one
ourselves (say, :utf-8) would be better.
Also, I changed that to be a special variable instead. That way
*DEFAULT-FOREIGN-ENCODING* can be bound to something else and affect
the behaviour of the :string type and other string operations that
don't explicitly specify an encoding. (E.g., the behaviour of
:STRING would be influenced at run-time by the value of *D-F-E*
whereas (:STRING :ENCODING :UTF-8) would not.)
- Allegro's %lisp-string-into-foreign overflows. Allegro's
EXCL:STRING-TO-NATIVE doesn't take a bufsize argument. Another
problem is that Allegro's %lisp-string-octet-length isn't very
effective; otherwise it'd be easier to check for an overflow.
Anyway, this is not the worst issue.
- Corman, SCL and ECL support is broken, not necessarily because of
the new string stuff. Also, a recent 1.1 snapshot of OpenMCL with
unicode support is necessary; is this a problem?
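To make the special-variable point above concrete, here is a rough sketch
in Python rather than Lisp. It is only an analogy, not CFFI code: the
names default_foreign_encoding and to_foreign are hypothetical, and the
context manager merely imitates what rebinding *DEFAULT-FOREIGN-ENCODING*
with LET would do. An explicit encoding argument plays the role of
(:STRING :ENCODING :UTF-8) and ignores the default.

```python
from contextlib import contextmanager

# Hypothetical stand-in for *DEFAULT-FOREIGN-ENCODING*; not a CFFI name.
_default_foreign_encoding = "utf-8"


@contextmanager
def default_foreign_encoding(name):
    """Temporarily rebind the default, restoring it on exit (like LET)."""
    global _default_foreign_encoding
    saved = _default_foreign_encoding
    _default_foreign_encoding = name
    try:
        yield
    finally:
        _default_foreign_encoding = saved


def to_foreign(text, encoding=None):
    """Encode text; an explicit encoding overrides the current default."""
    chosen = encoding if encoding is not None else _default_foreign_encoding
    return text.encode(chosen)


if __name__ == "__main__":
    print(to_foreign("ç"))                        # uses the default
    with default_foreign_encoding("iso-8859-1"):
        print(to_foreign("ç"))                    # follows the rebinding
        print(to_foreign("ç", encoding="utf-8"))  # explicit, unaffected
```

The design choice being illustrated: callers that name an encoding get
stable behaviour, while everyone else follows whatever default is in
effect at run-time.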
Last but definitely not least, there's a huge problem with error
semantics. Everything works fine until you try to, e.g. convert a #\λ
into iso-8859-1. Or #\ç into ascii.
Some Lisps treat ascii as a synonym for iso-8859-1. Some silently
substitute inconvertible characters with #\? (or #\Sub) while others
will raise an exception. Of those that do raise exceptions, some
provide a use-value substitution restart, others don't. To sum it up,
it's horribly inconsistent.
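For comparison, here is a small sketch of the two main behaviours
described above, written in Python only because its codecs expose both
policies under one interface. It says nothing about how any particular
Lisp behaves; encode_demo is a hypothetical helper name. The "strict"
policy corresponds to signalling an error, "replace" to silently
substituting #\?.

```python
def encode_demo(text, encoding):
    """Encode text under two error policies and report what each does."""
    results = {}
    for policy in ("strict", "replace"):
        try:
            results[policy] = text.encode(encoding, errors=policy)
        except UnicodeEncodeError as exc:
            # exc.start is the index of the first inconvertible character.
            results[policy] = f"error at index {exc.start}"
    return results


if __name__ == "__main__":
    print(encode_demo("λ", "iso-8859-1"))  # λ has no iso-8859-1 code point
    print(encode_demo("ç", "ascii"))       # ç is outside 7-bit ascii
    print(encode_demo("ç", "iso-8859-1"))  # convertible, both policies agree
```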
Before forward porting James's cffi-encodings branch, we discussed the
possibility of doing the encoding conversion in portable CL, like what
flexi-streams does for instance. I'm beginning to reconsider this idea.
On the one hand it would mean introducing a dependency for CFFI; on the
other hand it would provide consistent semantics across the various
Lisps we support (including useful semantics for Lisps that don't
support unicode) and simplify CFFI-SYS. Any thoughts?
[1] http://common-lisp.net/~loliveira/darcs/cffi-newtypes/
--
Luís Oliveira
http://student.dei.uc.pt/~lmoliv/