[cffi-devel] Re: must cffi-sys::with-pointer-to-vector-data go?

Wed Jan 25 19:36:29 UTC 2006

Quoting Hoehle, Joerg-Cyril (Joerg-Cyril.Hoehle at t-systems.com):
> cffi-sys::with-pointer-to-vector-data, as is, is highly problematic.
> Trying to get the base address of a Lisp vector in memory is
> unportable and subject to subtle errors with a moving GC.  I'll try
> and suggest a better API below.
> 
> On CMUCL, it's implemented using sys:without-gcing, which is sign
> enough of a problem.

I think you have argued quite convincingly that WITHOUT-GCING is a very
bad implementation strategy for pinned vectors.

However, I don't think that the concept of WITH-POINTER-TO-VECTOR-DATA
is entirely bogus just because some backends use a bad implementation.

I believe that, on implementations which don't support this idiom
natively, the right thing is to temporarily create a foreign vector and
copy from/to that foreign scratch space in the implementation of the
macro.  That's not as fast as possible, but it is safe and portable.
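
A minimal sketch of that copying fallback, assuming a current CFFI is
loaded and the vector holds (unsigned-byte 8) elements.  The macro name
WITH-COPIED-VECTOR-DATA is hypothetical and not part of CFFI; only
FOREIGN-ALLOC, FOREIGN-FREE and MEM-AREF are real CFFI operators.

```lisp
;; Assumes CFFI is available, e.g. via (asdf:load-system :cffi).
(defmacro with-copied-vector-data ((ptr-var vector) &body body)
  "Hypothetical fallback: bind PTR-VAR to foreign scratch space holding a
copy of VECTOR's bytes, run BODY, then copy the bytes back into VECTOR."
  (let ((vec (gensym "VEC")) (len (gensym "LEN")) (i (gensym "I")))
    `(let* ((,vec ,vector)
            (,len (length ,vec))
            ;; Allocate at least one byte so empty vectors are harmless.
            (,ptr-var (cffi:foreign-alloc :unsigned-char
                                          :count (max 1 ,len))))
       (unwind-protect
            (progn
              ;; Copy in: Lisp vector -> foreign buffer.
              (dotimes (,i ,len)
                (setf (cffi:mem-aref ,ptr-var :unsigned-char ,i)
                      (aref ,vec ,i)))
              (multiple-value-prog1 (progn ,@body)
                ;; Copy out: foreign buffer -> Lisp vector.  This runs
                ;; only when BODY returns normally.
                (dotimes (,i ,len)
                  (setf (aref ,vec ,i)
                        (cffi:mem-aref ,ptr-var :unsigned-char ,i)))))
         ;; Always release the scratch space, even on non-local exit.
         (cffi:foreign-free ,ptr-var)))))
```

Because nothing here depends on the GC, the vector can be any vector of
octets, including one created by user code.  The price is two copies
per call.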

Note that Java's JNI uses this strategy, too.

However, there is a big difference between JNI and CFFI's current
proposal: For JNI, the primitives in question can be used for *any*
array, even arrays that were created by random user code.  So there is
an equivalent of WITH-POINTER-TO-VECTOR-DATA, but no equivalent of
MAKE-SHARABLE-BYTE-VECTOR is necessary.

And here's why this matters:

> BTW, cl+ssl:stream-read-sequence uses
>                 (replace thing buf :start1 start :end1 (+ start length))
> anyway.  So there's copying even when sharing. [...]

... the problem is that we would *like* to simply pass the user's vector
into those functions, but we cannot expect the user to have created
a sharable vector.  (So we are currently copying the vector manually
instead, but that's obviously bogus.)

To make the sharable vector interface in CFFI useful for CL+SSL, the
interface would have to be changed so that WITH-POINTER-TO-VECTOR-DATA
is guaranteed to work for any vector.

(It could make sense to keep MAKE-SHARABLE-BYTE-VECTOR, and say that
applications will have a greater chance to actually get non-copying
behaviour if they create their vectors this way.  As you mentioned, ACL
has such vectors.  To make that work with the new interface, there would
have to be some way to look at a vector someone else created, and find
out whether it is sharable or not, so that WITH-POINTER-TO-VECTOR-DATA
could decide at runtime which implementation strategy to use.)

> I believe the design should be the opposite: an efficient copy-in or
> copy-out may resort to with-pointer-to-vector-data and possibly to
> si::without-gcing (is that thread-safe at all?) to quickly copy the
> vector and do nothing more than that (no callbacks, no signals, etc.).
> 
> To return to the CL+SSL example, I suggest to use a foreign-alloc'ed
> buffer for ssl-read etc., then copy that into a Lisp vector.

Well, I would prefer the interface I've explained above.

(Perhaps with the :in and :out arguments you mentioned.)

> I think this is the best one can achieve portably, without resorting
> to very specialised features like IIRC Allegro's ability to allocate
> Lisp vectors at non-moving locations.
[...]

I didn't think about Allegro when I started using this interface.  It's
only now that I realize how Allegro's allocation strategies must have
influenced the current proposal.

What I had in mind was SBCL, which has a macro called
WITH-PINNED-OBJECTS that does exactly what CFFI needs.
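
For illustration, a sketch of how the macro could sit on top of that
facility.  The name WITH-PINNED-VECTOR-DATA is hypothetical;
SB-SYS:WITH-PINNED-OBJECTS and SB-SYS:VECTOR-SAP are SBCL's own
operators.  It assumes an SBCL port where pinning is supported and a
simple vector with a specialized element type such as
(unsigned-byte 8).

```lisp
#+sbcl
(defmacro with-pinned-vector-data ((ptr-var vector) &body body)
  "Hypothetical SBCL-only sketch: pin VECTOR for the extent of BODY and
bind PTR-VAR to the address of its first element.  No copying is done."
  (let ((vec (gensym "VEC")))
    `(let ((,vec ,vector))
       ;; The GC will not move the vector while BODY is running.
       (sb-sys:with-pinned-objects (,vec)
         ;; VECTOR-SAP needs a simple, specialized (unboxed) vector.
         (let ((,ptr-var (sb-sys:vector-sap ,vec)))
           ,@body)))))

#+sbcl
(defun third-octet (octets)
  "Usage example: read the octet at index 2 through the raw pointer."
  (with-pinned-vector-data (p octets)
    (sb-sys:sap-ref-8 p 2)))
```

The pointer is only valid inside the body; it must not be stored or
used after the macro returns.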

So is it a good idea to add an interface to CFFI that would, currently,
be guaranteed to be efficient on some Lisps only?

That's a question the CFFI maintainers would have to decide.  But my
impression was that CFFI, in contrast to UFFI, is willing to implement
features even if not every Lisp supports them.

Perhaps it would influence the decision to know which Lisps there are
that can pin objects in memory.
 * SBCL (only on gencgc ports, but the gencgc porting committee is meant
   to fix that real soon now...  Implementing the macro on
   non-conservative gencgc ports would have to be done a little
   differently, but the page pinning mechanism should in theory work on
   those ports, too.)
 * CMUCL does not have the macro SBCL offers, but having the same GC,
   it should be trivial to implement.  I guess nobody has done that yet
   because CMUCL doesn't have preemptive threads and therefore less need
   for it.
 * ECL with Boehm GC should have no trouble because objects don't move
   at all (?)
 * CLISP: I guess not, but would it be hard to implement?
 * OpenMCL: don't know
 * The commercial Lisps: Don't know, but they'll implement it if 
   their paying customers start asking for it.  ;-)

d.