[GSLL-devel] Efficient access to externally generated double-float arrays?

Mon Nov 8 03:59:16 UTC 2010

OK, this turned out to be a lot harder than I thought.  I have done
three things:
1) I have defined gref* and (setf gref*) methods specific to each of
the foreign-array types that call cffi:mem-aref.  This gives about a
20x speedup because I am now passing the literal type to
cffi:mem-aref, so it can take advantage of that at compile time.
2) There is now a compiler macro to turn a grid:gref into a grid:gref*
if there's only one index.  This gives about a 2x speedup when gref is
used.
3) There is another compiler macro that turns a grid:gref* into a
cffi:mem-aref directly if the foreign array is declared.  This gives
about a 400x speedup overall, similar to your "hardwired" result.  It
is a bit slower because I'm not able to precompute the pointer; it has
to be recomputed on each gref* call.
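
The idea behind 2) and 3) can be sketched without GSLL or CFFI.  The
names slow-ref, fast-vector and sum-elements below are hypothetical
stand-ins (they are not part of grid or foreign-array), and a plain
Lisp vector stands in for the foreign array.  This is only an
illustration of a compiler macro that rewrites a call when the
argument is wrapped in a (the ...) form, not the actual GSLL code.

```lisp
;; Hypothetical type standing in for vector-double-float.
(deftype fast-vector () '(simple-array double-float (*)))

;; Generic accessor: dispatches at run time, much as grid:gref does.
(defgeneric slow-ref (object index))

(defmethod slow-ref ((object vector) index)
  (aref object index))

;; If the first argument is literally (the fast-vector ...), expand into
;; a direct AREF; otherwise return the form unchanged so that the
;; generic function is called as usual.
(define-compiler-macro slow-ref (&whole form object index)
  (if (and (consp object)
           (eq (first object) 'the)
           (eq (second object) 'fast-vector))
      `(aref ,object ,index)
      form))

;; Usage: the (the ...) wrapper is what triggers the rewrite.
(defun sum-elements (v n)
  (let ((sum 0.0d0))
    (dotimes (i n sum)
      (incf sum (slow-ref (the fast-vector v) i)))))
```

A compiler macro only sees source forms, which is why the type has to
be visible in the call itself unless the environment can be queried.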

On the last point, there is a caveat.  I tried to make it work when
the foreign array has been declared with a standard (declare ...)
form.  This has a chance of working on SBCL because of its support for
the CLtL2 function variable-information, which was dropped from CL
before it went to ANSI for standardization.  However, it did not work
for me; I will keep trying to get this working.  In the meantime, the
only way to declare the type so that 3) applies is with a (the ...)
form, e.g. (grid:gref (the vector-double-float zvector) i).  This is
kind of annoying, but it is portable, and it lets you avoid dropping
down to lower-level functions (i.e., cffi:mem-aref).
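
As a rough sketch of the two routes just described, a helper like the
one below could be called from a compiler macro (passing its
&environment argument) to find out what type a form is known to have.
The name declared-type is hypothetical, and the SBCL branch assumes
the sb-cltl2 contrib module is available.  Whether a (declare (type
...)) in an enclosing lambda actually shows up through
variable-information depends on the implementation; as noted above, it
did not work for me in practice.

```lisp
;; SBCL-only part: sb-cltl2 provides VARIABLE-INFORMATION.
#+sbcl
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-cltl2))

(defun declared-type (form &optional env)
  "Return the type FORM is known to have at compile time, or NIL."
  (cond
    ;; Portable case: (the TYPE EXPR) carries the type in the form itself.
    ((and (consp form) (eq (first form) 'the))
     (second form))
    ;; SBCL case: look for a TYPE entry among the declarations recorded
    ;; for the variable in the environment ENV.
    #+sbcl
    ((symbolp form)
     (cdr (assoc 'type
                 (nth-value 2 (sb-cltl2:variable-information form env)))))
    ;; Anything else: nothing is known.
    (t nil)))
```

For example, (declared-type '(the vector-double-float zvector))
returns the symbol vector-double-float on any implementation, since
that branch only inspects the form.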

So, for example, my rewrite of your function (in
foreign-array/tests/fast-array-access.lisp)
(defun gref-access (dim)
  "Given an integer dim, this constructs a function that, when supplied with an
   N-dimensional vector Z and some output vector (-> pointer?), yields the
   corresponding forces"
  (let ((temp-values (make-array 2 :element-type 'double-float
                                   :initial-element 0.0d0)))
    (lambda (zvector output)
      (declare (fixnum dim)
               (optimize (speed 3) (safety 0) (debug 0))
               ;; This declaration is useless, but ought not to be!
               (type vector-double-float zvector))
      (do ((i 0 (1+ i))) ((= i dim)) (declare (fixnum i))
        (setf (aref temp-values 0) 0.0d0)
        (do ((m 0 (1+ m))) ((> m i)) (declare (fixnum m))
          (do ((n i (1+ n))) ((= n dim)) (declare (fixnum n))
            (setf (aref temp-values 1) 0.0d0)
            (do ((k m (1+ k))) ((> k n)) (declare (fixnum k))
              ;; This declaration does the work!
              (incf (aref temp-values 1)
                    (grid:gref (the vector-double-float zvector) k)))
            (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
        (setf (grid:gref output i)
              ;; This one does too!
              (- (grid:gref (the vector-double-float zvector) i)
                 (aref temp-values 0)))))))
is now (almost) as fast as your cffi-access.

There is still a bunch of stuff to be done --- the optimizations only
work for vectors, not higher-dimensional arrays, and I haven't yet
defined a compiler macro for setf in 3).  But it's a start; try it in
your problem and let me know how it performs.

Liam