[GSLL-devel] Efficient access to externally generated double-float arrays?

Wed Oct 27 03:22:06 UTC 2010

Sebastian,

Can you temporarily define this and find the timing/consing for your
test case:

(defmethod gref* ((object vector-double-float) linearized-index)
  (cffi:mem-aref
   (foreign-pointer object)
   :double
   linearized-index))

(I think you don't use any matrices but if you do, define an analogous
function for matrix-double-float.)

As you can see, it has the literal type declaration, and I'm hopeful
that CFFI will pick that up and make this competitive in speed with
the best that you saw.  If that's so, it should be fairly easy for me
to make this generic and incorporate it into GSD.  I'm still
interested in making the linearization more efficient if that's still
significant, but let's try this for now to see how much speed we can
squeeze out of gref*.
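
To illustrate why the literal :double might matter, here is a minimal sketch of a compiler macro that specializes a call only when the type argument is a constant known at compile time. FETCH-CELL is a hypothetical name, and this is not CFFI's actual implementation; it only shows the general mechanism in portable Common Lisp.

```lisp
;; FETCH-CELL is hypothetical, for illustration only.
;; VEC is assumed to be a (simple-array double-float (*)).

(defun fetch-cell (vec type index)
  "Generic path: TYPE is inspected at run time on every call."
  (ecase type
    (:double (aref vec index))
    (:float (coerce (aref vec index) 'single-float))))

(define-compiler-macro fetch-cell (&whole form vec type index)
  ;; TYPE here is the unevaluated argument form. If it is the literal
  ;; keyword :double, expand into a specialized AREF; otherwise decline
  ;; by returning the original FORM, so the generic function is called.
  (if (eq type :double)
      `(aref (the (simple-array double-float (*)) ,vec) ,index)
      form))

;; Inspect what the compiler macro would do with each kind of call:
(defun show-expansions ()
  (let ((expander (compiler-macro-function 'fetch-cell)))
    (list (funcall expander '(fetch-cell v :double 3) nil)
          (funcall expander '(fetch-cell v elt-type 3) nil))))
```

Calling (show-expansions) returns the specialized AREF form for the first call and the unchanged FETCH-CELL form for the second, which is the situation you get when the type is held in a variable such as elt-type.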

Thanks,

Liam

On Tue, Oct 26, 2010 at 10:25 AM, Sebastian Sturm
<Sebastian.Sturm at itp.uni-leipzig.de> wrote:
> It seems that CFFI includes some compiler macros that use type information
> supplied at compile time to generate more efficient code (I got that from the
> cffi mailing list,
> http://www.mail-archive.com/cffi-devel@common-lisp.net/msg01154.html).
> In my case, I'm using this optimization by supplying :double to
> cffi:mem-aref. If I replace this by (cl-cffi (element-type zvector)), as is
> done internally by gref, then (again with dim = 50) better-force-function
> uses around 1.8 GCycles and conses 80 MB in the process, whereas the :double
> version needs ~ 8.6 MCycles and does not cons at all. The slow-but-flexible
> version of better-force-function reads as follows:
> (defun better-force-function (dim)
>   "Given an integer dim, this constructs a function that, when supplied with
>    an N-dimensional vector Z and some output vector (-> pointer?), yields the
>    corresponding forces"
>   (declare (fixnum dim))
>   (let ((temp-values (make-array 2 :element-type 'double-float
>                                    :initial-element 0.0d0)))
>     (lambda (zvector output)
>       (let ((zvector-fptr (grid::foreign-pointer zvector))
>             (output-fptr (grid::foreign-pointer output))
>             ;; this makes it worse
>             (elt-type (grid:cl-cffi (grid:element-type zvector))))
>         (macrolet ((quick-ref (the-vector n)
>                      `(cffi:mem-aref
>                        ,(case the-vector
>                           (zvector 'zvector-fptr)
>                           (output 'output-fptr))
>                        ;; :double
>                        elt-type ;; replace this by :double
>                        ,n)))
>           (do ((i 0 (1+ i))) ((= i dim)) (declare (fixnum i))
>             (setf (aref temp-values 0) 0.0d0)
>             (do ((m 0 (1+ m))) ((> m i)) (declare (fixnum m))
>               (do ((n i (1+ n))) ((= n dim)) (declare (fixnum n))
>                 (setf (aref temp-values 1) 0.0d0)
>                 (do ((k m (1+ k))) ((> k n)) (declare (fixnum k))
>                   ;; generates efficiency warnings when using elt-type
>                   (incf (aref temp-values 1) (quick-ref zvector k)))
>                 (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
>             (setf (quick-ref output i)
>                   (- (quick-ref zvector i)
>                      (aref temp-values 0)))))))))
> Also, with the variable type left unspecified at compile time, the innermost
> loop generates efficiency warnings telling me that generic-+ needs to be
> used. Writing (the double-float (quick-ref zvector k)) removes these and
> slightly reduces the consing amount of the slow variant to ~ 63 MB. I still
> have to try the SLIME profiler though.
> thanks,
> Sebastian
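
For a quick timing/consing comparison that is independent of GSLL, something along these lines could serve as a rough sketch. It assumes only that CFFI is already loaded, for example via (asdf:load-system :cffi). SUM-CONSTANT-TYPE and SUM-RUNTIME-TYPE are hypothetical names, and the repetition count is arbitrary.

```lisp
;; Assumes CFFI is already loaded. The two function names are
;; hypothetical and exist only for this comparison.

(defun sum-constant-type (ptr n)
  "Sum N doubles at PTR; the literal :double is visible at compile time."
  (declare (fixnum n))
  (let ((acc 0.0d0))
    (declare (double-float acc))
    (dotimes (i n acc)
      (incf acc (cffi:mem-aref ptr :double i)))))

(defun sum-runtime-type (ptr type n)
  "Same sum, but the foreign TYPE is only known at run time.
The THE form is only valid when TYPE denotes a double."
  (declare (fixnum n))
  (let ((acc 0.0d0))
    (declare (double-float acc))
    (dotimes (i n acc)
      (incf acc (the double-float (cffi:mem-aref ptr type i))))))

;; Fill a foreign array of 50 doubles and time both variants.
(cffi:with-foreign-object (ptr :double 50)
  (dotimes (i 50)
    (setf (cffi:mem-aref ptr :double i) (float i 1.0d0)))
  (time (dotimes (rep 100000) (sum-constant-type ptr 50)))
  (time (dotimes (rep 100000) (sum-runtime-type ptr :double 50))))
```

Comparing the two TIME reports (run time and bytes consed) should show whether the constant type argument makes the difference on a given implementation; the actual numbers will depend on the Lisp and the CFFI version.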