[Ecls-list] Speed of indirect calls.

Waldek Hebisch hebisch at math.uni.wroc.pl
Thu Feb 12 02:55:24 UTC 2009

I am trying to make ECL-compiled FriCAS run faster.  FriCAS does
a lot of indirect calls, and the speed of such calls is significant
to overall speed.  Below is a little benchmark that measures the
speed of indirect calls.  To run the benchmark, store the
code in a file named "call-spd.lisp" and do the following at the
Lisp prompt:

(proclaim '(optimize (speed 3) (safety 0)))
(load (compile-file "call-spd.lisp"))

(time (do_it 10000000))

On a 2.2 GHz Core 2 running 64-bit Fedora the results are:

clisp: 5.01823s
sbcl:  0.331s
openmcl: 0.3507s
ECL: 1.563s

So the ECL results are significantly worse than those of Lisps which
directly generate machine code.  Given that the benchmark is doing 6*10^7
calls (6 calls times 10^7 iterations), ECL needs about 50 clocks
per call, which is slow.  I was able to get a significant speedup
(to 0.476 sec) by replacing the definition of the 'SPADCALL' macro with
the following:

(defmacro ECLCALL2(f x y)
   `(ffi:c-inline (,f ,x ,y) (FFI:OBJECT FFI:OBJECT FFI:OBJECT)
                  "(*((#0)->cfun.entry))(#1, #2)"
                  :one-liner t))

(defmacro SPADCALL (&rest L)
  (let ((args (butlast l))
        (fn (car (last l)))
        (gi (gensym)))
    ;; fn evaluates to a cons (function . extra-arg); call the function's
    ;; C entry point directly, passing the cdr as the last argument
    `(let ((,gi ,fn))
       (ECLCALL2 (car ,gi) ,@args (cdr ,gi)))))

However, it looks like such a definition is unsafe, because bytecoded
functions, closures and CLOS generic functions all use different
calling conventions even if the argument list is fixed.
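
To make the hazard concrete, here is a minimal C sketch.  The object
model in it (the 'fn' struct, the 'KIND_*' tags, 'call2_unchecked' and
'call2_checked') is entirely hypothetical and is NOT ECL's real data
layout; it only illustrates why jumping straight to a 'cfun.entry'-style
field is valid for one kind of function object and not for the others.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not ECL's actual types. */
typedef long value;                       /* stand-in for a Lisp object */

typedef enum { KIND_CFUN, KIND_CLOSURE } fn_kind;

typedef struct fn {
    fn_kind kind;
    union {
        /* plain compiled function: entry takes just the two arguments */
        value (*cfun)(value x, value y);
        /* closure: entry needs its environment as an extra argument */
        struct {
            value (*entry)(const value *env, value x, value y);
            const value *env;
        } closure;
    } u;
} fn;

/* Fast path in the spirit of ECLCALL2: it assumes f is a plain compiled
   function.  Passing a closure here would read the wrong union member. */
static value call2_unchecked(const fn *f, value x, value y)
{
    return f->u.cfun(x, y);
}

/* Safe variant: pays for a tag test, then uses the matching convention. */
static value call2_checked(const fn *f, value x, value y)
{
    assert(f != NULL);
    switch (f->kind) {
    case KIND_CFUN:
        return f->u.cfun(x, y);
    case KIND_CLOSURE:
        return f->u.closure.entry(f->u.closure.env, x, y);
    }
    return 0; /* not reached for valid tags */
}

/* Two sample callables, one per convention. */
static value add2(value x, value y) { return x + y; }
static value add_env(const value *env, value x, value y) { return *env + x + y; }
```

In this sketch the unchecked call is only correct when the caller already
knows the object is a plain compiled function, which is exactly the
knowledge the macro above does not have.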

It would be nice if ECL used a uniform calling convention for
functions with fixed argument lists, making the above optimization
safe (even better, ECL could just generate an efficient call).

While I am mostly interested in indirect calls, AFAICS the same problem
affects calls between files.

Passing an environment pointer to all functions (both closures and normal
functions) would make the calling conventions for normal functions and
closures the same.  Bytecoded functions can have the interpreter as the
main function and the bytecode in the environment.  Similarly for CLOS:
the main function could do the dispatch.  In the case of normal functions
the environment pointer adds a little penalty to the call proper, but
eliminating the call to 'ecl_apply_from_stack_frame' should more than
compensate for this.  For best speed on selected platforms ECL could
generate machine code trampolines for closures and use the normal C
calling convention.
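
A minimal C sketch of the uniform convention described above, under
stated assumptions: every name in it ('callable', 'entry2', 'call2',
'interp2', 'dispatch2', the toy opcodes) is hypothetical and made up for
illustration; none of it is ECL's actual runtime.  The point is only
that once every entry point takes an environment pointer first, the
caller needs no type test and no generic apply.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;                       /* stand-in for a Lisp object */

/* One signature for every two-argument callable: environment first. */
typedef value (*entry2)(void *env, value x, value y);

typedef struct {
    entry2 entry;                         /* the "main function" */
    void  *env;                           /* closure data, bytecode, table... */
} callable;

/* The whole indirect call: one load and one jump, no dispatch on kind. */
static value call2(const callable *f, value x, value y)
{
    assert(f != NULL && f->entry != NULL);
    return f->entry(f->env, x, y);
}

/* Normal function: receives the environment pointer and ignores it. */
static value plain_sub(void *env, value x, value y)
{
    (void)env;
    return x - y;
}

/* Closure: the environment holds the captured variable. */
static value closure_scale(void *env, value x, value y)
{
    const value *k = env;
    return *k * (x + y);
}

/* "Bytecoded" function: main function is an interpreter, environment is
   the bytecode (a toy accumulator machine, purely illustrative). */
typedef enum { OP_ADD, OP_MUL, OP_END } opcode;

static value interp2(void *env, value x, value y)
{
    value acc = x;
    for (const opcode *pc = env; *pc != OP_END; ++pc) {
        if (*pc == OP_ADD) acc += y;
        else               acc *= y;
    }
    return acc;
}

/* "Generic" function: main function dispatches, environment is the table. */
typedef struct { callable on_negative; callable otherwise; } dispatch_table;

static value dispatch2(void *env, value x, value y)
{
    const dispatch_table *t = env;
    return call2(x < 0 ? &t->on_negative : &t->otherwise, x, y);
}
```

Here the only cost a normal function pays is the one extra (unused)
argument, which is the small penalty mentioned above.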

Extra comment: the '(the (function ...) ...)' form in the SPADCALL macro
is there to tell the Lisp compiler that the Lisp object really is a
function, and that this function takes a fixed number of arguments
(without need for any special argument processing) and returns a single
value.  Apparently it has the desired effect on sbcl and openmcl (without
this construct sbcl generates much slower code).


(defmacro SPADCALL (&rest L)
  (let ((args (butlast l))
        (fn (car (last l)))
        (gi (gensym)))
     ;; (values t) indicates a single return value
    `(let ((,gi ,fn))
       (the (values t)
         (funcall
          (the (function ,(make-list (length l) :initial-element t) t)
            (car ,gi))
          ,@args
          (cdr ,gi))))))

(defun GETREFV (n) (make-array n :initial-element nil))

(defun op0(x d) (setf (svref d 0)
                      (the fixnum 
                            (1+ (the fixnum (svref d 0))))))
(defun op1(x d) (SPADCALL x (svref d 1)))
(defun op2(x d) (SPADCALL x (svref d 2)))

(defparameter d1 (GETREFV 3))
(defparameter d2 (GETREFV 2))
(defparameter d3 (GETREFV 3))

(setf (aref d3 1) (cons #'op0 d1))
(setf (aref d3 2) (cons #'op1 d3))
(setf (aref d1 1) (cons #'op2 d3))
(setf (aref d1 2) (cons #'op1 d1))
(setf (aref d2 0) (cons #'op2 d1))
(setf (aref d1 0) 0)

(defun do_it (n)
   (declare (type fixnum n))
   (dotimes (i n) (SPADCALL 0 (svref d2 0)))
   (aref d1 0))

                              Waldek Hebisch
hebisch at math.uni.wroc.pl 
