[Ecls-list] computing results into preallocated bignums

Nils Bruin nbruin at cecm.sfu.ca
Wed Sep 9 15:57:39 UTC 2009


OK, I redid the timings for the fibo example, without the time related 
commands in "test" and using the time macro and results are much more 
consistent. All tests were in a fresh ecl session: I only pasted in the 
required "defun"s and the "compile"s when required. Results seem quite 
stable now. I did 3 consecutive runs in each of the 4 sessions. Times are:

Register code, uncompiled: 15.9 sec
Register code, compiled: 9.1 sec

Prealloc code, uncompiled: 14.8 sec
Prealloc code, compiled: 8.8 sec

So the gain is something like 6% or 3% for this very addition intensive 
example. Quite a bit smaller than I expected, but in line with what I 
measured initially. I find it odd that bytecompiled seems to benefit more 
than c compiled. You are not optimizing any bignum computation in 
"compile" are you?

Some remarks:
  - this example computed the 4000(+-1)th fibonacci number, which has about 
836 digits, or about 2777 bits. On 64 bit machine, I think this does not 
quite trigger register reallocation upon freeing yet (that would happen 
when the register hits 32*3*64=6144 bits). Indeed "cons"-ing seems to be 
on par. Is that what's measured by "cons"ed?

That means that the penalty we're measuring here is solely from the 
memcpys and that the difference in computation time with ridiculously 
large integers could be bigger. That doesn't sound like a scenario to 
optimize for, though, so the present benchmark is perhaps a fair one.

  - Juanjo made a good point: You don't want to do this if the result is 
likely to fit in a fixnum. Hence, the right thing to do is probably to 
keep the register code in (fixnum,fixnum) operations (although the spare 
bits mean that for addition, that one can be done in a long anyway!). I'd 
think that "cancellation" is relatively rare, so when a bignum comes into 
play, I'd think you're making a good guess by assuming that the answer 
will be a bignum again. Any heuristics on the percentage of multiprecision 
operations that end up producing a small integer in "real life"?

  - backwards compatibility: should be good. You could just copy MPIR's 
macro definitions and use those if the macros are not defined. The 
length assumptions are currently documented in the mpz_array_init 
documentation of GMP 4.3.1 (which is binary compatible with 4.x and 3.x)

  - memory usage: the present code may allocate too much memory for a 
bignum (in fact, almost always will), especially when there is much 
cancellation (bit still a bignum result). It may be possible to let the 
memory manager "shrink" an allocated block (i.e., free the last few bytes 
of a block) relatively cheaply.

Full results:

-----------------------------------------
(defun fibo (n)
     (let ((a 1)(b 1)(c 0))
     (loop for i from 1 to n do
       (setq c a)
       (setq a (+ a b))
       (setq b c))
     b))

(defun test (n m)
      (loop for i from 1 to m do
        (fibo n)))
------------------------------------------

//uncompiled, old code
> (time (test 4000 4000))

real time : 15.892 secs
run time  : 15.882 secs
gc count  : 1380 times
consed    : 2277370957856 bytes
NIL
> (time (test 4000 4000))

real time : 15.920 secs
run time  : 15.908 secs
gc count  : 1380 times
consed    : 6802985308016 bytes
NIL
> (time (test 4000 4000))

real time : 15.946 secs
run time  : 15.937 secs
gc count  : 1378 times
consed    : 11312546734160 bytes
NIL

//compiled, old code
> (time (test 4000 4000))

real time : 9.157 secs
run time  : 9.144 secs
gc count  : 896 times
consed    : 1483250487536 bytes
NIL
> (time (test 4000 4000))

real time : 9.149 secs
run time  : 9.134 secs
gc count  : 896 times
consed    : 4420817410864 bytes
NIL
> (time (test 4000 4000))

real time : 9.154 secs
run time  : 9.143 secs
gc count  : 896 times
consed    : 7358387430192 bytes
NIL
>
7931560345736



//uncompiled, new code
> (time (test 4000 4000))

real time : 14.870 secs
run time  : 14.848 secs
gc count  : 1432 times
consed    : 2452083663360 bytes
NIL
> (time (test 4000 4000))

real time : 14.856 secs
run time  : 14.838 secs
gc count  : 1432 times
consed    : 7323742296272 bytes
NIL
> (time (test 4000 4000))

real time : 14.847 secs
run time  : 14.834 secs
gc count  : 1429 times
consed    : 12169306108296 bytes
NIL

//compiled, new code

> (time (test 4000 4000))

real time : 8.827 secs
run time  : 8.811 secs
gc count  : 930 times
consed    : 1596897759552 bytes
NIL
> (time (test 4000 4000))

real time : 8.824 secs
run time  : 8.811 secs
gc count  : 930 times
consed    : 4759916280672 bytes
NIL
> (time (test 4000 4000))

real time : 8.805 secs
run time  : 8.791 secs
gc count  : 931 times
consed    : 7931560345736 bytes
NIL





More information about the ecl-devel mailing list