[Ecls-list] computing results into preallocated bignums
Nils Bruin
nbruin at cecm.sfu.ca
Wed Sep 9 15:57:39 UTC 2009
OK, I redid the timings for the fibo example, without the time related
commands in "test" and using the time macro and results are much more
consistent. All tests were in a fresh ecl session: I only pasted in the
required "defun"s and the "compile"s when required. Results seem quite
stable now. I did 3 consecutive runs in each of the 4 sessions. Times are:
Register code, uncompiled: 15.9 sec
Register code, compiled: 9.1 sec
Prealloc code, uncompiled: 14.8 sec
Prealloc code, compiled: 8.8 sec
So the gain is something like 6% or 3% for this very addition intensive
example. Quite a bit smaller than I expected, but in line with what I
measured initially. I find it odd that bytecompiled seems to benefit more
than c compiled. You are not optimizing any bignum computation in
"compile" are you?
Some remarks:
- this example computed the 4000(+-1)th fibonacci number, which has about
836 digits, or about 2777 bits. On 64 bit machine, I think this does not
quite trigger register reallocation upon freeing yet (that would happen
when the register hits 32*3*64=6144 bits). Indeed "cons"-ing seems to be
on par. Is that what's measured by "cons"ed?
That means that the penalty we're measuring here is solely from the
memcpys and that the difference in computation time with ridiculously
large integers could be bigger. That doesn't sound like a scenario to
optimize for, though, so the present benchmark is perhaps a fair one.
- Juanjo made a good point: You don't want to do this if the result is
likely to fit in a fixnum. Hence, the right thing to do is probably to
keep the register code in (fixnum,fixnum) operations (although the spare
bits mean that for addition, that one can be done in a long anyway!). I'd
think that "cancellation" is relatively rare, so when a bignum comes into
play, I'd think you're making a good guess by assuming that the answer
will be a bignum again. Any heuristics on the percentage of multiprecision
operations that end up producing a small integer in "real life"?
- backwards compatibility: should be good. You could just copy MPIR's
macro definitions and use those if the macros are not defined. The
length assumptions are currently documented in the mpz_array_init
documentation of GMP 4.3.1 (which is binary compatible with 4.x and 3.x)
- memory usage: the present code may allocate too much memory for a
bignum (in fact, almost always will), especially when there is much
cancellation (bit still a bignum result). It may be possible to let the
memory manager "shrink" an allocated block (i.e., free the last few bytes
of a block) relatively cheaply.
Full results:
-----------------------------------------
(defun fibo (n)
(let ((a 1)(b 1)(c 0))
(loop for i from 1 to n do
(setq c a)
(setq a (+ a b))
(setq b c))
b))
(defun test (n m)
(loop for i from 1 to m do
(fibo n)))
------------------------------------------
//uncompiled, old code
> (time (test 4000 4000))
real time : 15.892 secs
run time : 15.882 secs
gc count : 1380 times
consed : 2277370957856 bytes
NIL
> (time (test 4000 4000))
real time : 15.920 secs
run time : 15.908 secs
gc count : 1380 times
consed : 6802985308016 bytes
NIL
> (time (test 4000 4000))
real time : 15.946 secs
run time : 15.937 secs
gc count : 1378 times
consed : 11312546734160 bytes
NIL
//compiled, old code
> (time (test 4000 4000))
real time : 9.157 secs
run time : 9.144 secs
gc count : 896 times
consed : 1483250487536 bytes
NIL
> (time (test 4000 4000))
real time : 9.149 secs
run time : 9.134 secs
gc count : 896 times
consed : 4420817410864 bytes
NIL
> (time (test 4000 4000))
real time : 9.154 secs
run time : 9.143 secs
gc count : 896 times
consed : 7358387430192 bytes
NIL
>
7931560345736
//uncompiled, new code
> (time (test 4000 4000))
real time : 14.870 secs
run time : 14.848 secs
gc count : 1432 times
consed : 2452083663360 bytes
NIL
> (time (test 4000 4000))
real time : 14.856 secs
run time : 14.838 secs
gc count : 1432 times
consed : 7323742296272 bytes
NIL
> (time (test 4000 4000))
real time : 14.847 secs
run time : 14.834 secs
gc count : 1429 times
consed : 12169306108296 bytes
NIL
//compiled, new code
> (time (test 4000 4000))
real time : 8.827 secs
run time : 8.811 secs
gc count : 930 times
consed : 1596897759552 bytes
NIL
> (time (test 4000 4000))
real time : 8.824 secs
run time : 8.811 secs
gc count : 930 times
consed : 4759916280672 bytes
NIL
> (time (test 4000 4000))
real time : 8.805 secs
run time : 8.791 secs
gc count : 931 times
consed : 7931560345736 bytes
NIL
More information about the ecl-devel
mailing list