[Ecls-list] Consing statistics?

Dave Roberts dave at vyatta.com
Thu Dec 28 18:02:19 UTC 2006


Juan Jose Garcia-Ripoll wrote:
> 2006/12/24, Dave Roberts <dave at vyatta.com>:
>> I'm not sure I understand the difference? The other patch is only
>> "inaccurate" if the timed form allocates more than 4 GB of memory.
> 
> The current patch uses bignums to store the amount of reserved memory,
> if required. On every garbage collection it increments a counter which
> is a lisp integer adding the amount of bytes which were allocated
> between garbage collections.
> 
> Your patch is not safe if the function allocates more than 4GB within
> the TIME call, or if the C counter wraps around within the TIME
> routine.

Not quite. The only problem is if the timed function allocates more than 
4 GB (really the range of a size_t, which scales with the machine word 
size: 32-bit or 64-bit). The patch actually handles wrap-around of 
the C counter (from 2^32-1 back to 0). It just can't tell if it wrapped 
back past the starting point. Thus, if it sees a start of 4 and an end 
value of 5, it assumes 1 byte allocated, not 1+4GB. But if it sees a 
start of 4 and an ending value of 3, it assumes 4GB-1 byte.
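
To make the wrap-around arithmetic concrete, here is a minimal sketch. It is not the code from the patch: the function name is made up for illustration, and the counter is modelled as a fixed 32-bit unsigned value rather than a size_t so the 2^32 case is visible on any machine. Unsigned subtraction in C is modulo 2^32, which is what gives the behaviour described above.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: bytes allocated between two
   samples of a 32-bit allocation counter. Unsigned subtraction wraps
   modulo 2^32, so a single wrap of the counter is handled correctly.
   A wrap back past the starting point cannot be detected. */
uint32_t bytes_between(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

With this, bytes_between(4, 5) is 1 (even if the true figure was 1+4GB), and bytes_between(4, 3) is 2^32-1, i.e. 4GB-1 byte.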

>> Otherwise, it is accurate. Are you just keeping more than 32-bit
>> precision somewhere, modifying the local copy of the Boehm code?
> 
> No changes to the Boehm code yet. Before every garbage collection we
> inspect a variable (GC_words_allocd) that counts the number of words
> allocated since the last garbage collection. This is added to a lisp
> integer, which at some point may become a bignum.

Ah, gotcha. That would work. So the bytes allocated is the sum of the 
running bignum plus the current GC_words_allocd (scaled from words to 
bytes)? I guess you're relying 
on the C counter not overflowing between GCs, but if it's a size_t sized 
object, you'd have to allocate all of memory to overflow, which would 
surely trigger a GC. I'd just think through things on a 64-bit machine 
and make sure there isn't any strange overflow that could happen there.
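
A rough sketch of the scheme as I understand it, under several assumptions: the struct and function names are hypothetical, a uint64_t stands in for the Lisp integer (which in ECL could grow into a bignum), and a "word" is assumed to be the size of a pointer. The per-GC counter is passed in as a parameter instead of reading the collector's variable directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the Lisp-side running total. The real
   counter would be a Lisp integer; uint64_t is used only to keep this
   sketch self-contained. */
struct consing_stats {
    uint64_t bytes_before_last_gc;
};

/* Called just before each collection: fold the collector's
   "words allocated since the last GC" count into the running total.
   Assumes one word is sizeof(void *) bytes. */
void note_gc(struct consing_stats *s, size_t words_since_gc)
{
    s->bytes_before_last_gc += (uint64_t)words_since_gc * sizeof(void *);
}

/* Total so far: the running total plus whatever has been allocated
   since the most recent collection. */
uint64_t bytes_consed(const struct consing_stats *s, size_t words_since_gc)
{
    return s->bytes_before_last_gc
         + (uint64_t)words_since_gc * sizeof(void *);
}
```

The widening cast before the multiply is the part worth checking on a 64-bit machine: a size_t word count times the word size can overflow a size_t there, and in this sketch a uint64_t as well, which is one more reason to keep the total in a Lisp integer.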

> Since creating bignums itself "conses", the counter is reset when
> entering a timed piece of code, and deactivated by setting
> cl_object.bytes_consed = OBJNULL afterwards. This is, in my opinion,
> better than what SBCL does: updates a bignum on every GC.

It would be nice to leave this around all the time. While getting the 
statistic in TIME is useful, it can be useful at other times, too. 
Fiddling with bignums does cons, but it does so only a small amount, no, 
and this would only happen at every GC and not every allocation, right? 
IMO, that isn't a big efficiency hit. The GC itself is bumping the fixed 
C counters with every allocation. That's already the biggest hit per 
allocation. Everything else is noise. And if I'm doing consing just 
before a GC, it's because I was already doing a lot of consing. Unless 
you think it's particularly painful to leave this enabled all the time, 
I'd save yourself the trouble of the enable/disable code and leave the 
allocation tracking open to a potentially wider range of uses.

>> Is there any reason you removed the count of GCs, which I also find
>> helpful? The consing is helpful to determine if a form is generating a
>> lot of memory.
> 
> The reason I did not add a GC count is because I wanted to make sure
> this works first. Now it is possible to add other statistics to the GC
> entry point and I will add also a hook to the garbage collector for
> including a timer.

Sounds great.

-- Dave



