[armedbear-devel] Performance of eq hash tables

Mark Evenson evenson at panix.com
Thu May 23 05:34:26 UTC 2013

Alan Ruttenberg <alanruttenberg at gmail.com> writes:

> Hi,
> Has anyone looked at this? I'm traversing a large structure and using
> an eq (tried eql too) hash table to record where I've been so that I
> don't get caught in circularities. There are about 1.5M nodes, but
> what I see is that adding entries to the hash table slows down (a
> lot) over time, even when I give an initial size of 4,000,000. That
> would be max occupancy of 30%, lower than anyone would typically
> rehash.
> What I see is that it rapidly starts to fill. At 300k entries I'm
> getting about 2.5K new entries per second. By the time I get to 600k
> entries it is down to 750 new entries/sec. At 900k (where it stands
> now) it's about 500 entries/sec.
> That makes me suspect that the hash isn't distributing well over the
> size of the underlying vector and that the slowdown is due to buckets
> getting long. 
> I'm not sure how to probe this - any ideas? 
> Or to fix it. I suppose I could look for an alternative hash
> implementation, but I thought I would check first with the list to
> see if anyone else can shed some light.

I haven't actively pushed our CL:HASH-TABLE to such limits, but would be
very interested in including alternative implementations in ABCL if that
solves your problem.  [WeakHashTable][] provides an example of an alternate
hashtable structure that "looks" like a standard CL:HASH-TABLE to existing

[WeakHashTable]: http://lisp.not.org/trac/armedbear/browser/trunk/abcl/src/org/armedbear/lisp/WeakHashTable.java

ABCL's [Java implementation of CL:HASH-TABLE][HashTable.java] uses a
pre-allocated array of HashEntry objects for the underlying storage
buckets, so unless there is some sort of memory pressure from GC, I am
surprised by this behavior.  Standard questions for your side: Can you
comfortably fit 10^6 of your records in memory?  Are you copying the
structure when adding to the HASH-TABLE, or just inserting a reference?  I
might start investigating by throwing the "-Xloggc: log GC status to a
file with time stamps" switch to see if the GC "pulses" get longer as you
get deeper into your transverse.

[HashTable.java] http://lisp.not.org/trac/armedbear/browser/trunk/abcl/src/org/armedbear/lisp/HashTable.java

A [ticket in the Trac][1] is always an easy place to stick such a problem,
so it doesn't get lost (login with OpenID if you need to).

[1]: http://lisp.not.org/trac/armedbear/report/1

Mark Evenson <evenson at panix.com>

"A screaming comes across the sky.  It has happened before, but there is
nothing to compare to it now."

More information about the armedbear-devel mailing list