[armedbear-devel] Performance of eq hash tables
evenson at panix.com
Thu May 23 05:34:26 UTC 2013
Alan Ruttenberg <alanruttenberg at gmail.com> writes:
> Has anyone looked at this? I'm traversing a large structure and using
> an eq (tried eql too) hash table to record where I've been so that I
> don't get caught in circularities. There are about 1.5M nodes, but
> what I see is that adding entries to the hash table slows down (a
> lot) over time, even when I give an initial size of 4,000,000. That
> would be max occupancy of 30%, lower than anyone would typically
> What I see is that it rapidly starts to fill. At 300k entries I'm
> getting about 2.5K new entries per second. By the time I get to 600k
> entries it is down to 750 new entries/sec. At 900k (where it stands
> now) it's about 500 entries/sec.
> That makes me suspect that the hash isn't distributing well over the
> size of the underlying vector and that the slowdown is due to buckets
> getting long.
> I'm not sure how to probe this - any ideas?
> Or to fix it. I suppose I could look for an alternative hash
> implementation, but I thought I would check first with the list to
> see if anyone else can shed some light.
I haven't actively pushed our CL:HASH-TABLE to such limits, but would be
very interested in including alternative implementations in ABCL if that
solves your problem. [WeakHashTable] provides an example of an alternate
hashtable structure that "looks" like a standard CL:HASH-TABLE to existing
ABCL's [Java implementation of CL:HASH-TABLE][HashTable.java] uses a
pre-allocated array of HashEntry objects for the underlying storage
buckets, so unless there is some sort of memory pressure from GC, I am
surprised by this behavior. Standard questions for your side: Can you
comfortably fit 10^6 of your records in memory? Are you copying the
structure when adding to the HASH-TABLE, or just inserting a reference? I
might start investigating by throwing the "-Xloggc: log GC status to a
file with time stamps" switch to see if the GC "pulses" get longer as you
get deeper into your transverse.
A [ticket in the Trac] is always an easy place to stick such a problem,
so it doesn't get lost (login with OpenID if you need to).
Mark Evenson <evenson at panix.com>
"A screaming comes across the sky. It has happened before, but there is
nothing to compare to it now."
More information about the armedbear-devel