# [Ecls-list] Name of init function.

Waldek Hebisch hebisch at math.uni.wroc.pl
Sat May 31 00:37:25 UTC 2008

```Juan Jose Garcia-Ripoll wrote:
> On Sat, May 31, 2008 at 12:42 AM, Waldek Hebisch
> <hebisch at math.uni.wroc.pl> wrote:
> > You mean standard Lisp hash function?  This does not look like
> > good choice.  Namely, such hash function is used to implement
> > hash tables and hash tables still work even if there is a lot
> > of collisions.  But for init function we need unique names...
>
> The fact that the hash function we use is used for table lookups does
> not mean it is bad. The implementation is here:
> http://burtleburtle.net/bob/hash/doobs.html It has both a 32 bits and
> 64 bits variant. Now, that means one collision in every 2^32 cases or
> 2^64. If you add the build time, that should decrease the probability.
>

Well, this function looks good.  But note:

1) Probablity of collision (assuming uniform distribution) is of order
n^2/(2*m) where n is number of names we need and m is range of hash
function.  If n=10000 and m=2^32, then probability is bigger than
1/100, so in prolonged use we will almost surely get a collision.
IMHO this shows that 32 bit hash is definitly too short.
Now, 10000 object files is a lot, but reasonable for large programs,
I certainly would feel better using system that has no problem
handling 10^5 or 10^6 files.
2) Even with lower probability of failure if system has many users
_some_ users may hit the problem.  I feel that we should make
probability of failure significantly smaller than probability
of hardware failure.  Which means something like 10^(-15) or
10^(-18).  Given that square of number of names enters probability
I would say that 64 bits is still to small a number.  128 may
be enough.
3) If you have non-uniform distribution than probability of collision
is significantly higher.  So I am affraid that 108 bits with
current (improved) hash function is still too little.

--
Waldek Hebisch
hebisch at math.uni.wroc.pl

```