[Ecls-list] About name collisions

William Robinson airbaggins at gmail.com
Fri Feb 1 01:10:48 UTC 2008


Juan Jose Garcia-Ripoll wrote:
> Hi,
>
> I have just finished (but not yet committed) new code to change the
> names of initialization functions in compiled code.
>
> Until now we used init_**** where **** contains the file name of the
> source. I propose to use
>   (format nil "_ecl~36Rs~36Rt~36R" (get-universal-time) (incf
> *counter*) (si::getpid))
> where *counter* is an internal counter, and we use the universal time
> and the process id to differentiate the binary.
>
> Pros: simple, works for any source tree organization and any number of
> compilations on the same machine; it will still work when the compiler
> becomes thread-safe and reentrant.
>
> Cons: there may be collisions if the computer clock is wrong, or if
> one tries to combine binaries built on two different machines.
>
> I would like to hear your opinion, but keep in mind that function
> names cannot be arbitrarily large, so hostnames, threads, etc,
>
> Juanjo
>
>   
Well, I'm not really an expert on this. As far as I can make out from
the C99 spec, at least, one can safely rely on 63 significant characters
for internal identifiers and 31 for external ones.
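
Those limits are easy to check mechanically. A minimal sketch, assuming
the conservative figure of 31 for external names; the helper names below
are hypothetical and not part of ECL:

```lisp
;; Hypothetical helpers, not part of ECL.
(defparameter *c99-external-limit* 31
  "Significant characters C99 guarantees for external identifiers.")

(defun c-ident-char-p (ch)
  "True if CH is an ASCII letter, an ASCII digit or an underscore."
  (or (char= ch #\_)
      (and (alphanumericp ch) (< (char-code ch) 128))))

(defun plausible-c-external-name-p (name)
  "True if NAME is a well-formed C identifier within the external limit.
Longer names are still legal C; they are just not guaranteed to stay
distinct beyond the first 31 characters."
  (and (plusp (length name))
       (<= (length name) *c99-external-limit*)
       (not (digit-char-p (char name 0)))   ; must not start with a digit
       (every #'c-ident-char-p name)))
```

With this, (plausible-c-external-name-p "init_foo") is true, while a
name starting with a digit or running past 31 characters is rejected.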

My instinct would be to use a hash (md5, or something simpler) of the
system-name, module-path and filename. But I don't know whether this
init_ function needs a stable name anyway. Are the .a files of libraries
always recompiled each time a compile is done? Would a .o file be
effectively invalidated every time you start a build, because you would
lose the handle on its init_ function?
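
To make the idea concrete, here is a rough sketch of a name derived only
from the three strings, so it stays the same across rebuilds. It is an
illustration, not a proposal for the actual ECL code: it swaps in 64-bit
FNV-1a (applied to character codes rather than bytes) as the "something
simpler" in place of md5, and the function names, the "|" separator and
the example paths are all assumptions of mine.

```lisp
;; Sketch only: FNV-1a stands in for md5, and all names are hypothetical.
(defun fnv1a-64 (string)
  "64-bit FNV-1a computed over the character codes of STRING."
  (let ((hash #xcbf29ce484222325))            ; FNV-1a 64-bit offset basis
    (loop for ch across string
          do (setf hash (ldb (byte 64 0)      ; keep the low 64 bits
                             (* (logxor hash (char-code ch))
                                #x100000001b3)))) ; FNV 64-bit prime
    hash))

(defun stable-init-name (system-name module-path file-name)
  "Return an init_ name that depends only on the three strings."
  (format nil "init_~(~36R~)"                 ; lowercase base-36 suffix
          (fnv1a-64 (format nil "~A|~A|~A"
                            system-name module-path file-name))))
```

A 64-bit value needs at most 13 base-36 digits, so the result is at most
18 characters long and fits within the 31-character limit. The price is
that a 64-bit hash is far more collision-prone than md5, so this is only
reasonable if the number of files is small.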

I was playing with an (effectively base-63) encoder that uses only C
identifier characters (usable only as a suffix, as the result may start
with a digit). A 16-byte integer (md5-length) comes out as a
22-character string.

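(defparameter +c-ident-chars+
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_"
  "The 63 characters allowed in a C identifier; the index is the digit value.")

(defun encode-c-ident (i)
  "Encode the non-negative integer I as a string of C identifier characters."
  (if (zerop i)
      "0"
      ;; Digits are produced least significant first, hence the NREVERSE.
      (nreverse
       (with-output-to-string (out)
         (loop while (plusp i)
               do (multiple-value-bind (next val)
                      (floor i (length +c-ident-chars+))
                    (write-char (aref +c-ident-chars+ val) out)
                    (setf i next)))))))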

CL-USER> (encode-c-ident (expt 2 (* 8 16)))
"5zMPpBRJYpUbb3SX3lFml4"
CL-USER> (length (encode-c-ident (expt 2 (* 8 16))))
22
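
The 22 is what the arithmetic predicts: it is the smallest d such that
63^d >= 2^128. A small sketch that computes this for other sizes and
bases, using integer arithmetic only (the function name is made up for
illustration):

```lisp
;; Hypothetical helper: how many base-BASE digits does an NBYTES-byte
;; unsigned integer need in the worst case?
(defun c-ident-digits (nbytes &optional (base 63))
  "Smallest digit count D such that BASE^D >= 2^(8*NBYTES)."
  (loop with limit = (expt 2 (* 8 nbytes))
        for digits from 1
        for capacity = base then (* capacity base)
        when (>= capacity limit) return digits))
```

(c-ident-digits 16) gives 22, against 25 digits in base 36 and 32 in
hexadecimal, so the saving over base 36 is three characters.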

Well, maybe that function would be useful anyway, for keeping 
identifiers small.

Ciao.
Bill
:)
