[Asdf-devel] startup times and initialize-source-registry

Faré fahree at gmail.com
Thu Aug 21 00:36:12 UTC 2014


At ILC 2014, one discussed show-stopper for using CL as a scripting
language was startup time. Indeed, right now, when used as a script
rather than as a dumped image, CL takes a lot of time to start:

time ( sbcl --noinform --eval '(require :asdf)' --eval '(progn
(asdf:initialize-source-registry) (uiop:writeln (hash-table-count
asdf::*source-registry*)) (uiop:quit))' )
711
( sbcl --noinform --eval '(require :asdf)' --eval ; )  0.66s user
0.17s system 99% cpu 0.832 total

time cl '(hash-table-count asdf::*source-registry*)'
711
cl '(hash-table-count asdf::*source-registry*)'  1.20s user 0.25s
system 99% cpu 1.456 total

That's because initialize-source-registry recursively walks all the
directories under the registered source-registry trees, and there can
be a lot of them. Ben Hyde tells me it's much worse on his machine.

The two following slightly incompatible changes divide that startup
time by about three, and promise to divide it further if people adhere
to some discipline.

  (defun collect-sub*directories (directory collectp recursep collector)
    "Given a DIRECTORY, call-function the COLLECTOR function designator
on the directory if COLLECTP returns true when CALL-FUNCTION'ed with
the directory, and in that case also recurse into each of its
subdirectories on which RECURSEP returns true when CALL-FUNCTION'ed
with them."
    (when (call-function collectp directory)
      (call-function collector directory)
      (dolist (subdir (subdirectories directory))
        (when (call-function recursep subdir)
          (collect-sub*directories subdir collectp recursep collector)))))

This nests the dolist into the when, which is backward compatible as
far as uiop and asdf internal usage is concerned, but not as far as
other users might be concerned; however, the collectp function is a
bit redundant and useless without this nesting.
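
To make the difference concrete, here is a minimal sketch on a toy
in-memory tree rather than the filesystem. The names walk-old,
walk-new, visited, *tree* and not-a-p are made up for illustration and
are not part of uiop; walk-old assumes the previous shape, with the
dolist outside the when.

```lisp
;; Toy model: a "directory" is a list (NAME . CHILDREN); no filesystem access.
(defun walk-old (node collectp recursep collector)
  "Assumed pre-change shape: the DOLIST runs whether or not COLLECTP accepted NODE."
  (when (funcall collectp node)
    (funcall collector node))
  (dolist (child (rest node))
    (when (funcall recursep child)
      (walk-old child collectp recursep collector))))

(defun walk-new (node collectp recursep collector)
  "Post-change shape: a false COLLECTP also prunes everything below NODE."
  (when (funcall collectp node)
    (funcall collector node)
    (dolist (child (rest node))
      (when (funcall recursep child)
        (walk-new child collectp recursep collector)))))

(defun visited (walker tree collectp)
  "Return the names collected by WALKER over TREE, in visiting order."
  (let ((names '()))
    (funcall walker tree collectp (constantly t)
             (lambda (node) (push (first node) names)))
    (nreverse names)))

(defparameter *tree*
  '("root" ("a" ("a1") ("a2")) ("b" ("b1"))))

(defun not-a-p (node)
  "Hypothetical COLLECTP that rejects the node named \"a\"."
  (not (equal (first node) "a")))
```

With these definitions, (visited #'walk-old *tree* #'not-a-p) returns
("root" "a1" "a2" "b" "b1"), whereas (visited #'walk-new *tree*
#'not-a-p) returns ("root" "b" "b1"): rejecting "a" now also skips
everything under it, which is what lets collectp cut the search short.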

  (defun collect-sub*directories-asd-files
      (directory &key (exclude *default-source-registry-exclusions*)
                   collect (stop-at-asd t))
    (collect-sub*directories
     directory
     #'(lambda (dir)
         (let ((asds (directory-asd-files dir)))
           (map () collect asds)
           (not (and asds stop-at-asd))))
     #'(lambda (x)
         (not (member (car (last (pathname-directory x))) exclude
                      :test #'equal)))
     (constantly nil)))

The trick here is in this new stop-at-asd flag, which here defaults to
t and isn't configurable, but which should default to nil and be
configurable, for backward compatibility. Its effect is that recursion
stops at any directory that itself contains a .asd file: its
subdirectories are not searched. This saves a lot of recursing, and
would save even more if a .asd file or a symlink to one existed at the
top of each git hierarchy. But this is incompatible with a lot of
existing code, and so the transition will be long and painful if this
is adopted.
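
As a rough illustration of the flag with the backward-compatible
default, here is a sketch on a toy in-memory tree. It is not ASDF
code: *stop-at-asd*, collect-asds and *checkout* are hypothetical
names, and a real version would read the setting from the
source-registry configuration instead of a special variable.

```lisp
;; Toy model: a "directory" is (NAME ASD-FILES . CHILDREN); no filesystem access.
(defvar *stop-at-asd* nil
  "Hypothetical switch; NIL keeps the old behavior of searching every subdirectory.")

(defun collect-asds (node &key (stop-at-asd *stop-at-asd*))
  "Return the .asd names found under NODE, in visiting order.
When STOP-AT-ASD is true, do not descend below a node that has .asd files."
  (destructuring-bind (name asds &rest children) node
    (declare (ignore name))
    (append asds
            (unless (and asds stop-at-asd)
              (loop for child in children
                    append (collect-asds child :stop-at-asd stop-at-asd))))))

(defparameter *checkout*
  '("src" ()
    ("foo" ("foo.asd") ("test" ("foo-test.asd")))
    ("bar" () ("core" ("bar.asd")))))
```

Here (collect-asds *checkout*) returns ("foo.asd" "foo-test.asd"
"bar.asd"), while (collect-asds *checkout* :stop-at-asd t) returns
("foo.asd" "bar.asd"): the nested foo-test.asd is no longer found,
which is the kind of loss that shows up as fewer registry entries once
the flag is on.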

With these changes, I get:

time ( sbcl --noinform --eval '(require :asdf)' --eval '(progn
(asdf:initialize-source-registry) (uiop:writeln (hash-table-count
asdf::*source-registry*)) (uiop:quit))' )
534
( sbcl --noinform --eval '(require :asdf)' --eval ; )  0.24s user
0.05s system 99% cpu 0.293 total

time cl '(hash-table-count asdf::*source-registry*)'
534
cl '(hash-table-count asdf::*source-registry*)'  0.54s user 0.13s
system 99% cpu 0.665 total

That's much better timewise (roughly a 2x to 3x speedup), but it's
obviously missing a lot of .asd files: 534 entries instead of 711. To
recover them, I had to run, in zsh:

for i in */ ; do
  ( cd $i ; setopt NULL_GLOB ; A=( */**/*.asd ) ; echo $i $#A ;
    if [ $#A -gt 0 ] ; then ln -s $A . ; fi )
done |& tee /tmp/a

Then I get:

time ( sbcl --noinform --eval '(require :asdf)' --eval '(progn
(asdf:initialize-source-registry) (uiop:writeln (hash-table-count
asdf::*source-registry*)) (uiop:quit))' )
711
( sbcl --noinform --eval '(require :asdf)' --eval ; )  0.24s user
0.09s system 99% cpu 0.335 total

time cl '(hash-table-count asdf::*source-registry*)'
711
cl '(hash-table-count asdf::*source-registry*)'  0.50s user 0.07s
system 100% cpu 0.567 total

And... oops, I realize I failed to do the scripting at the SLIME REPL.
Mea culpa.

Of course, if you want instant startup without any search, you can
eschew ASDF2 style autoconfiguration, and go the sysadmin way of
ASDF1. I still think there is value in combining autoconfiguration
with somewhat faster startup time than we have now.

Of course, implementing such a plan with a multi-year
backward-compatible migration strategy is the prerogative of the
current and future maintainers, if they wish to undertake it: it would
take implementing (and testing) the code, but disabling it by default,
enabled with suitable (backward-incompatible) flags in configuration
files. Then, while waiting for all implementations to eventually adopt
the new release, pushing users to reform the way they lay out their
directories (and maybe adopt package-inferred-system, while we're at
it; it could help systems that have a lot of one-file subsystems).

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
As for poverty, no one need be ashamed to admit it: the real shame is
in not taking practical measures to escape from it. — Perikles



