[Ecls-list] Looking for ideas to help reduce initialization time in cl_boot()

Thu Oct 29 21:34:10 UTC 2009

2009/10/29 Ram Krishnan <kriyative at gmail.com>:
> I've been using ECL for an embedded application, and it's been working
> very well (thanks to the developers/maintainers for an excellent
> embeddable Lisp system).

Thanks, it is always great to read from a satisfied user!

> However, the initial start up time in cl_boot()
> is quite high (as much as 8-10s, which is high for my deployment
> environment), and I'm looking for suggestions to help reduce this.

Understood. Before commenting on the rest of the email, let me explain
how ECL works with respect to compilation and deployment.

ECL, just like any other Lisp environment, compiles to binary files
that have two parts: a set of static data and functions, and then a
list of statements that have to be executed when the file is loaded.
It is crucial to understand this: unlike in C, C++, Fortran, etc., the
implementation does not know the final state of the environment until
those statements have been executed, so we cannot save time by
precomputing all those DEFUNs, DEFVARs, etc., because there will be
statements in between that will change the outcome of those
DEF-statements themselves. Some might not even get executed at all!
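
To make the point concrete, here is a toy model written in Python for
brevity. It is only a sketch and not ECL's file format: a "compiled
file" is assumed to be an ordered list of top-level statements, and
every name in it (load, set_debug, define_log, maybe_define_extra) is
made up for illustration. What a definition installs depends on the
statements that ran before it.

```python
# Toy model of a "compiled file": an ordered list of top-level statements.
# All names are illustrative; this is not how ECL represents compiled code.

def load(statements):
    """Execute the statements in order and return the resulting environment."""
    env = {}
    for statement in statements:
        statement(env)
    return env

def set_debug(env):
    # An ordinary statement sitting between definitions.
    env["*debug*"] = True

def define_log(env):
    # Which definition is installed depends on state set by earlier statements.
    if env.get("*debug*"):
        env["log"] = lambda msg: "DEBUG: " + msg
    else:
        env["log"] = lambda msg: ""

def maybe_define_extra(env):
    # This definition may not be executed at all.
    if not env.get("*debug*"):
        env["extra"] = lambda: "only without debug"

with_debug = load([set_debug, define_log, maybe_define_extra])
without_debug = load([define_log, maybe_define_extra])
```

The same definitions yield two different environments, so the final
state can only be known by actually running the statements.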

What most implementations do is to load the compiled files once and
then let the user dump a memory image where the side effects
(definition of functions, values of constants, definition of
variables, etc.) are stored and made permanent for later use. Launching
a Lisp with one of those memory images may be faster or slower
depending on the format (platform independent or not, raw, relocatable
or not, etc.).
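
The load-once-then-dump idea can be sketched as follows, again in
Python and under loud assumptions: the environment is a plain
dictionary of data, Python's pickle module stands in for the image
format, and slow_boot, dump_image, load_image and boot are
hypothetical names. Real memory images work at a much lower level.

```python
import os
import pickle
import tempfile

def slow_boot():
    """Stand-in for executing all load-time statements (hypothetical)."""
    env = {}
    env["*features*"] = ["toy", "sketch"]
    env["+table+"] = {n: n * n for n in range(1000)}
    return env

def dump_image(env, path):
    """Store the side effects of booting so they can be reused later."""
    with open(path, "wb") as f:
        pickle.dump(env, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_image(path):
    """Restore a previously dumped environment."""
    with open(path, "rb") as f:
        return pickle.load(f)

def boot(path):
    """Use the image when it exists; otherwise boot normally and dump."""
    if os.path.exists(path):
        return load_image(path)
    env = slow_boot()
    dump_image(env, path)
    return env

with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "toy.image")
    first = boot(image)    # runs slow_boot() and writes the image
    second = boot(image)   # restored from the image, no statements re-run
    assert first == second
```

Whether the second path is actually faster depends on the image
format, exactly as noted above.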

ECL currently does not provide that. It allows you to link the
compiled files together, let them be executed once at startup and then
do whatever you want with the resulting environment.

> Running under a profiler also confirmed that the bulk of the time is in
> the call to `init_lib_LSP'.

This is interesting. init_lib_LSP is NOT part of the application you
deploy, but of ECL itself. I say that this is interesting because
under normal conditions initializing the Lisp library takes just 0.4s
or less. On my Intel Mac:

$ time ecl -eval '(quit)' -norc

real	0m0.148s
user	0m0.109s
sys	0m0.027s

> The obvious approach I thought of was to try
> and dump a Lisp image, which could be directly loaded on startup,
> instead of invoking the initialization functions. Portability isn't a
> concern for this application, so I'm willing to accept that. I haven't
> dug into the guts of ECL yet, but I suspect there is more to this than
> simply finding the root of the heap, and writing out all the live
> objects to a file.

I wrote an externalizer for ECL some time ago. The idea is to take a
set of lisp data and create a binary representation for it. Then one
provides functions that reconstruct the data quickly, including any
circular references.
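
As an illustration of the general technique (not the code in the
"externalize" branch), the sketch below flattens a graph of cons cells
into bytes and rebuilds it in two passes: first allocate every cell,
then patch the slots. The Cons class, the tag values and the byte
layout are all assumptions chosen for the example, and Python is used
only to keep it short.

```python
import struct

class Cons:
    """A minimal cons cell; stands in for arbitrary Lisp data."""
    def __init__(self, car=None, cdr=None):
        self.car = car
        self.cdr = cdr

NIL, INT, REF = 0, 1, 2         # slot tags (made up for this sketch)
SLOT = struct.Struct("<Bq")     # tag byte + 64-bit payload, no padding
COUNT = struct.Struct("<I")     # number of cells in the dump

def externalize(root):
    """Flatten the cells reachable from root into a byte string."""
    index, cells, stack = {}, [], [root]
    while stack:
        obj = stack.pop()
        if isinstance(obj, Cons) and id(obj) not in index:
            index[id(obj)] = len(cells)   # each cell is numbered once
            cells.append(obj)
            stack.extend((obj.cdr, obj.car))

    def slot(obj):
        if obj is None:
            return SLOT.pack(NIL, 0)
        if isinstance(obj, Cons):
            return SLOT.pack(REF, index[id(obj)])
        return SLOT.pack(INT, obj)        # only integers are supported here

    parts = [COUNT.pack(len(cells)), slot(root)]
    for cell in cells:
        parts += [slot(cell.car), slot(cell.cdr)]
    return b"".join(parts)

def internalize(data):
    """Rebuild the graph: allocate all cells first, then fill the slots."""
    (count,) = COUNT.unpack_from(data, 0)
    cells = [Cons() for _ in range(count)]        # pass 1: allocation

    def value(offset):
        tag, payload = SLOT.unpack_from(data, offset)
        if tag == NIL:
            return None
        return cells[payload] if tag == REF else payload

    offset = COUNT.size
    root = value(offset)
    offset += SLOT.size
    for cell in cells:                            # pass 2: patching
        cell.car = value(offset)
        cell.cdr = value(offset + SLOT.size)
        offset += 2 * SLOT.size
    return root

a = Cons(1)
b = Cons(2, a)
a.cdr = b                        # a -> b -> a, a circular list
copy = internalize(externalize(a))
assert copy.cdr.cdr is copy      # the cycle survives the round trip
```

Because references are stored as indices into the cell table, cycles
and shared structure need no special handling on the way back in.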

The initial motivation was less ambitious than providing full memory
images, but it was a first step. That smaller goal consisted of
replacing the current printed representation of compiler constants
with a binary one, in order to speed up boot time -- I naïvely thought
that parsing text was slower than reconstructing binary data, but I
was somewhat wrong: boot time is to a large extent consumed in creating
data (that is, allocating it) and executing statements.

I think that the code still lives in CVS, but somehow it was not
exported to Git properly. You can check it out under the branch name
"externalize":
http://ecls.cvs.sourceforge.net/viewvc/ecls/ecl/?pathrev=externalize

It would not take much to do the same thing on a larger scale, that
is, dumping all of ECL's data. There is only one difficulty that my
previous implementation did not address, though, namely externalizing
C functions. How do you store the objects representing functions that
have been compiled to C? That involves keeping track of the function
pointers and then reconstructing them. That requires some
bookkeeping, which is probably not hard to do.
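
One possible shape for that bookkeeping, sketched in Python rather
than C: since a raw function address is not meaningful in a later
run, each compiled function is registered under a stable key, the
dump stores the key, and loading looks the key up again. The
registry, the key strings and the helper names below are hypothetical
and do not describe anything that exists in ECL.

```python
# Hypothetical bookkeeping for compiled functions. A stable key is
# stored in the dump in place of the function pointer itself.

REGISTRY = {}   # key -> function, rebuilt every time the program starts
KEY_OF = {}     # function -> key, used while dumping

def register(key):
    """Record a compiled function under a key that survives restarts."""
    def wrap(fn):
        REGISTRY[key] = fn
        KEY_OF[fn] = key
        return fn
    return wrap

@register("lsp:car")
def car(pair):
    return pair[0]

@register("lsp:cdr")
def cdr(pair):
    return pair[1]

def externalize_env(env):
    """Replace each function object by its registry key."""
    return {symbol: KEY_OF[fn] for symbol, fn in env.items()}

def internalize_env(dumped):
    """Turn keys back into the functions of the running program."""
    return {symbol: REGISTRY[key] for symbol, key in dumped.items()}

env = {"CAR": car, "CDR": cdr}
dumped = externalize_env(env)      # contains only strings
restored = internalize_env(dumped)
assert restored["CDR"]((1, 2)) == 2
```

This only works when the dump is loaded by the same program that
produced it, since the keys must resolve to the same functions.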

The other minor detail is that one must NOT change the way ECL
compiles. The externalization must be just an add-on: you execute a
program and get a working environment, and then you have the option of
dumping the memory content so that you can later execute the _same_
program with the binary data, thus speeding up the boot time.

Juanjo

-- 
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com