[asdf-devel] Faster loading for deployed systems?

Tue Apr 24 09:38:35 UTC 2012

I thought I'd report back briefly on some results.

First, I spent some time attempting to implement "perform with session
state". This is the idea that when working with ASDF, it would be useful to
be able to invoke or create specific behavior for my use case and also to
be able to capture work session state without relying on globals. This was
partially successful, but I broke something and ran out of patience. ASDF
is some wonderful code, but it takes a while to grok it all.

So I took the crude approach of just installing a global and using it to
accumulate loaded files from within a perform. I created an .asd file which
defined the dependencies I wanted the image to have. Then I loaded it with
load-system. Then at the end of load-system I inserted code to save a
structure containing a list of the *defined-systems* as well as a list of
the loaded files (in order of loading).
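
A minimal sketch of that kind of recording, for illustration only. The names
*LOADED-FILES* and LOAD-AND-RECORD are hypothetical, and this version hooks
in with an :AFTER method on PERFORM rather than editing ASDF itself. It
records source pathnames only and leaves out saving the list of defined
systems.

```lisp
(require :asdf)

;; Hypothetical global: pathnames loaded by ASDF, most recent first.
(defvar *loaded-files* '())

;; Runs after ASDF's own PERFORM has loaded a Lisp source component.
;; Records the component's source pathname, not the compiled fasl.
(defmethod asdf:perform :after ((op asdf:load-op) (c asdf:cl-source-file))
  (push (asdf:component-pathname c) *loaded-files*))

(defun load-and-record (system)
  "Load SYSTEM with ASDF and return the recorded pathnames in load order.
Files that ASDF considers already loaded are not performed again, so they
will not appear in the result."
  (let ((*loaded-files* '()))
    (asdf:load-system system)
    (reverse *loaded-files*)))
```

Something like (load-and-record :my-image-deps), where MY-IMAGE-DEPS stands
for the .asd described above, would then give the list to save.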

The next step was to create a little loader script which loads all of the
files, and then populates *modules* via satisfies. There were a couple of
glitches. For one, a couple of systems had hard coded asdf dependencies.
After removing them, they compiled fine. Then another system required
asdf-system-connections. I needed to remove that from the file load list
because it isn't necessary after things are loaded.
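
Roughly, the loader looks like the sketch below. REPLAY-LOAD-LIST and its
:SKIP-IF argument are hypothetical names, and using PROVIDE to register the
module names is an assumption about one way to populate *modules*.

```lisp
(defun replay-load-list (files module-names &key (skip-if (constantly nil)))
  "Load each pathname in FILES in order, skipping any for which SKIP-IF
returns true, then register each of MODULE-NAMES in *MODULES*."
  (dolist (file files)
    (unless (funcall skip-if file)
      (load file)))
  ;; PROVIDE adds the name to *MODULES*, so a later REQUIRE of the same
  ;; name finds it already present and does not try to load anything.
  (dolist (name module-names)
    (provide name))
  *modules*)
```

The :SKIP-IF predicate is where something like the asdf-system-connections
files can be filtered out of the list.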

Finally the load list works, and the image saves and starts just fine. I can
use regular ol' REQUIRE as a sanity check. I realize this may not be the
best approach, but at least it works without extra code.

This saved about 10% on the image size compared to asdf-loading the same
systems, and cleared out some unnecessary dependencies. It will take a
while to work out the best "batteries included" image. I can always load
ASDF for a given project if it needs to incorporate other systems; smaller
projects can just load themselves.

Of course for development I'll still be using ASDF a hundred times a day!

Erik.

On Mon, Apr 23, 2012 at 2:10 PM, Robert Goldman <rpgoldman at sift.info> wrote:

> On 4/23/12 Apr 23 -3:52 PM, Erik Pearson wrote:
> > Hi Robert,
> > What I was thinking is that OPERATE assembles the plan via TRAVERSE,
> > then executes it via PERFORM-PLAN, which calls PERFORM to translate the
> > abstraction of operation + component into an action with side effect. It
> > seems to me that it is each PERFORM (or, rather, the ones that we are
> > interested in) that needs to be captured, since it is only in the body
> > of the PERFORM method that the abstractions are made concrete into
> > lisp forms which carry out the desired actions.
> >
> > What I am referring to as "recording" is to append the lisp forms
> > produced by each PERFORM into a list or other structure. At present this
> > structure would need to be a global, but ideally it would be a slot in a
> > session object that threads through each PERFORM. It is because the
> > PERFORMS are carried out serially by PERFORM-PLAN according to the plan
> > assembled by TRAVERSE that they can be "played back" to recreate the
> > actions of this asdf session. Now if there were parallel executions of
> > PERFORMs by PERFORM-PLAN, that would be different...
> >
> > I hope that makes sense.
>
> I suppose, but I don't know how you break into what PERFORM does.
>

A quick and dirty way, as I described above, does require breaking into
PERFORM. But I think you could either create an alternate method and inject
that at around the PERFORM-PLAN level, or use a SESSION object as an
additional specializer argument to open up other possible PERFORM
implementations.
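
To make the session idea concrete, here is a rough sketch. None of these
names (SESSION, DRY-RUN-SESSION, PERFORM-IN-SESSION, RECORDED-ACTIONS) exist
in ASDF; they are hypothetical, and the sketch wraps PERFORM in a separate
generic function instead of changing PERFORM's own signature.

```lisp
(require :asdf)

;; Hypothetical classes, not part of ASDF.
(defclass session () ()
  (:documentation "Default session: actions are really performed."))

(defclass dry-run-session (session)
  ((actions :initform '() :accessor session-actions))
  (:documentation "Session that records actions instead of performing them."))

(defgeneric perform-in-session (session operation component)
  (:documentation "Carry out OPERATION on COMPONENT as SESSION dictates."))

;; Plain sessions simply delegate to ASDF's PERFORM.
(defmethod perform-in-session ((s session) operation component)
  (asdf:perform operation component))

;; A dry-run session overrides the primary method and never calls
;; ASDF:PERFORM; it only remembers what would have been done.
(defmethod perform-in-session ((s dry-run-session) operation component)
  (push (cons operation component) (session-actions s))
  (values))

(defun recorded-actions (session)
  "Return the (operation . component) pairs of SESSION in execution order."
  (reverse (session-actions session)))
```

Because the session is the first specializer, the recorded state lives in
the session object and no global is needed.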

>
> I was thinking you would precompute and cache the plan, and then invoke
> PERFORM-PLAN on that.
>

Or maybe another method altogether to take the place of perform, like
SIMULATE-PLAN or RECORD-PLAN, or ASSEMBLE-PLAN.

> That assumes that the call to TRAVERSE is actually expensive enough that
> this is worth the trouble.  I think you could confirm or disconfirm that
> claim quickly with some timing experiments.
>

Yes, I think I did that -- but with a different concern. When I was
concerned about ASDF mucking around inspecting files when I was doing a
REQUIRE, I added a feature to cause a fully loaded module to be flagged as
loaded via mark-operation-done, so that do-traverse would stop traversing
at such a module. A traverse took about 2.5 seconds normally, and about
0.5 with this optimization for a require after the first one (for a
deployed system, of course, not development).
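
For anyone who wants to repeat that kind of measurement, a small helper
along these lines would do. SECONDS-TO-RUN is a hypothetical name, and this
is not the code I used for the numbers above.

```lisp
(defun seconds-to-run (thunk)
  "Call THUNK with no arguments and return the elapsed wall-clock time
in seconds as a float."
  (let ((start (get-internal-real-time)))
    (funcall thunk)
    (float (/ (- (get-internal-real-time) start)
              internal-time-units-per-second))))
```

For example, (seconds-to-run (lambda () (require :some-system))), where
SOME-SYSTEM is a placeholder, run once before and once after the
optimization.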

In any case load time is not my only interest.  It is also a desire to be
able to use ASDF more effectively for more use cases.

Erik.

> cheers,
> r