operation.lisp

Thu Sep 22 12:53:10 UTC 2016

On Wed, Sep 21, 2016 at 11:37 PM, Robert Goldman <rpgoldman at sift.net> wrote:
> I was pretty surprised to read this in the latest draft of operation.lisp:
>
> ;; A memoizing way of creating instances of operation.
> ;; All operations MUST created through this function.
>    (defun make-operation (operation-class &rest initargs)
>     "This function creates and memoizes an instance of OPERATION-CLASS...."
>
> This was pretty surprising to me, and is a change to the contract we
> make with programmers. I have certainly written a lot of code that calls
> MAKE-INSTANCE on operation classes.
>
> Are you requiring this because you want to rely on EQ-testing? Or....?
>
> If we want to require the use of MAKE-OPERATION, and rely on the
> operations being interned,  we should do something to make it more
> apparent, and not have things quietly do something wrong, if this
> assumption is violated.
>
> E.g., we could have MAKE-OPERATION do something detectable before
> invoking MAKE-INSTANCE on an operation type (e.g., it could pass
> :make-operation t to MAKE-INSTANCE with the other initargs).  Then if
> MAKE-INSTANCE is called on an operation class *without* the magical
> marker from MAKE-OPERATION, we catch it (in a :BEFORE method?) and we
> throw an error, and indicate the need to call MAKE-OPERATION.
>
> Reasonable-p?
>
A-ha! You're touching one of the least understood part of ASDF
internals -- as witnessed by my recent interaction with Daniel
Kochmanski on gitlab !13 & !16.

Summary: it's a huge can of worms, and doing the Right Thing™ while
preserving operation initargs
is both very hard and not backward-compatible, and ultimately useless
to existing use cases.

A bit of context: ASDF, since the very beginning, stores action
timestamps in hash-tables EQL-indexed by the operation class _name_
(i.e. a symbol), stored in the slot operation-times of each component.
In particular means that as far as ASDF goes about deciding that an
operation has already been done and its outputs can be reused, all
operations of the same class are considered the same. If there are
material differences in the semantics of operations depending on their
initarg, you lose: ASDF will make the computation only once, cache and
reuse the wrong result.

At the same time, ASDF 1 had a weird mechanism whereby operations
inherit any initarg from the operation used to compute dependencies
from, resulting in an explosion of operation objects, with the
initargs of the parent, all the way to the operation instance passed
by OPERATE to TRAVERSE. The initial intention was probably for these
initargs to carry semantic information for in actions keyed by these
operations. But as we saw above, ASDF can't actually discriminate
between objects of the same class.

Thus the passing around of operation initargs was costly yet not
functional, for there were only ever two users, both misguided: swank
and asdf-ecl.
*   Swank.asd was using operation-forced to determine whether to
actuall reload swank; operation-forced is computed from the force flag
initially passed to OPERATE; this is very wrong, and it should always
consider that operation-forced is T (hence one of my proposed cleanups
to both swank and asdf).
*   asdf-ecl was a heroic kluge from the ASDF 1 day to build static or
dymamic libraries or standalone executables, and was using
operation-original-initargs to pass around some C compiler flags to
the final action that links the program or library. All of asdf-ecl is
completely misdesigned if the goal is fix ASDF to properly handle
bundles: the correct design is what we now have in ASDF 3, where any
flags pertaining to a specific program are stored in the component for
that program, of class program-system. But asdf-ecl wasn't meant to
fix ASDF; it was meant to work despite a completely broken ASDF 1,
that no one had hopes to ever see un-broken. It cut many corners, and
passing information through the existing operation initargs rather
than touch the ASDF system class hierarchy with a ten foot pole was
probably a clever choice at the time. It just is completely wrong
thing in the context of ASDF 3.

Note that it is not enough to "simply" use objects in an EQUAL
hash-table rather than their class name in a EQL hash-table: CLOS
objects are stateful and are only EQUAL if they are EQ; therefore
every time you create a new object, whether in OPERATE or traversing
the results of COMPONENT-DEPENDS-ON, you'd get new objects that are
never EQUAL to any other object, and you never ever find anything in
your hash-tables.

One solution would be to add a normalization protocol, so that any two
operations built with the same initargs *up to some normalization*
would have the same object. Then you could key the hash-tables with
objects. That's what I conceived MAKE-OPERATION for. You could even
survive the fact that so much user code at large calls MAKE-INSTANCE
on operations (as advertised by the ASDF 1 documentation itself): each
operation would have a slot to contain the normalized operation,
whether lazily set by the accessor or eagerly set by some
shared-initialize method. And you'd use this slot as hash-table key.
Equivalently, you could have some initarg normalization return a list
of values comparable with EQUAL, that could be used as keys in an
EQUAL hash-table.

Unhappily, this solution was not compatible with legacy user code: now
you'd have to convince everyone to define their normalization
functions whenever they use initargs in operations; and you'd have to
teach them how to do this normalization correctly, despite its many
subtle constraints that are especially hard to fulfill in an
incremental way that allows extensions to peacefully coexist. What
more, since most people are using the existing LOAD-OP and COMPILE-OP
with their many extensions, you'd still need to put component-specific
information in COMPONENT objects, and LOAD-OP and COMPILE-OP would
have to use whatever default normalization method is defined by ASDF
itself, that can't possibly know what extensions would like to put in
those objects and how to normalize it. Finally, if you somehow want to
preserve flags like :force for backward compatibility with swank, you
find out that including the flag in the normalized key prevents
sharing of the timestamp cache between forced and non-forced which
defeats the purpose of the cache, whereas excluding it leads to the
same explosion in operation objects that you wanted to avoid, and/or
you then need two kinds of normalization with subtle invariants that
all users must maintain (e.g. the normalized key for the inner
normalization must be a suffix of the normalized key for the outer
normalization). Overall, a big losing battle that I didn't want to
fight, even less so as my repeated requests to get swank fixed have
been met with contempt for several years (NB: my latest !23 contains a
workaround that makes operation-forced always return T, fixing
swank.asd *despite* its maintainers, since we can't get its
maintainers to fix it).

At some point, I really wanted to make rich operations with
non-trivial initargs happen. I saw it as enabling ASDF to become more
expressive than Make with flags, yet correct unlike using Make with
flags. But then, I had to fight so many people on so many fronts to
get other much tamer semantic changes accepted during the transitions
from ASDF 1 to ASDF 2 to ASDF 3 that I lost the energy necessary to
pursue each and every person who currently uses MAKE-INSTANCE for ASDF
operations, and tell them to use MAKE-OPERATION instead. Since the
ASDF1 documentation even recommended using MAKE-INSTANCE 'LOAD-OP in
various contexts, this would have been a long and protracted battle.
And then, there was the need to have everyone write one or two proper
normalization functions for their new operation classes. Eventually, I
decided that there was a huge downside and little upside to fighting
this battle: obviously never had been anyone who actually relied on
operation initargs for anything meaningful (since it wouldn't and
couldn't have worked), except asdf-ecl using it as a dirty
side-channel to pass information to just one action. The
expensive-to-achieve new model would therefore have been to achieve a
feature no one had ever needed or requested before, to completely fix
a bug only I was aware of.

In conclusion, the comments in that file are the correct future-proof
recommendation for those who want to write good code that works best
with how ASDF is today and will still work whichever way ASDF evolves
in the future. This should include at least all ASDF maintainers, and
maintainers of serious ASDF extensions. At this point, I'm not
fighting to have people outside ASDF follow this best practice,
though. I've given up fighting. I've given up on passing meaningful
initargs to operation. People have been doing without for over 14
years now. Maybe we should just make that official: there won't be any
meaningful instance-allocated slots for operations, ever.

I should probably put that in the manual somewhere (or will you?)

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
Who is a true communist?
He who faithfully reads the works of Marx and Lenin.

Who is a true anti-communist?
He who actually understands them!