How to fix arnesi:clean-op.

Mon May 9 20:14:23 UTC 2016

On Mon, May 9, 2016 at 8:41 AM, Robert Goldman <rpgoldman at sift.net> wrote:
> I am inclined to agree with Andreas.  I don't see that CLEAN-OP is
> impossible for ASDF.  I can see that there are some issues that require
> thought, and for some systems additional coding by the developer, but I
> don't see that as an insuperable barrier.  After all, "make clean" is in
> some sense impossible (it's generally hand-coded by the programmer), but
> we use it every day.

Many things are possible. My point was that there was no
one-size-fits-all, especially not as ASDF is currently designed.

NB: Bazel, which manages all its outputs in its cache and, unlike ASDF,
doesn't allow operations to bypass that cache for some outputs, has
a "bazel clean" that does the notional equivalent of
   rm -rf ~/.cache/common-lisp/
Maybe the solution is to become more like Bazel.
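
To make the "notional equivalent" concrete, here is a minimal sketch of
such a cache wipe as a shell function. It assumes the cache sits under
$XDG_CACHE_HOME/common-lisp (falling back to ~/.cache/common-lisp), which
output translations may override; the function name clean_cache and the
directory name sbcl-x are made up for the illustration.

```shell
# Assumption: the cache lives under $XDG_CACHE_HOME/common-lisp, falling back
# to ~/.cache/common-lisp; output translations can relocate it elsewhere.
clean_cache() {  # usage: clean_cache [CACHE_DIR]
  cache=${1:-${XDG_CACHE_HOME:-$HOME/.cache}/common-lisp}
  # Refuse to run on the filesystem root.
  case $cache in
    /) echo "refusing to remove '$cache'" >&2; return 1 ;;
  esac
  rm -rf -- "$cache"
}

# Demonstration on a throwaway directory rather than the real cache.
# "sbcl-x" is a hypothetical per-implementation subdirectory name.
demo=$(mktemp -d)
mkdir -p "$demo/common-lisp/sbcl-x/home/me/src"
: > "$demo/common-lisp/sbcl-x/home/me/src/foo.fasl"
clean_cache "$demo/common-lisp"
```

Note that this only reaches outputs that were actually relocated to the
cache; anything an operation wrote elsewhere is left behind.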

> I accept Faré's points about there being challenges to write an ideal
> CLEAN-OP.  But I don't agree that this means that we can't write a good
> enough CLEAN-OP. After all, make manages "make clean."  Yes, it requires
> some effort from the programmer, but who cares?  Is that a reason we
> have to throw up our hands and refuse to do anything?  Here are some
> specific responses to particular issues.
>
> On the other hand, while I don't agree that these make the CLEAN-OP
> impossible, I *do* agree that they are difficult issues. So... I'd be
> happy to see CLEAN-OP added, and I'd be happy to support this effort,
> *BUT* CLEAN-OP patches must be preceded by an informal specification
> that describes what an implementation should do, covering these cases.

Yup.

> Now that ASDF is on everyone's critical path, we need to think a little
> before changing it, and not just bash out a hunk of code.

We could possibly make that an extension to ASDF so as not to have to
modify and test ASDF, which is a heavy thing to do... but then, you'd
need to build and load the extension to run it, which would itself
dirty the cache. Of course, the same is true if you use ASDF from
git, anyway.

> Specific responses to Faré's issues:
>
> 1. clean-op is not defined in general.  I believe that this means "do we
> clean the system" or "do we clean the operation" [as in arnesi, whose
> CLEAN-OP has a FOR-OP]?
>
> This seems similar to the notion that Faré already had, of providing a
> BUILD-OP that does "the default build action" for a system (e.g., maybe
> it builds the library?  Maybe it does LOAD-OP? Maybe it does LOAD-OP and
> DOC-OP?).
>
> So one solution would be to have an UNMAKE-OP that does the inverse of
> the build-op, and then have a CLEAN-OP that undoes specific operations
> (like the arnesi CLEAN-OP).  Then the UNMAKE-OP might default to being
> the CLEAN-OP for COMPILE-OP, which would handle probably 90% of systems.
>
> On the other hand, I'm pretty unhappy with the notion of the BUILD-OP,
> because it requires retraining every single user of ASDF to use BUILD-OP
> instead of LOAD-OP.  That would be an enormous user interface fail.  So
> this is not a step to be taken lightly.

Maybe instead of a CLEAN-OP, we could have an unoperate function, or a
clean-system function. It makes sense to have a function that is not
operate because we'll also want to clean timestamp cache entries for
the operations that are undone, i.e. undo the mark-operation-done for
every action the output-files of which have been cleaned. Similarly,
you probably do NOT want to include a hypothetical CLEAN-OP in said
timestamp cache, and the cache is particularly ill-suited for a
CLEAN-OP with a :for-op, :op or :operations option, since the current
cache crucially fails to index such options.
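
As a toy model of the idea (plain shell, not ASDF code): each action
records its output files in a manifest and leaves a stamp file that plays
the role of the "done" mark, and unoperate removes both the outputs and
the mark. The names perform, unoperate, and the .manifest/.done files are
all invented for this sketch.

```shell
# Toy model, not ASDF: each "action" records its outputs in a manifest and
# leaves a stamp file standing in for the mark-operation-done entry.
build_dir=$(mktemp -d)

perform() {  # usage: perform ACTION OUTPUT...
  action=$1; shift
  for out in "$@"; do
    : > "$build_dir/$out"                              # pretend to build it
    printf '%s\n' "$out" >> "$build_dir/$action.manifest"
  done
  : > "$build_dir/$action.done"                        # mark the action done
}

unoperate() {  # usage: unoperate ACTION
  action=$1
  manifest="$build_dir/$action.manifest"
  [ -f "$manifest" ] || return 0
  # Delete every recorded output, then forget that the action was done.
  while IFS= read -r out; do
    rm -f -- "$build_dir/$out"
  done < "$manifest"
  rm -f -- "$manifest" "$build_dir/$action.done"
}

perform compile-foo foo.fasl foo.o
perform compile-bar bar.fasl
unoperate compile-foo
```

After this, compile-foo has neither outputs nor a done mark, while
compile-bar is untouched. Note that the cleaning itself leaves no mark.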

Or maybe we just want a clean-cache function.
And maybe we also want to remove the feature that allows output-files
to not relocate things to the cache.

> 2. I think we can simply make clean-op clean for the current implementation.
>
Sure. But what if some operations actually involve cross-compilation
with other implementations?

> 3. I believe we should mirror the semantics of :FORCE T and :FORCE :ALL
> when we unmake.  I.e., unbuild the dependencies only with :FORCE :ALL.
> :FORCE T would actually be a no-op.
>
Makes sense.

> 4.  "How do you deal with
> defsystem-depends-on and other dependencies not explicitly
> included in the plan?"  DEFSYSTEM-DEPENDS-ON is already broken, and will
> be deprecated, except as a checkable declaration (see earlier postings
> on this list for a description of the issues).  For the rest, I'd say
> same as above -- if there's something that won't appear in the build
> plan that one needs to clean, then the programmer must write an
> :IN-ORDER-TO for the CLEAN-OP so that ASDF can know what it needs to do.

Just calling the feature broken and having people do imperatively what
the feature tries to do declaratively (i.e. loading a system from a
.asd file) doesn't make the usage pattern go away. And this usage
pattern means that just to *compute* the plan to clean a particular
system, you'll be *loading* systems and dirtying that cache; and with
the current and foreseeable ASDF (i.e. unless and until the feature is
actually fixed rather than ignored), these loaded systems won't be in
the dependency graph. Thus by trying to clean, you WILL make the cache
actually dirtier than it was. Fixing the D-S-D feature will also be
instrumental in making :FORCE :ALL anything but a joke.

Finally, there's the issue of whether and if so how to handle
secondary systems. Should they too be cleaned? If not, then :FORCE T
might not be doing much useful when most of the build happens in
secondary systems (especially when using package-inferred-systems).

> 5. "Are you sure there aren't bugs and
> omissions in asdf or any extension whereby some side output file
> is omitted from the list?"
>
> How does make deal with a file that isn't listed in the rule for "make
> clean"?  You miss it.  Very sorry.  Fix your system definition.  Use
> (:in-order-to (clean-op ....))

Sure, bugs are bugs and need to be fixed. But until they are, any clean
function will not be as useful as advertised. And we probably have
many such bugs lurking in extensions, since nobody tested this aspect.

> 6. "What about randomly-named temporary
> files created during an operation that has been interrupted?"  Maybe we
> are doing the Wrong Thing with these.  If they are temporary files,
> maybe it's the wrong thing to put them in the cache and rename them?
> Maybe they should live in /tmp (or equivalent) and be moved from there?
> Then it's not our problem to fix anymore -- that's for tmpwatch.
>
> If you add your own rules that make temporary files, you have to add
> your own rules that remove them.

No, no, you don't get it. ASDF itself already uses temporary files all
over the place, not to mention extensions such as CFFI; and to solve a
concurrency issue, at least ASDF has recently started to include a
random portion in the temporary file names (so that two concurrent
attempts don't clobber each other). So as to hopefully achieve atomic
renaming of temporary files to output files, these temporary files do
NOT reside in /tmp but in the very same directory as the future output
file. So it is NOT sufficient or possible to rely on tmpwatch to
clean up after ASDF. If ASDF had some *less* configurable output
translations, it could probably put all temporary files in a
subdirectory of "the" output cache, and then use the equivalent of
tmpwatch on that directory. But that's not the case at this point.
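
For illustration, the general pattern looks something like the sketch
below: a temporary file with a random suffix is created next to its
destination and then renamed into place, and a tmpwatch-like sweep removes
stale leftovers. The ".tmp.XXXXXX" naming and the 60-minute threshold are
assumptions made for the example, not what ASDF actually does, and the
sweep only works if temporary files can be recognized by name or location.

```shell
out_dir=$(mktemp -d)          # stands in for the directory of the output file
out="$out_dir/foo.fasl"

# Create the temporary file next to its destination, with a random suffix,
# so that the final rename stays within one filesystem.
tmp=$(mktemp "$out.tmp.XXXXXX")
printf 'compiled code\n' > "$tmp"
mv -f -- "$tmp" "$out"        # rename into place within the same directory

# Simulate an interrupted build that left a temporary file behind long ago.
stale="$out_dir/bar.fasl.tmp.abc123"
: > "$stale"
touch -t 200001010000 "$stale"

# tmpwatch-like sweep: delete matching leftovers older than 60 minutes.
find "$out_dir" -type f -name '*.tmp.??????' -mmin +60 -exec rm -f -- {} +
```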

> 7. "What if load-op is not the only operation that matters, but
> e.g. doc-op generates LaTeX and PDF outputs but fails to be
> tracked by clean-op? Are you going to use the CLOS MOP to
> generate a list of all possible operations? What if that other
> operation isn't loaded in the current image? What if operation
> options cause different files to be generated? Or features? What
> about all secondary systems? Are you going to generate a plan
> for them, too?"
>
> See the answer to #1:  ASDF is not going to solve this problem.  Either
> the programmer solves it using definitions of PERFORM and IN-ORDER-TO,
> or it's not solved. So sad.  But I don't see any reason to avoid the 90%
> solution because there isn't a 100% solution.

Fine, but I see a clean-op as more of a 50% solution that comes with 50%
more problems than as a 90% solution. rm -rf ~/.cache/common-lisp and/or
git clean -xfd are the 90% solutions.

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
The fear of death follows from the fear of life. A man who lives fully is
prepared to die at any time. — Mark Twain