Multiple processes compiling the same file

Faré fahree at gmail.com
Wed Jan 31 04:57:41 UTC 2018


(Sorry for delayed response)

>>>: Jim Newton
>>> If I run several sbcl processes on different nodes in my compute cluster, it might happen that
>>> two different runs notice the same file needs to be recompiled (via asdf),
>>> and they might try to compile it at the same time.  What is the best way to prevent this?
>>>
You mean that these machines share the same host directory? Interesting.

"Normal" rules of ASDF compile to a temporary file and rename the
output at the end, thus providing some kind of race resistance. But
for backward-compatibility reasons, this requires every extension to
manually follow a protocol for ASDF to remain robust.
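
To illustrate the write-then-rename pattern described above, here is a
minimal sketch using UIOP's staging helper. It is not a copy of what
ASDF does internally; the function name write-file-atomically is made up
for this example, and it assumes a UIOP recent enough to export
uiop:with-staging-pathname.

```lisp
(require :asdf) ; also provides UIOP

;; WRITE-FILE-ATOMICALLY is a hypothetical helper name for this sketch.
(defun write-file-atomically (target string)
  "Write STRING to a staging file next to TARGET, then rename it over TARGET.
A concurrent reader sees either the old file or the complete new one."
  (uiop:with-staging-pathname (staging target)
    ;; The staging file has already been created empty, hence :SUPERSEDE.
    (with-open-file (out staging :direction :output :if-exists :supersede)
      (write-string string out)))
  target)
```

Two processes calling this on the same target each write their own
staging file, and whichever renames last wins; neither leaves a
half-written file in place.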

>>> I see in the asdf documentation that there is an asdf:*user-cache* variable whose
>>> value is the path name of the directory where asdf compiles into.
>>> Would it be advisable for me to arrange so that asdf:*user-cache*
>>> is a function of the pid and hostname and perhaps thread-id (if such a thing exists) to avoid such collisions?
>>>
That's an option. It is expensive, though: it means no sharing of fasl
files between hosts. If you have a cluster of 200 machines, that means
200x the disk space.

What about instead building your application as an executable and
delivering that to the cluster?
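
As a rough sketch of what building an executable could look like, a
system definition can ask ASDF for a standalone program. Everything
named here ("my-app", the "main" file, the my-app:main entry point) is
hypothetical and only stands in for your own application.

```lisp
;; my-app.asd -- all names below are placeholders for this sketch.
(asdf:defsystem "my-app"
  :description "Hypothetical application built once, then copied to the cluster."
  :build-operation "program-op"   ; ask ASDF to produce an executable
  :build-pathname "my-app"        ; name of the resulting binary
  :entry-point "my-app:main"      ; function called when the binary starts
  :components ((:file "main")))
```

With such a definition, evaluating (asdf:make "my-app") once on a build
machine should produce the binary, and the cluster nodes then run that
binary without compiling anything.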

My rule of thumb is that there is one home directory per human, and
the human is only interactively building one thing at a time (and/or
can set several accounts and/or $HOME variants for as many
"personalities"). Thus you only need one fasl cache for interactive
compilation. If you want non-interactive deployment, use tools like
bazel, nix, etc., to build your software deterministically.

>>> Or is there some better way to handle this which is built into asdf?
>
You can have different ASDF_OUTPUT_TRANSLATIONS or
asdf:*output-translations-parameter* on each machine, or you can indeed
have the user cache depend on uiop:hostname and more.
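
A minimal sketch of the second option, making the user cache depend on
uiop:hostname, might look like the following. The function names and the
".cache/common-lisp-per-host/" prefix are invented for this example, and
it sets the same asdf::*user-cache* variable as the snippet quoted
below, so treat it as an illustration rather than the recommended
configuration.

```lisp
(require :asdf) ; also provides UIOP

;; Both function names and the directory prefix are hypothetical.
(defun per-host-user-cache (&optional (base (user-homedir-pathname)))
  "Return a cache directory under BASE, distinct per host and implementation."
  (uiop:subpathname base
                    (format nil ".cache/common-lisp-per-host/~A/~A/"
                            (uiop:hostname)
                            (uiop:implementation-identifier))))

(defun use-per-host-user-cache ()
  "Point ASDF's user cache at the per-host directory.
Meant to be called before the first ASDF:LOAD-SYSTEM of the session."
  (setf asdf::*user-cache*
        (ensure-directories-exist (per-host-user-cache)))
  ;; Drop any translations already computed from the old cache location,
  ;; so that ASDF recomputes them on next use.
  (asdf:clear-output-translations)
  asdf::*user-cache*)
```

Compared with keying the cache on the pid, this still lets successive
runs on the same host reuse their fasl files, at the cost of one cache
per machine.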

The Right Thing™ is still to build and test then deploy, rather than
deploy then build. Using Bazel, you might even be able to build in
parallel on your cluster.

>>: pjb
>> I had requested that ASDF includes the hostname (or machine-instance), in
>> the built path for the cache.
>> Unfortunately, for some reason, the maintainers of ASDF thought it was a
>> good idea to remove it.
>> There you are!
I still think it's a bad idea. If your $HOME is shared by many
machines, you probably want what's in $HOME to be shared, too. Go
build in /var/tmp or use Bazel or whatever. Or use uiop:hostname in
your ASDF configuration.

On Tue, Jan 23, 2018 at 7:51 AM, Jim Newton <jnewton at lrde.epita.fr> wrote:
> Apparently, this approach seems to work.  I’m not sure if it is the best
> approach.
> Here is what my code looks like.  It creates a directory in /tmp/ and
> asdf:load-system seems to compile the .fasl files into there.
>
>
> (require :asdf)
> (require :sb-posix)
> (let ((home (directory-namestring (user-homedir-pathname)))
>       (uid (sb-posix:getuid))
>       (pid (sb-posix:getpid)))
>   (setf asdf::*user-cache*
>         (ensure-directories-exist
>          (format nil "/tmp~A~D/~D/" home uid pid))))
>
I still don't understand why your use case uses deploy-then-build
rather than build-then-deploy.

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
A child of five would understand this. Send someone to fetch a child of five.
— Groucho Marx


