<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Faré, <div class=""><br class=""></div><div class="">Thanks for taking the time to understand my comments. I’ve tried to respond to some</div><div class="">of your questions below. Sorry if my original post wasn’t explicit enough to give enough</div><div class="">explanation for what I’m trying to do.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><blockquote type="cite" class="">If I run several sbcl processes on different nodes in my compute cluster, it might happen that<br class="">two different runs notice the same file needs to be recompiled (via asdf),<br class="">and they might try to compile it at the same time. What is the best way to prevent this?<br class=""><br class=""></blockquote></blockquote></blockquote>You mean that this machines share the same host directory? Interesting.<br class=""><br class=""></div></div></blockquote><div><br class=""></div><div>Yes, the cluster shares some disk, and shares home directory. And I believe two cores</div><div>on the same physical host share the /tmp, but I’m not 100% sure about that.</div><div><br class=""></div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><br class=""></blockquote></blockquote></blockquote>That's an option. It is expensive, though: it means no sharing of fasl<br class="">files between hosts. 
If you have cluster of 200 machines, that means<br class="">200x the disk space.<br class=""></div></div></blockquote><div><br class=""></div><div><div class="">With regard to the question of efficient reuse of fasl files: this is completely irrelevant for my case. My</div><div class="">code takes hours (10 to 12 hours worst case) to run, but only 20 seconds (or less) to compile. I’m very happy to completely</div><div class="">remove the fasl files and regenerate them before each 10 hour run. (note to self: I need to double check that</div><div class="">I do in fact delete the fasl files every time.) Besides, my current flow allows me simply to git-check-in a change, and</div><div class="">re-launch the code on the cluster in batch. I don’t really want to add an error-prone manual local-build-and-deploy step</div><div class="">if that can be avoided, unless of course there is some great advantage to that approach.</div></div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class="">What about instead building your application as an executable and<br class="">delivering that to the cluster?<br class=""></div></div></blockquote><div><br class=""></div><div><div class="">One difficulty about your build-then-deliver suggestion is that my local machine is running mac-os, and the cluster is</div><div class="">running linux. I don’t think I can build linux executables on my mac. </div></div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" class=""><br class=""></blockquote>You can have different ASDF_OUTPUT_TRANSLATIONS or<br class="">asdf:*output-translations-parameter*<br class="">on each machine, or you can indeed have the user cache depend on<br class="">uiop:hostname and more.<br class=""><br class=""></div></div></blockquote><div><br class=""></div><div>This is what I’ve ended up doing. And it seems to work. 
Here is the code</div><div>I have inserted into all my scripts.</div><div><br class=""></div><div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">(</span><span style="font-variant-ligatures: no-common-ligatures; color: #d03cff" class="">let</span><span style="font-variant-ligatures: no-common-ligatures" class=""> ((home (directory-namestring (user-homedir-pathname)))</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> (uid (sb-posix:getuid))</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> (pid (sb-posix:getpid)))</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> (setf asdf::*user-cache* (ensure-directories-exist (format nil </span><span style="font-variant-ligatures: no-common-ligatures; color: #af3782" class="">"/tmp~A~D/~D/"</span><span style="font-variant-ligatures: no-common-ligatures" class=""> home uid pid))))</span></div><div class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><br class=""></span></div></div><div><br class=""></div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div class="">The Right Thing™ is still to build and test then deploy, rather than<br class="">deploy then build.<br class=""></div></div></blockquote><div><br class=""></div><div>Regarding your suggestion to build then deploy: this seems very dangerous and error-prone to me.<div class="">For example, what if different hosts want to run the same source code but with different optimization settings? 
</div><div class="">This is a real possibility, as some of my processes are running with profiling (debug 3) and collecting profiling results,</div><div class="">and others are running super optimized (speed 3) code to try to find the fastest something-or-other. </div><div class=""><br class=""></div><div class="">I don’t even know whether it is possible to create the .asd files so that changing an optimization declaration will trigger</div><div class="">everything depending on it to be recompiled. And if I think I’ve written my .asd files that way, how would I know</div><div class="">whether they are really correct? </div></div><div><br class=""></div><div>It is not the case currently, but it may very well be in the future that I want different jobs in the cluster running different</div><div>git branches of my code. That would be a nightmare to manage if I try to share fasl files.</div><br class=""><blockquote type="cite" class=""><div class=""><div class="">Using Bazel, you might even be able to build in parallel on your cluster.<br class=""></div></div></blockquote><div><br class=""></div><div>Bazel sounds interesting, but I don’t really see the advantage of building in parallel when it only</div><div>takes a few seconds to build, but half a day to execute.</div><div><div class=""><br class=""></div></div><blockquote type="cite" class=""><div class=""><div class="">I still don't understand why your use case uses deploy-then-build<br class="">rather than build-then-deploy.<br class=""></div></div></blockquote></div><br class=""></div><div class=""><br class=""></div><div class="">I hope it is now clear why I can’t. (1) local machine is mac-os while cluster is linux </div><div class="">(2) different jobs in cluster are using different optimization settings. 
(3) future enhancement</div><div class="">to have different cluster nodes running different branches of the code.</div><div class=""><br class=""></div><div class="">Kind regards</div><div class="">Jim</div></body></html>