[asdf-devel] Logical pathnames vs ASDF & SBCL

Faré fahree at gmail.com
Sun Jun 12 00:25:18 UTC 2011


TL;DR: ASDF currently doesn't play so well with logical pathnames.
I'm seeking advice on how to make things better —
or on whether it's worth the trouble.


Dear all,

Pascal's recent plight with logical pathnames on SBCL
once again raises the issue of how much ASDF (or SBCL)
should support logical pathname users, and
how they should behave to make logical pathname users happy.

With case-sensitivity issues, things only get worse!

(Note: personally inviting James Anderson to the discussion, since
he's the only strong proponent and user of logical pathnames I know.)


REMINDER ABOUT LOGICAL PATHNAMES (LPNs)

So, the CLHS says LPNs are case-converted to upper-case,
portably restricted to such upper-case characters,
and get translated to physical pathnames
in an implementation-dependent way.
On Unix, with its case-sensitive filesystems and lower-case conventions,
the useful thing is then downcasing names.

At least SBCL and CLISP follow both the standard
and the useful pathname mapping convention.
SBCL goes to some length to signal an error
when non-portable code tries to break
the standard restrictions on logical-pathnames;
CLISP will let you manually make-pathname an LPN
with lower-case or mixed-case characters, but
will still downcase them at the end
(the latest CLISP 2.49 also seems to not want to create
a logical pathname with parse-namestring).
CCL tries to make users happy by leaving the logical pathname host
case-insensitive (per the standard) but case-preserving,
and being case-sensitive in the rest of the pathname
(if you want upper-case, you know where to find it).
To collect and analyze the behavior of each of 14 implementations
is left as an exercise to the logical-pathname loving reader [M20].
To devise, document and get implementations to adopt a common behavior
that would be universally accepted by all is left as another exercise [50].
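
To make the translation step concrete, here is a minimal sketch.
The host name DEMO and the directory /tmp/demo/ are made up for
illustration, and the case of the physical result is
implementation-dependent; going by the above, SBCL and CLISP
should downcase it.

```lisp
;; Hypothetical logical host DEMO mapped onto /tmp/demo/ (both are assumptions).
(setf (logical-pathname-translations "DEMO")
      '(("**;*.*.*" "/tmp/demo/**/*.*")))

;; LOGICAL-PATHNAME converts a logical namestring into a logical pathname.
(defparameter *demo-lpn* (logical-pathname "DEMO:SRC;FOO.LISP"))

;; Translate to a physical pathname; whether the components come out
;; downcased depends on the implementation.
(defparameter *demo-ppn* (translate-logical-pathname *demo-lpn*))

(format t "~S~%~S~%" *demo-lpn* *demo-ppn*)
```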


ASDF NAMING OF SYSTEMS

Similarly, ASDF system names have to satisfy two needs:
systems may be designated by symbols (CL legacy), which are case-converted,
yet should map to a unixy lower-case name; and
systems with all kinds of case must still be allowed.
So symbols are downcased whereas strings are preserved.
Then the filesystem is queried with the resulting normalized string name
(see function coerce-name).
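
The rule above can be sketched as follows. This is a simplified
stand-in, not ASDF's actual coerce-name (which also accepts other
kinds of designators); the name sketch-coerce-name is hypothetical.

```lisp
(defun sketch-coerce-name (name)
  "Normalize a system designator: symbols are downcased, strings kept as-is."
  (etypecase name
    (string name)
    (symbol (string-downcase (symbol-name name)))))

(sketch-coerce-name 'foo)    ; symbol: downcased
(sketch-coerce-name "Foo")   ; string: case preserved
```

Under the default readtable, the symbol designates the system "foo",
whereas the string designates a distinct system "Foo".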

Now with its good old central-registry,
ASDF used to query each directory every time a system was requested.
This works great with logical pathnames, as it defers
all case handling to the last minute, and allows .asd files
to be loaded with a LPN as the *load-pathname*.
However, this was somewhat slow, and
it didn't scale to recursively searching directories,
and so came the habit of creating "link farms",
single directories with plenty of symlinks to adequate .asd files.
Of course, symlinks are both not so portable
(not very well supported in Windows, if at all, though
later versions of ASDF recognize Windows shortcuts as a substitute),
and a maintenance nightmare (unlike Macintosh aliases,
they are not automatically updated
when a system is installed, moved or deleted).
Therefore link farms were not a good interface to end-users.

For ASDF 2, I implemented a new way to manage system files:
the source-registry. It allows you to specify trees to search recursively
as well as single directories. Whether the registry is searched lazily
(when a system is requested) or eagerly (once, the first time around) or
yet something else, was left unspecified. In my first implementation of it,
I built upon the previous central-registry, and at initialization time
I was looking for which subdirectories in the specified trees
did contain .asd files, then saved these in a list that I consulted
in the very same way as the central-registry.
(Note: the central registry is still there, and consulted first.)

However, DIRECTORY returns fully resolved TRUENAMEs, and
any portable way of searching inside subdirectories is doomed
to squash away logical pathnames. Currently, any use of :tree
in source-registry will resolve into physical pathnames (PPNs)
on most implementations, and always did.

However, entries with :directory used to preserve LPNs.
Not so with 2.014.7, where, prompted by a Quicklisp feature request,
I implemented eager caching of ASDF systems. So I query DIRECTORY
and save the results in a hash-table indexed by pathname-name.
The table is case-sensitive, to reflect the case-sensitive names of ASDF.
One consequence is that :directory entries do not preserve LPNs anymore.

Another more subtle consequence is that things may go wrong
if your filesystem is case-insensitive but case-preserving.
In the old way of doing things (still available with the central-registry),
you'd typically be looking for the file with lower-case name,
and you would find it even if named in different cases on disk.
With the new way, you'll record the name with whatever mix of case
it has on disk, and will never match it against the (lowercase) name.
So some systems that relied on the old behavior will no longer be found.
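
A minimal sketch of the failure mode, assuming a cache keyed by
pathname-name in an EQUAL hash-table. The names *sketch-cache* and
sketch-register and the file Foo.asd are hypothetical; this is not
the actual ASDF code.

```lisp
;; Stand-in for the cache: EQUAL compares strings case-sensitively.
(defparameter *sketch-cache* (make-hash-table :test 'equal))

(defun sketch-register (file)
  "Record FILE under its pathname-name, with the case found on disk."
  (setf (gethash (pathname-name file) *sketch-cache*) file))

;; Suppose DIRECTORY returned a file named Foo.asd (made-up name).
(sketch-register (make-pathname :name "Foo" :type "asd"))

(gethash "Foo" *sketch-cache*)   ; found
(gethash "foo" *sketch-cache*)   ; not found, though the filesystem would match
```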


POTENTIAL SOLUTIONS

An easy way out is to declare that ASDF from now on
shall be case-insensitive, using EQUALP to compare system names
and in hash-tables that cache system files.
Another way out is to declare that ASDF is still case sensitive,
and that the only case allowed is lower-case for physical filenames.
Any system that doesn't work with this new interpretation
probably didn't work portably with the old one, anyway.
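
Both options are easy to sketch. The names below are hypothetical
and only illustrate the idea; neither is existing ASDF code.

```lisp
;; Option 1: case-insensitive cache. EQUALP compares strings ignoring case.
(defparameter *sketch-ci-cache* (make-hash-table :test 'equalp))
(setf (gethash "Foo" *sketch-ci-cache*) "Foo.asd")   ; made-up entry
(gethash "foo" *sketch-ci-cache*)                    ; finds the "Foo" entry

;; Option 2: stay case-sensitive, but accept only lower-case file names.
(defun sketch-lower-case-name-p (name)
  "True when NAME contains no upper-case characters."
  (string= name (string-downcase name)))
```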

A harder way out would be to assume case-sensitivity in the filesystem
(since some filesystems are case-sensitive, and any asd file that depends
on case-insensitivity is non-portable and deserves to lose),
and make extra effort when using DIRECTORY against logical pathnames
to always reconstitute LPNs from the PPNs given by DIRECTORY,
and drop the reconstituted results if they don't resolve to the same thing.
But is it worth the pain?

Finally, I could do nothing, and assume that
people who go through the pain of setting up logical pathnames
before they run ASDF can just as well set up the *central-registry*
and have LPNs be recognized in the good old way.
If you use the source-registry, your LPNs will be squashed,
and nothing will break in the code since code works fine with PPNs,
just like most everyone uses it, but your debug information
will be stored according to the PPN, not the LPN. Meh.

Zach (if you're still reading), do any systems in the wild
use anything but lower-case names for system files?

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
Two members of the Political Police salute a giant portrait of our Great Leader
- Comrade, do you think the same as I do? inquires the first one, with
a sincere gaze pleading into his colleague's eyes.
- Comrade, the second one replies and returns the same gaze: yes, I do.
- In that case, comrade, you're under arrest! concludes the first one.



