[armedbear-devel] #P serialization across Windows/non-Windows

Alessio Stalla alessiostalla at gmail.com
Fri Jun 18 11:00:05 UTC 2010


On Fri, Jun 18, 2010 at 12:37 PM, Mark Evenson <evenson at panix.com> wrote:
> On 6/17/10 7:38 PM, Alessio Stalla wrote:
> […]
>>
>> Proposed solution: let's invent an ABCL-specific way to print
>> arbitrary pathnames. I proposed #P"abcl:(make-pathname ...)" which is
>> ANSI-compatible and similar enough to what the current code in
>> Pathname.writeToString can produce. Let's use that to print
>> pathnames[1]. Reading them back is as simple as (eval
>> (read-from-string ...)), and no other code needs to be modified.
>>
>> What do you think?
>>
>> [1] if not always, at least with dump-form, and when *print-readably*
>> is T and the namestring can't be used. In the latter case currently
>> the #P(...) syntax is used. Btw, probably dump-form *should* bind
>> *print-readably* to T...
>
> Hmmm, I'm not sure I like it if your proposal is that a PATHNAME is always
> printed as #P"abcl:(make-pathname …)".  If you are just proposing that this
> be used when a namestring can't be produced, then I support this (weakly).

No, this is precisely the problem we have today. On Windows, for
#P"/foo/bar" a namestring *can* be produced, but it is "\foo\bar",
which then can't be read back in on non-Windows.
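
To make the failure concrete, here is a minimal sketch in Java. It is
not ABCL's actual Pathname code: the class SeparatorRoundTrip and its
print/parse helpers are made up for illustration, and they only model
the directory part of an absolute pathname.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a toy namestring printer/parser, not ABCL's Pathname code.
final class SeparatorRoundTrip {

    // Print an absolute directory list with the printing platform's separator.
    static String print(List<String> directory, char separator) {
        StringBuilder sb = new StringBuilder();
        for (String component : directory) {
            sb.append(separator).append(component);
        }
        return sb.toString();
    }

    // Split a namestring on the reading platform's separator,
    // ignoring empty components.
    static List<String> parse(String namestring, char separator) {
        List<String> components = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : namestring.toCharArray()) {
            if (c == separator) {
                if (current.length() > 0) {
                    components.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            components.add(current.toString());
        }
        return components;
    }

    public static void main(String[] args) {
        List<String> dir = Arrays.asList("foo", "bar");
        String printedOnWindows = print(dir, '\\');              // \foo\bar
        System.out.println(parse(printedOnWindows, '/'));        // [\foo\bar]
        System.out.println(parse(print(dir, '/'), '/'));         // [foo, bar]
    }
}
```

Read with "/" as the separator, the Windows-printed string comes back
as a single component named \foo\bar instead of the two directories
foo and bar.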

> I would
> claim that users are pretty cognitively wired at this point to expect a path
> to be a string containing directory separators, so we want to obey the
> "principle of least surprise" here.

In fact, I'm in doubt whether to always use the ugly form, or to use it
only when serializing to fasls and where today we would have used #P(...).

> Your proposal still doesn't say what ABCL on non-Windows should do with a
> deserialized PATHNAME that represents a UNC path or has a drive letter in
> DEVICE.

What could it possibly do but fail if you try to OPEN the pathname?
The problem is not Windows pathnames on non-Windows. Those cannot work
and it's the user's responsibility to use them only if she knows that
the software will only run on Windows. The problem is Unix pathnames
that, when printed under Windows, have their slashes converted to
backslashes.

> The current code (as I understand it) on non-Windows would treat a UNC
> pathname encoded as a string, e.g. "\\a\b\c\d", as an error, since '\c'
> doesn't represent a valid Java char escape sequence, although a case like
> "\\n\n\n\baz" would name the file with char sequence ('\' LF LF BS 'a' 'z').

That's the same under Windows. String escaping has nothing to do with pathnames.

> Counter-proposal #1: Note that java.io.File *does* correctly accept "/" as a
> directory separator under Windows.  So, we could potentially just declare
> "/" as our directory separator in the #P representation on all platforms.  I
> would then make UNC pathnames not have a printable namestring, so
> inadvertent doubling of path separators doesn't cause confusion.  For UNC
> and drive letters on non-Windows, I would signal a condition when a
> de-reference was attempted with a restart that tried to DWIM (ignore the UNC
> share, drop the drive letter reference, or allow a user supplied
> correction).

I wouldn't do this DWIM thing. A UNC or drive-letter pathname on
non-Windows is a user error, imho. As for always using / as the separator,
it would solve the problem with ASDF, but it wouldn't solve the other
problem: not all pathnames are currently printable by ABCL as
#P"...", so it sometimes uses #P(...) to list their components,
which is not ANSI-compatible.

> Counter-proposal #2: Use URI (IRI?) for the namestring representation. This
> fits better to the nature of ABCL pathnames i.e. they aren't really just
> about filesystems at this point.  '/' would again be the standard directory
> path separator.  UNCs would get their own scheme (or we would enforce RFC
> 3986 so that 'file://server/share/dir/file' means UNC whereas
> 'file:///not/a/server/but/absolute' means an absolute pathname).  Drive
> letters would be part of the URI authority ('file://c:/windows/path').  The
> same sort of DWIM condition/restarts for Windows-specific semantic on
> non-Windows would be available.  As a user convenience, we might make the
> 'file://' prefix be optionally inferred.

The problem with URIs is that they cannot represent all of CL's
pathname components (version?). We'd need to invent an encoding. My
proposal is simpler because an encoding is basically already there.
However, I'm not against using URIs with an appropriate encoding.
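
As a rough sketch of what "an appropriate encoding" could look like,
here is a small Java example using java.net.URI. The class
UriPathnameSketch and the "version" query parameter are my own
inventions for illustration, not anything ABCL implements; the
UNC-vs-absolute test just follows the RFC 3986 reading quoted above.

```java
import java.net.URI;

// Illustration only; the "version" query parameter is an invented encoding.
final class UriPathnameSketch {

    // A non-empty authority names a server, i.e. a UNC-style path;
    // "file:///..." has no authority and is a plain absolute path.
    static boolean looksLikeUnc(URI uri) {
        return uri.getAuthority() != null;
    }

    // Hypothetical encoding: carry the CL pathname version as "?version=N".
    // Returns null when the URI has no such query.
    static String version(URI uri) {
        String query = uri.getQuery();
        String prefix = "version=";
        if (query != null && query.startsWith(prefix)) {
            return query.substring(prefix.length());
        }
        return null;
    }

    public static void main(String[] args) {
        URI unc = URI.create("file://server/share/dir/file");
        URI absolute = URI.create("file:///not/a/server/but/absolute");
        URI versioned = URI.create("file:///foo/bar.lisp?version=3");

        System.out.println(looksLikeUnc(unc));       // true
        System.out.println(looksLikeUnc(absolute));  // false
        System.out.println(versioned.getPath());     // /foo/bar.lisp
        System.out.println(version(versioned));      // 3
    }
}
```

The point is only that the URI syntax itself has no slot for VERSION,
so something like the query string has to be pressed into service.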

Bye,
Alessio



