[armedbear-devel] #P serialization across Windows/non-Windows

Mark Evenson evenson at panix.com
Mon Jun 21 08:03:44 UTC 2010


On 6/18/10 1:00 PM, Alessio Stalla wrote:
> On Fri, Jun 18, 2010 at 12:37 PM, Mark Evenson<evenson at panix.com>  wrote:
>> On 6/17/10 7:38 PM, Alessio Stalla wrote:

[…]

>> Hmmm, not sure I totally like if your proposal is that a PATHNAME is always
>> in the #P"abcl:(make-pathname …)".  If you are just proposing this is used
>> when a namestring can't be produced, then I support this (weakly).
>
> No, this is precisely the problem we have today. On Windows, for
> #P"/foo/bar" a namestring *can* be produced, but it is "\foo\bar",
> which then can't be read back in on non-windows.

But since we control the production of the namestring, I am proposing we 
only ever consider a string with '/' as the directory path separator. 
'\' is banished.  We provide a convenience function to convert PATHNAME 
to a Windows "native" representation.
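
To make the idea concrete, here is a minimal sketch in Java. It is not
the code in Pathname.java; the class and method names
(PortableNamestring, namestring, toWindowsNative) are hypothetical and
exist only for this illustration. It assumes the namestring is built
from already-parsed directory components, so '/' is the only separator
ever emitted, and the Windows form is produced only on request.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not ABCL's actual Pathname code.
final class PortableNamestring {
    private PortableNamestring() {}

    /** Builds a namestring that uses '/' as the only directory separator. */
    static String namestring(boolean absolute, List<String> directory,
                             String name, String type) {
        StringBuilder sb = new StringBuilder();
        if (absolute) {
            sb.append('/');
        }
        for (String dir : directory) {
            sb.append(dir).append('/');
        }
        if (name != null) {
            sb.append(name);
        }
        if (type != null) {
            sb.append('.').append(type);
        }
        return sb.toString();
    }

    /** Convenience conversion to a Windows "native" string, on demand. */
    static String toWindowsNative(String namestring) {
        return namestring.replace('/', '\\');
    }

    public static void main(String[] args) {
        String portable = namestring(true, Arrays.asList("foo"), "bar", null);
        System.out.println(portable);                  // prints /foo/bar
        System.out.println(toWindowsNative(portable)); // prints \foo\bar
    }
}
```

The printed form is then identical on every platform, and only code
that hands a string to a Windows-specific API needs the conversion.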

[…]

>> Your proposal still doesn't say what ABCL on non-Windows should do with a
>> deserialized PATHNAME that represents a UNC path or has a drive letter in
>> DEVICE.
>
> What could it possibly do but fail if you try to OPEN the pathname?

It's a bit more subtle than that:  Pathname.java contains runtime 
conditionals for the Windows/non-Windows platform, meaning that 
constructing and parsing PATHNAMEs works differently on each platform.

I am a little worried about MERGE-PATHNAMES without a default 
DEVICE inserting the current drive letter for DEVICE under Windows. 
Maybe this isn't a big issue.


> The problem is not Windows pathnames on non-Windows. Those cannot work
> and it's the user's responsibility to use them only if she knows that
> the software will only run on Windows. The problem are Unix pathnames
> that, when printed under Windows, have their slashes converted to
> backslashes.
>
>> The current code (as I understand it) on non-Windows would treat a UNC
>> pathname encoded as a string , e.g. "\\a\b\c\d",  as an error as the '\c'
>> doesn't represent a valid Java char escape sequence, although a case like
>> "\\n\n\n\baz" would name the file with char sequence ('\' LF LF BS 'a' 'z').
>
> That's the same under Windows. String escaping has nothing to do with pathnames.

No, it isn't the same:  under Windows, Pathname.init(String) will 
convert this to a PATHNAME that uses HOST to store the UNC server and 
share name.  Under non-Windows, "\\a\b\c\d" is stored in the NAME 
field, so the eventual call to File(String) will have to contend with 
what looks to Java like backslash escapes.
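
A rough sketch of the Windows-side split, in Java.  This is not the 
logic of Pathname.init(String); the class UncSketch and its splitUnc 
method are hypothetical, and the sketch assumes the path is the 
character sequence \\a\b\c\d (written below as a Java literal, with 
each backslash doubled).  It shows only that the same characters can 
either be divided into server, share, and remainder, or be left whole.

```java
// Hypothetical illustration only; not ABCL's actual parsing code.
final class UncSketch {
    private UncSketch() {}

    /**
     * Returns {server, share, rest} when the path starts with two
     * backslashes followed by a non-empty server and share; otherwise
     * returns null, meaning the string is not treated as a UNC path.
     */
    static String[] splitUnc(String path) {
        if (!path.startsWith("\\\\")) {
            return null;
        }
        // The regex \\ matches one literal backslash; limit 3 keeps the
        // remainder of the path in a single piece.
        String[] parts = path.substring(2).split("\\\\", 3);
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            return null;
        }
        String rest = parts.length == 3 ? parts[2] : "";
        return new String[] { parts[0], parts[1], rest };
    }

    public static void main(String[] args) {
        String[] unc = splitUnc("\\\\a\\b\\c\\d");
        // prints server=a share=b rest=c\d
        System.out.println("server=" + unc[0] + " share=" + unc[1]
                           + " rest=" + unc[2]);
    }
}
```

A platform that never performs this split keeps the whole string, 
backslashes included, as a single component.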

>> Counter-proposal #1: Note that java.io.File *does* correctly accept "/" as a
>> directory separator under Windows.  So, we could potentially just declare
>> "/" as our directory separator in the #P representation on all platforms.  I
>> would then make UNC pathnames not have a printable namestring, so
>> inadvertent doubling of path separators doesn't cause confusion.  For UNC
>> and drive letters on non-Windows, I would signal a condition when a
>> de-reference was attempted with a restart that tried to DWIM (ignore the UNC
>> share, drop the drive letter reference, or allow a user supplied
>> correction).
>
> I wouldn't do this DWIM thing. A UNC or drive-letter pathname on
> non-windows is a user error, imho. As for always using / as separator,
> it would solve the problem with asdf, but it wouldn't solve the other
> problem that not all pathnames are currently printable by abcl as
> #P"..." and so it sometimes uses #P(...) to list their components,
> which is not ANSI-compatible.

It's a user error certainly, but if it is a correctable situation 
providing a restart is a pretty decent way to go.  Such a user error may 
easily be occurring in an ASDF package that would otherwise work fine 
under ABCL, which I would like to allow.

That not all pathnames are printable should be corrected in an 
ANSI-compatible manner as you describe.  This is orthogonal to the 
backslash/forward slash problems, right?  This apparently confused me 
until your reply here.

>> Counter-proposal #2: Use URI (IRI?) for the namestring representation. This
>> fits better to the nature of ABCL pathnames i.e. they aren't really just
>> about filesystems at this point.  '/' would again be the standard directory
>> path separator.  UNCs would get their own scheme (or we would enforce RFC
>> 3986 so that 'file://server/share/dir/file' means UNC whereas
>> 'file:///not/a/server/but/absolute' means an absolute pathname).  Drive
>> letters would be part of the URI authority ('file://c:/windows/path').  The
>> same sort of DWIM condition/restarts for Windows-specific semantic on
>> non-Windows would be available.  As a user convenience, we might make the
>> 'file://' prefix be optionally inferred.
>
> The problem with URIs is that they cannot represent all of CL's
> pathname components (version?). We'd need to invent an encoding. My
> proposal is simpler because an encoding is basically already there.
> However, I'm not against using URIs with an appropriate encoding.

I amend my second proposal to "all Pathnames that have a straightforward 
URI representation use it; others use the ANSI compatible mechanism 
proposed by Alessio".
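
For the two 'file:' forms from proposal #2 quoted above, 
java.net.URI already separates the authority from the path, so the 
UNC/absolute distinction can be read off directly.  The sketch below 
is only an illustration under that assumption; the class 
FileUriSketch and its describe method are hypothetical names, not 
anything in ABCL.

```java
import java.net.URI;

// Hypothetical illustration only; not ABCL code.
final class FileUriSketch {
    private FileUriSketch() {}

    /** Classifies a file: URI as a UNC-style or plain absolute path. */
    static String describe(String uriText) {
        URI uri = URI.create(uriText);
        String authority = uri.getAuthority();
        if (authority == null || authority.isEmpty()) {
            // 'file:///path' has no authority: a plain absolute path.
            return "absolute " + uri.getPath();
        }
        // 'file://server/share/...' carries the server as the authority.
        return "unc //" + authority + uri.getPath();
    }

    public static void main(String[] args) {
        // prints unc //server/share/dir/file
        System.out.println(describe("file://server/share/dir/file"));
        // prints absolute /not/a/server/but/absolute
        System.out.println(describe("file:///not/a/server/but/absolute"));
    }
}
```

Pathname components with no URI counterpart, such as VERSION, are 
outside what this shows; those would fall back to the other mechanism.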

But don't consider #2 for right now:  think about #1:  "no use of '\' as 
a directory separator in ABCL namestrings".

-- 
"A screaming comes across the sky.  It has happened before, but there
is nothing to compare to it now."



