[Ecls-list] open :supersede and rename-file advise

Richard M Kreuter kreuter at progn.net
Tue Nov 27 01:36:32 UTC 2007


"Geo Carncross" writes:
> 
> Under unix, robust applications tend to write to new files in the
> following way:
> 
>   fd = open(tempfile, O_WRONLY|O_EXCL|O_CREAT, 0600);
>   ... write to fd ...
>   if (fsync(fd) == -1) goto fail;
>   if (close(fd) == -1) goto fail; /* NFS can fail here */
>   if (rename(tempfile, realfile) == -1) goto fail;
> 
> This has the benefit of having no points where "realfile" has
> incomplete or invalid data.
> If I want to generate an error if "realfile" exists while still having
> that benefit, I do the rename step as:
> 
>   if (link(tempfile, realfile) == -1) goto fail;
>   (void)unlink(tempfile);
>   if (stat(realfile,&sb) == -1) goto fail;
>   if (sb.st_size != expected_size) goto fail;
> 
> ... in checking the CLHS it seems that (open :supersede) is allowed to have
> these semantics.

(I recently worked through this issue when looking at whether it would
be possible to add write-beside opening in SBCL.  I think it's a case
where any choice is going to be problematic for somebody, unfortunately.
I offer this analysis in the hope that it's useful for discussion.)

So, <putting on my language lawyer hat>, several OPEN :IF-EXISTS actions
are permitted to have semantics along these lines.  Indeed, a strict
reading of the dictionary entry for OPEN might entail that :NEW-VERSION,
:RENAME, :RENAME-AND-DELETE, and :SUPERSEDE are required /not/ to open
the existing file, if you interpret the language "a new file is created"
in the dictionary entry for OPEN to mean that a file with a distinct
inode must be allocated on Unix file systems. (Note that :OVERWRITE and
:APPEND have uncontroversial semantics, since they expressly
destructively modify the existing file.)

But <taking off my language lawyer hat> nobody implements those actions
that way: this interpretation leaves users without any standard
:IF-EXISTS action that provides Unixy "opening for writing truncates
the existing file" semantics.  And since there's no standard function
for truncating files or file streams, you can't get this behavior
portably by combining some other :IF-EXISTS action with another operator.

Now, <putting the hat back on> insofar as Common Lisp was designed to be
a language for writing portable applications, that's not a big loss:
portable applications shouldn't rely on concrete file system semantics.

But <taking the hat off> Common Lisp implementations might also have
goals such as helping users familiar with their platform to write
programs that interoperate well with other programs on the platform.  In
particular, there are a couple of things that break when programs open
"write beside" files in this way:

[a] rename(2) breaks hard links on Unix.  Hard links aren't terribly
    important nowadays, but if interoperability with other programs on
    the platform is a goal, you want there to be some :IF-EXISTS action
    that maps straightforwardly to a mode where open(2) destroys file
    contents, but preserves file identity.

[b] Having an OPEN that opens a determinate file allows one program to
    monitor the output of another program while the writer is writing,
    without the monitoring program knowing the dynamic file name
    generated by the writing program.  For instance, if you know that
    program A generates output file "foo", you can tail(1) "foo"; but if
    program A generates a randomly selected output file, then it becomes
    arbitrarily hard to monitor A's output before A closes the stream.

(Note that hard links and concurrent file readers are both, in
principle, platform-specific features, so portable applications
shouldn't really rely on them.)
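
Point [a] can be checked with a small C sketch, again assuming a POSIX
system whose file system supports hard links, with all three paths on
the same file system.  The function and helper names are hypothetical.
It creates a file, gives it a second name with link(2), updates the
file either way, and then asks whether the two names still refer to the
same file:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open PATH with FLAGS and write TEXT to it; 0 on success, -1 on error. */
static int put(const char *path, int flags, const char *text)
{
    int fd = open(path, flags, 0600);
    if (fd == -1) return -1;
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    if (close(fd) == -1) return -1;
    return n == (ssize_t)len ? 0 : -1;
}

/* 1 if A and B name the same file (device and inode), 0 if not, -1 on error. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) == -1 || stat(b, &sb) == -1) return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Create PATH, hard-link it as ALIAS, then update PATH either by
   write-beside plus rename (WRITE_BESIDE != 0) or by truncating open.
   Returns 1 if ALIAS still names the same file as PATH afterwards,
   0 if the link was broken, -1 on error. */
int link_survives_update(const char *path, const char *alias,
                         const char *tmp, int write_beside)
{
    unlink(path); unlink(alias); unlink(tmp);  /* start from a clean slate */
    if (put(path, O_WRONLY|O_CREAT|O_EXCL, "old\n") == -1) return -1;
    if (link(path, alias) == -1) return -1;
    if (write_beside) {
        if (put(tmp, O_WRONLY|O_CREAT|O_EXCL, "new\n") == -1) return -1;
        if (rename(tmp, path) == -1) return -1;
    } else {
        if (put(path, O_WRONLY|O_TRUNC, "new\n") == -1) return -1;
    }
    int same = same_file(path, alias);
    unlink(path); unlink(alias);
    return same;
}
```

With write_beside set, I'd expect 0: the alias keeps pointing at the old
inode (and the old contents) while the original name now refers to the
renamed temporary.  With the truncating open, I'd expect 1.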

However, <putting on my Lisp user hat> users might reasonably want CL
implementations to be upwardly compatible with themselves.  In the
present case, if an implementation has historically provided :SUPERSEDE
as Unix's truncating open, then programmers who use that implementation
might understandably (albeit non-portably) take advantage of
platform-dependent features such as hard links and concurrent file
readers.

Finally, there's the fact that most implementations implement (OPEN
... :IF-EXISTS :SUPERSEDE) as a truncating open.  When implementations
agree on a detail, users who want their programs to run on many Lisps
have fewer gotchas (though in the case of :SUPERSEDE, applications that
rely on the details here are only portable by accident).  Convergence
among implementors, even when they're all wrong, is worth something.

So let's take stock of what seems to influence implementation choices:

(1) the apparent meaning of the ANSI CL standard,

(2) the desires of some users to be able to take advantage of
    platform-specific file system features (such as hard links and
    concurrent readers),

(3) the desires of some users for the implementation to provide
    high-level abstractions and convenient features (such as atomic file
    commits),

(4) the desire of some users for their existing systems to run with
    minimal modification on future versions of the implementations where
    they already run,

(5) the desire of some users for their existing systems to run
    with minimal modification on implementations where they may not yet
    run,

(6) the desires of some implementors to keep their implementations
    simple, maintainable and/or debugged.

How do you rank these sets of desires?  Assuming you were starting a new
Common Lisp implementation, it seems to me that implementing :SUPERSEDE
as Unix's truncating-open places (2), (5) and (6) above (3) and (1).
But given an implementation that already implements :SUPERSEDE as
truncating-open, changing :SUPERSEDE to a write-beside-open puts (3) and
(1) above (2), (4), (5) and (6).  So I think the conclusions you come to
depend mostly on how you prioritize things.

(I've tried so far to keep my opinion out of this analysis, but I
personally believe that the phrase "a new file is created" in the
dictionary entry for OPEN should be read to mean that implementations
that destructively modify the existing file are in fact non-conforming
on this point.  At any rate, those implementations that do the obvious
things for :RENAME and :RENAME-AND-DELETE are not consistent with
themselves in creating a new file in those cases, but not for :SUPERSEDE
or :NEW-VERSION.  IMO the truly Right Thing is for implementors to
add a new :IF-EXISTS action, :TRUNCATE, which, if given, means to open
the existing file and truncate it, and to have :SUPERSEDE, :NEW-VERSION,
:RENAME, :RENAME-AND-DELETE all create a new file in some manner; I'd
prefer write-beside openings with rename-at-close-time behavior for all
those cases, too.  But I recognize that any implementation that changes
its implementation of the :IF-EXISTS actions will break compatibility
with itself, which is likely to frustrate its users, so someone ends up
losing in any case.)
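
For what it's worth, "write-beside opening with rename-at-close-time"
could look roughly like the following at the C level.  This is only a
sketch under assumptions: the names wb_file, wb_open and wb_commit are
hypothetical, the temporary name is derived naively from the process
ID, and no Lisp implementation is claimed to work this way.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for a write-beside stream. */
struct wb_file {
    int  fd;          /* descriptor open on the temporary file */
    char tmp[4096];   /* temporary name, in the same directory as dest */
    char dest[4096];  /* name the data should finally appear under */
};

/* Open a fresh temporary beside DEST; 0 on success, -1 on error.
   DEST itself is not touched until wb_commit. */
int wb_open(struct wb_file *w, const char *dest)
{
    int n = snprintf(w->dest, sizeof w->dest, "%s", dest);
    if (n < 0 || (size_t)n >= sizeof w->dest) return -1;
    n = snprintf(w->tmp, sizeof w->tmp, "%s.tmp.%ld", dest, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof w->tmp) return -1;
    w->fd = open(w->tmp, O_WRONLY|O_CREAT|O_EXCL, 0600);
    return w->fd == -1 ? -1 : 0;
}

/* Flush, close, and atomically rename the temporary over DEST.
   On any failure the temporary is removed and DEST is left as it was. */
int wb_commit(struct wb_file *w)
{
    int ok = (fsync(w->fd) == 0);
    if (close(w->fd) == -1) ok = 0;   /* NFS can report errors here */
    if (ok && rename(w->tmp, w->dest) == 0) return 0;
    unlink(w->tmp);
    return -1;
}
```

A caller would write to w.fd between wb_open and wb_commit; readers of
the destination name see the old contents until the rename happens.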

Regards,
RmK



