[OpenAFS] Write-through/delayed-write support in AFS?

Mon, 18 Feb 2002 22:20:36 -0500

Sandeep Gopal Nijsure <nijsure@cs.unt.edu> writes:
...
> 2. What if the users of the network use write file operations in their
> programs, and forget to close the files before exiting? Just like the
> OS closes the local files left open by the application, does it close
> the AFS files?
...

In Unix (and Linux), the kernel automatically closes any descriptors
that are still open when the process exits.  This is necessary because
if this didn't happen, the resources claimed by descriptors would not
be reclaimed, and eventually the system would run out of file
descriptors.  This in turn means this is very reliable behavior; it is
unlikely you'd find anything like Unix/Linux that did not have this
behavior, and if you did, it's likely that whatever you found would be
undesirable for many other reasons.  File descriptors are actually the
easy case for AFS; the difficult cases are things like text segments,
mmap'd files, and core dumps.  These don't have file descriptors, and
different OS versions can handle them *very* differently.  Even here
AFS tries to guarantee the correct behavior, and a failure to properly
flush writes out to the file server in a timely and predictable
fashion would be regarded as a serious bug.

The main implication of "write upon close" in AFS is that file write
errors get reported differently.  On a local disk, write errors such as
running out of space can generally be detected during the "write"
call.  In AFS, such an error may not be detectable until "close".  So,
with AFS it is much more useful (and important) for an application to
check for and report errors from "close".  Whether a given application
actually does this can only be determined by experimentation, or less
accurately, by inspection of the code.  Note that there are oodles of
versions of "vi" at this point, all of which may behave differently.
I think Linux, in particular, has 4 or 5 distinctly different flavours
of "vi".  At first glance, "xfig" checks for errors when saving a
document, but not when printing.
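
As a rough illustration of what "reporting errors from close" looks
like in an application, here is a minimal sketch in Python.  The
function name save_checked and its return convention are made up for
this example; the only real calls are os.open, os.write and os.close.
The point is simply that the result of close is checked like any
other I/O call, since on AFS that may be where an out-of-space or
quota error first shows up.

```python
import os

def save_checked(path, data):
    """Write data (bytes) to path.

    Returns None on success, or a short error string saying which
    step failed.  (Hypothetical helper, for illustration only.)
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError as e:
        return "open failed: %s" % e.strerror

    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
    except OSError as e:
        try:
            os.close(fd)
        except OSError:
            pass  # already reporting the write error
        return "write failed: %s" % e.strerror

    try:
        # With write-on-close, this is where a store error may surface.
        os.close(fd)
    except OSError as e:
        return "close failed: %s" % e.strerror
    return None
```

A caller would treat any non-None result as "the file may not have
been saved" and tell the user, rather than assuming success because
the writes went through.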

On the other hand, even with a local file system, problems with the
physical disk media aren't necessarily reported to the application at
all (due to the delayed-write semantics of UFS).  Also, with both UFS
and AFS, I/O errors *are* usually reported to the user via uprintf or its
analogue, essentially a write to the user's terminal directly from the
kernel.  This is more useful for interactive sessions than it would be
for tasks run by daemons.  With AFS, it is also possible to use "fsync"
to flush the AFS disk buffers before close, and to detect errors this
way.  It is not necessarily obvious that there is anything useful that
can be done with the file descriptor if the flush fails.
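
For what it's worth, the fsync approach might look something like
the following sketch (Python again; write_and_sync is a made-up name,
and this assumes the flush error comes back from fsync as an ordinary
OSError).  In line with the caveat above, it makes no attempt to
recover if the flush fails: it just closes the descriptor and lets
the caller see the error.

```python
import os

def write_and_sync(path, data):
    """Write data (bytes) to path and flush it before closing.

    Raises OSError if any step fails.  (Hypothetical helper.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Ask for buffered data to be flushed now, so a failure is
        # seen here rather than only at close.
        os.fsync(fd)
    except OSError:
        # Nothing clever to do with the descriptor; release it and
        # pass the original error up.
        try:
            os.close(fd)
        except OSError:
            pass
        raise
    os.close(fd)
```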

One useful feature of AFS missing from UFS is the backup volume.
This permits users to recover yesterday's copy of files, which is
often useful in the case of application crashes and other surprises.
This, plus basic user education ('save files often', 'use rcs or cvs for
important files', etc.) should be sufficient for most users.

For automatic processes that update mission-critical files out in
AFS, it's worth thinking through the failure modes very carefully.
It may make sense to update a local copy of the file, copy it out to
AFS, close the file, and then compare (or checksum) the two to see
if the file made it out there intact.  In some cases, it may make
sense to use replicated volumes.  It may help to keep in mind that
AFS is just another network application; like any other application,
if you *need* commit/abort transaction semantics, you have to keep
a local redo file and be prepared to recover the transaction later
when the network or server comes back up.
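
The "copy it out, then checksum" idea could be sketched roughly as
below.  The names sha256_of and publish_and_verify are invented for
this example, and the choice of SHA-256 is arbitrary; any checksum
would do.  The paths are whatever the caller passes in (a local file
and its destination out in AFS); nothing here is AFS-specific.

```python
import hashlib
import shutil

def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def publish_and_verify(local_path, remote_path):
    """Copy local_path to remote_path, then compare checksums.

    Returns True if the two files have the same digest.  OSError
    from the copy (including an error raised when the destination
    is closed) is left for the caller to handle.
    """
    # copyfile opens, writes and closes the destination itself.
    shutil.copyfile(local_path, remote_path)
    return sha256_of(local_path) == sha256_of(remote_path)
```

A daemon using something like this would keep the local copy around
until the check returns True, and retry or raise an alarm otherwise.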

				-Marcus