[OpenAFS-devel] The ihandle sync thing

Garrett Wollman wollman@csail.mit.edu
Thu, 28 Mar 2013 13:47:18 -0400


<<On Thu, 28 Mar 2013 09:49:45 -0700, Russ Allbery <rra@stanford.edu> said:

> At least on Linux, fsync tries to flush all the way to media with the
> assistance of the kernel, even to devices with write caches.

POSIX specifies the function of fsync():

	The fsync() function shall request that all data for the
	open file descriptor named by fildes is to be transferred to
	the storage device associated with the file described by
	fildes in an implementation-defined manner. The fsync()
	function shall not return until the system has completed that
	action or until an error is detected.

	If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function
	shall force all currently queued I/O operations associated
	with the file indicated by file descriptor fildes to the
	synchronized I/O completion state. All I/O operations shall be
	completed as defined for synchronized I/O file integrity
	completion.

However, the rationale has this to say:

	The fsync() function is intended to force a physical write
	of data from the buffer cache, and to assure that after a
	system crash or other failure that all data up to the time of
	the fsync() call is recorded on the disk. Since the concepts
	of ``buffer cache'', ``system crash'', ``physical write'', and
	``non-volatile storage'' are not defined here, the wording has
	to be more abstract.

	If _POSIX_SYNCHRONIZED_IO is not defined, the wording relies
	heavily on the conformance document to tell the user what can
	be expected from the system. It is explicitly intended that a
	null implementation is permitted. This could be valid in the
	case where the system cannot assure non-volatile storage under
	any circumstances or when the system is highly fault-tolerant
	and the functionality is not required. In the middle ground
	between these extremes, fsync() might or might not actually
	cause data to be written where it is safe from a power
	failure. The conformance document should identify at least
	that one configuration exists (and how to obtain that
	configuration) where this can be assured for at least some
	files that the user can select to use for critical data. It is
	not intended that an exhaustive list is required, but rather
	sufficient information is provided so that if critical data
	needs to be saved, the user can determine how the system is to
	be configured to allow the data to be written to non-volatile
	storage.

(Quoting from the now-obsolete IEEE Std. 1003.1-2001 text, but I do
not believe it has changed in the current version.)
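
Since the guarantees differ depending on whether _POSIX_SYNCHRONIZED_IO
is in effect, a program can at least ask which case it is in. What
follows is a minimal sketch, not anything taken from OpenAFS: the
function name have_synchronized_io() is made up for illustration, and
it assumes the usual POSIX convention that the macro is greater than
zero when the option is always supported, -1 when it never is, and
otherwise has to be queried at run time with sysconf().

```c
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the Synchronized Input and Output
 * option appears to be supported, 0 otherwise.
 */
int have_synchronized_io(void)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && (_POSIX_SYNCHRONIZED_IO - 0) > 0
    /* Compile-time promise: the option is always supported. */
    return 1;
#elif defined(_POSIX_SYNCHRONIZED_IO) && (_POSIX_SYNCHRONIZED_IO - 0) == -1
    /* Compile-time promise: the option is never supported. */
    return 0;
#else
    /* Undefined or zero: the answer is only known at run time. */
    return sysconf(_SC_SYNCHRONIZED_IO) > 0 ? 1 : 0;
#endif
}
```

Even a result of 1 only says which paragraph of the specification
applies; it does not by itself say that data reaches non-volatile
media.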

I don't think that even with _POSIX_SYNCHRONIZED_IO the requirements
are tight enough to be truly useful -- just in the past week there has
been a discussion in the Austin Group about whether the Standard
requires fsync() on a newly created file to ensure that the directory
entries for that file are also committed to storage.
(The definition of "synchronized I/O file integrity completion" can be
read both ways.)
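
Because the directory-entry question is open, a common defensive idiom
is to fsync() the containing directory as well as the file. The sketch
below shows that idiom under some assumptions: create_durably() is a
hypothetical name, the caller passes the directory path separately, and
the platform allows a directory to be opened read-only and passed to
fsync() (true on the systems I am aware of, but not something the text
quoted above promises).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to filepath, then fsync() the
 * file and the directory dirpath that contains it.
 * Returns 0 on success, -1 on any failure.
 */
int create_durably(const char *dirpath, const char *filepath,
                   const void *buf, size_t len)
{
    assert(dirpath != NULL && filepath != NULL);

    int fd = open(filepath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial or interrupted, so loop until done. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Request that the file's data and metadata be transferred. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* Separately request the same for the directory holding the entry. */
    int dfd = open(dirpath, O_RDONLY);
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    if (close(dfd) != 0)
        rc = -1;
    return rc;
}
```

A call such as create_durably("/tmp", "/tmp/example.dat", "hello\n", 6)
returns 0 when every step reported success. Per the rationale quoted
earlier, that still only means the requests completed in whatever
implementation-defined manner the system provides.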

-GAWollman