[OpenAFS-devel] Potential write-consistency race

David Howells dhowells@cambridge.redhat.com
Thu, 10 Oct 2002 11:35:56 +0100


Hi Derrick,

> Since I'm not parsing the dates usefully, did you propose this before or
> after:
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0210.0/1417.html

After.

> > Furthermore, a similar problem occurs if the client tries to "append" to a
> > file. The point at which it appends may no longer be the end of the file.
> 
> You mean client in this case as "user of the filesystem" and not "the part
> that talks to the server" correct?

I mean the bit that talks to the server. As long as it has to nominate a new
end of file, there's a potential race with another client doing the same
thing.

> IBM probably won't do it.

I suspect you might be correct.

> Also, it would be hard to work AppendData into our client without doing
> something like bypassing the cache on the write and then fetching the data
> on callback, so client-side at least we probably wouldn't do it, at least
> not soon.

Yeah... The append can be a little tricky. Though you could append to the
server and write onto the end of the cache. However, having looked at OpenAFS,
I'm not sure how you could do it. It's just that O_APPEND semantics can't be
supported properly without it (two clients both working with a file in
O_APPEND mode may actually cause data going to the server to overwrite each
other).
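
To make the race concrete, here is a rough toy model, not OpenAFS code. The
FileServer class, its method names and the append_data() call are all
invented for illustration; append_data() stands in for the sort of AppendData
operation being discussed, which the protocol does not currently have.

```python
class FileServer:
    """Toy stand-in for a fileserver holding a single file (hypothetical)."""

    def __init__(self):
        self.data = bytearray()

    def fetch_length(self):
        return len(self.data)

    def store_data(self, offset, payload):
        # The client nominates the offset; whatever is already there is
        # overwritten.
        end = offset + len(payload)
        if end > len(self.data):
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[offset:end] = payload

    def append_data(self, payload):
        # Hypothetical AppendData: the server picks the offset itself.
        offset = len(self.data)
        self.data.extend(payload)
        return offset


def racy_append():
    server = FileServer()
    server.store_data(0, b"base|")
    # Both clients sample the end of file before either one writes.
    eof_a = server.fetch_length()
    eof_b = server.fetch_length()
    server.store_data(eof_a, b"AAAA|")
    server.store_data(eof_b, b"BBBB|")  # lands on top of A's data
    return bytes(server.data)


def server_side_append():
    server = FileServer()
    server.store_data(0, b"base|")
    server.append_data(b"AAAA|")
    server.append_data(b"BBBB|")
    return bytes(server.data)


print(racy_append())         # b'base|BBBB|'
print(server_side_append())  # b'base|AAAA|BBBB|'
```

In the first case client A's record is lost because both clients nominated the
same end of file; in the second the server serialises the two appends.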

> so basically you want a way to bypass potential truncation, correct?

Yes.

> > I think these should be very easy to add to the fileservers (they just
> > have to overload the StoreData() operation), and a little harder to add
> > to the client.
> 
> Server-wise, I think I agree. Client, see above.

I suppose you're right. Though the insertion operation would be easier to make
use of in the client. The client could just use InsertData() instead of
StoreData() in normal write management (simple change) and the truncate call
could be modified to instruct the server to shorten the file in addition to
the things it currently does.
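
Roughly what I have in mind, as a toy model only. ToyFile and its methods are
made up for the example; insert_data() and truncate() stand in for the
proposed operations, and the assumption that a store carries the client's
idea of the file length is simplified from how StoreData behaves.

```python
class ToyFile:
    """Hypothetical server-side file, just enough to show the truncation."""

    def __init__(self, content=b""):
        self.data = bytearray(content)

    def _write(self, pos, payload):
        end = pos + len(payload)
        if end > len(self.data):
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[pos:end] = payload

    def store_data(self, pos, payload, file_length):
        # The caller nominates the file length, even if its view is stale.
        self._write(pos, payload)
        if file_length < len(self.data):
            del self.data[file_length:]
        else:
            self.data.extend(b"\0" * (file_length - len(self.data)))

    def insert_data(self, pos, payload):
        # Proposed operation: write the data, never shorten the file.
        self._write(pos, payload)

    def truncate(self, length):
        # Shortening becomes an explicit, separate request.
        del self.data[length:]


def stale_writer(use_insert):
    f = ToyFile(b"0123456789")
    stale_length = len(f.data)            # client A caches length 10
    f.insert_data(10, b"abcdefghij")      # client B extends the file to 20
    if use_insert:
        f.insert_data(0, b"XXXX")
    else:
        f.store_data(0, b"XXXX", stale_length)
    return bytes(f.data)


print(stale_writer(use_insert=False))  # b'XXXX456789'
print(stale_writer(use_insert=True))   # b'XXXX456789abcdefghij'
```

With the store, client A's stale length chops off what client B wrote; with
the insert, the file only gets shorter when someone actually asks for that.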

> Realize, though, that unless you're going forward with the "push writes as
> fast as you can" theory this happens not particularly frequently,

Yet it can happen. It seems that OpenAFS just ignores the fact (the AFS
protocol doesn't really allow it to do anything else).

However, you and Jan may well be correct in your assertion that it doesn't
happen often enough to worry about, and if the users realise this and don't
like it (maybe losing data), then tough.

Furthermore, the reason for pushing every write direct to the server was to
avoid three problems:

 (*) dealing with cached writes over a reboot.
 (*) dealing with unnotified access changes (group memberships changing).
 (*) dealing with unsent modifications in the cache being superseded.

I think, then, I have to take a policy of:

 (1) If a file is opened with O_SYNC (or if there's no cache at all), then
     every write to that file gets pushed direct to the server and written
     into the cache (if there is one).

 (2) If a file is opened without O_SYNC, writes get stuffed into the cache and
     written back only on close or sync, with the credentials of whoever
     does that operation. Making space in the cache for some other file can
     then be tricky since you can't send dirty pages to the server without
     a set of credentials to use.

I don't much like (2) as there's no way to tell whether a user has got write
permission short of trying StoreData().
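
A rough sketch of the two policies, to show where the permission failure
turns up. Everything here (ToyServer, ToyFileHandle, the user names) is
invented for illustration and is not how any real client is structured.

```python
class PermissionDenied(Exception):
    pass


class ToyServer:
    """Hypothetical server that only knows who may write."""

    def __init__(self, writers):
        self.writers = set(writers)
        self.stored = []

    def store_data(self, creds, payload):
        if creds not in self.writers:
            raise PermissionDenied(creds)
        self.stored.append(payload)


class ToyFileHandle:
    def __init__(self, server, sync, have_cache=True):
        self.server = server
        self.have_cache = have_cache
        # Policy (1) applies with O_SYNC or when there is no cache at all.
        self.write_through = sync or not have_cache
        self.cache = []
        self.dirty = []

    def write(self, creds, payload):
        if self.write_through:
            # Policy (1): straight to the server, then into the cache.
            self.server.store_data(creds, payload)
            if self.have_cache:
                self.cache.append(payload)
        else:
            # Policy (2): cache only; nobody has checked permissions yet.
            self.cache.append(payload)
            self.dirty.append(payload)

    def close(self, creds):
        # Write back with the credentials of whoever closes the file.
        while self.dirty:
            self.server.store_data(creds, self.dirty[0])
            self.dirty.pop(0)


def where_denied(sync):
    server = ToyServer(writers={"alice"})
    handle = ToyFileHandle(server, sync=sync)
    try:
        handle.write("bob", b"data")
    except PermissionDenied:
        return "write"
    try:
        handle.close("bob")
    except PermissionDenied:
        return "close"
    return None


print(where_denied(sync=True))   # write
print(where_denied(sync=False))  # close
```

Under (2) the write appears to succeed and the refusal only shows up at close
time, which is the bit I'm not keen on.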

> and I think Jan has explained well why at least at this point the world
> isn't ready for that.

I don't think he explained _why_ it's necessary. He just stated he thought
that that's what ought to be done (and that that's how it's done in Coda).

> I should also mention that perhaps linux-kernel isn't the best place to
> discuss AFS3 protocol. The best place, sadly, hasn't been well-advertised or
> particularly active. That would be afs3-protocol@stacken.kth.se (subscribe
> by mailing afs3-protocol-request@stacken.kth.se).

Okay. Is there an archive?

> I may have missed a reply, but I should also note that OpenAFS doesn't have
> any krb5 code in the kernel, nor real krb4; since I know one of the messages
> on linux-kernel implied or stated we did, I thought I should point that out.

I didn't say it did. Jan suggested that we might need Kerberos in the kernel
and I pointed out that OpenAFS, if it had any, would have a minimal amount.

The Linux NFSv4 person confirmed that NFSv4 is going to require Kerberos
support though.

David