[AFS3-std] Re: [OpenAFS-devel] convergence of RxOSD, Extended Call Backs, Byte Range Locking, etc.

Wed, 29 Jul 2009 10:18:26 +0200

Jeffrey Altman wrote:
> Hartmut Reuter wrote:
> 
>> Jeffrey:
>> Is that the kind of specification you are looking for? Or what else?
> 
> I would like a specification of the entire protocol such that it is
> sufficient for me to be able to implement RXOSD support based solely on
> the specification.  The specification must include not only the protocol
> messages but also a description of the required semantics.  Only with
> such a specification will it be possible for independent developers to
> implement RXOSD.  More importantly, only with such a specification is it
> possible for analysis to be done of the protocol to identify areas where
> it alters the existing AFS protocol semantics, or conflicts with other
> work that is in progress.

Can you give me a link to the specification for the current protocol,
so that I can see what such a specification should look like?

> 
> The concerns that Tom Keiser raised are serious.  One of them will be
> addressed via the use of Start/Extend/End messages for Fetch and Store
> operations.  However, there are still open questions.  What is the
> behavior when a client dies after sending a Start Store?  Is there a
> rollback mechanism that the file server must use to restore the
> pre-existing data version?  Does the file server assume that the data
> was written and increment the data version?  These are decisions that
> will need to be made and I'm sure that there is not going to be
> consensus on what the answer should be.

Also, in a normal store from the CM to the fileserver, either the client
can die or the network can go down during the RPC. The fileserver then
reports the error code -32 and does a stat on the file to find out the
actual length to put into the vnode. However, a store is not a
transaction that you can undo by restoring the previous data version,
because the old data may already have been partially overwritten.
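
To illustrate why no rollback is possible, here is a toy model (not
fileserver code): the file is a plain bytearray, the store writes in
place, and it is cut off after a few bytes. The function name and the
in-memory "file" are made up for this example, and len() merely stands
in for the stat.

```python
def interrupted_store(file_data, offset, new_data, bytes_before_failure):
    """Write new_data in place at offset, stopping early to mimic a dead client."""
    written = new_data[:bytes_before_failure]
    end = offset + len(written)
    if end > len(file_data):
        # Grow the file if the partial write went past the old end.
        file_data.extend(b"\0" * (end - len(file_data)))
    file_data[offset:end] = written
    return len(written)


old_contents = bytearray(b"AAAAAAAA")
file_data = bytearray(old_contents)

# The client wanted to store 12 bytes but died after 3 of them.
interrupted_store(file_data, 0, b"BBBBBBBBBBBB", 3)

# What remains is a mix of old and new data. The length can be
# measured afterwards, but the overwritten bytes cannot be restored.
length_for_vnode = len(file_data)
```

After the interrupted store the file holds "BBBAAAAA": neither the old
nor the intended new contents.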

The same is true with OSD. If the client dies, the fileserver will
notice after a while that the EndAsyncStore RPC is missing and has to
decide on its own what to do. It could ask all OSDs that hold objects of
the file in order to calculate the actual length of the file and store
this length in the vnode. This is also what "vos salvage" does to verify
the existence, link counts and correct size of files. Of course, the
fileserver has to increment the data version.
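
A minimal sketch of that recovery step, only to make the idea concrete.
All names here (Vnode, OsdObject, recover_after_missing_end_store) are
hypothetical and not taken from the RXOSD code. It also assumes the
simplest layout, where each object covers one contiguous range of the
file; with striping the length would have to be computed differently.

```python
from dataclasses import dataclass


@dataclass
class Vnode:
    length: int
    data_version: int


@dataclass
class OsdObject:
    file_offset: int  # assumed: where this object's data starts in the file
    size: int         # object size as reported by the OSD


def recover_after_missing_end_store(vnode, reported_objects):
    """Set the vnode length from what the OSDs report and bump the data version."""
    # Assumption: the file ends where the furthest-reaching object ends.
    vnode.length = max(
        (obj.file_offset + obj.size for obj in reported_objects), default=0
    )
    # The data may have changed, so the data version is always incremented.
    vnode.data_version += 1
    return vnode


# Example: the vnode still claims 100 bytes, but the OSDs hold only 94.
vnode = Vnode(length=100, data_version=7)
objects = [OsdObject(file_offset=0, size=64), OsdObject(file_offset=64, size=30)]
recover_after_missing_end_store(vnode, objects)
```

In this example the vnode ends up with length 94 and data version 8.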

If the problem was not the client but an OSD that went down, the client
can inform the fileserver about the actual length of the file it was
still able to store.

But anyway, in both cases the new data version of the file is not
consistent from the point of view of the application.

> 
> Also, any discussion regarding revisions to the afs3 protocol must take
> place on afs3-standardization@openafs.org.  The
> openafs-devel@openafs.org list is not read by the Arla and kAFS
> developers.  You are implementing RXOSD for OpenAFS Unix CMs and
> servers.  However, we must be inclusive of all AFS implementors in any
> protocol discussions.

That's right.

Thank you,
Hartmut
> 
> Thank you.
> 
> Jeffrey Altman

-- 
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------