[AFS3-std] Standardization of GetCapabilities RPCs for AFS3 client and services

Jeffrey Hutzelman jhutz@cmu.edu
Tue, 28 Feb 2006 16:31:26 -0500



On Tuesday, February 28, 2006 04:00:02 PM -0500 Ken Hornstein 
<kenh@cmf.nrl.navy.mil> wrote:

> - Implementing the automatic fallback at the RX layer is ... well,
>   complicated.  It just makes the code even more of a mess than it is
>   now, and presents a number of other problems.  Like ... should we be
>   doing a blocking connect() and wait for the complete TCP timeout?
>   Should we do something shorter?

That's a good question.  In a properly functioning network, a TCP timeout 
will likely mean we can't get traffic to/from that host at all, and we 
should just return a timeout to the application after a full TCP timeout. 
The realities of broken firewalls mean we probably need to try falling back 
to a UDP connection first.  And of course, this doesn't happen when the 
connection is created but when the application tries to make the first 
call, or possibly a later call.  So we may need to do this dance on each 
call, at least until the first time we hear something from the server.
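
To make the "something shorter" option concrete, here is a rough sketch in Python rather than RX's C. It is not RX code: probe_tcp, its outcome strings, and the 3-second default are all names and values I made up for illustration. It makes one TCP connect attempt with a bounded wait and reports whether we connected, were refused, or timed out, so the caller can decide whether to fall back to UDP for that call.

```python
import socket

def probe_tcp(host, port, timeout=3.0, connect=socket.create_connection):
    """Make one TCP connect attempt with a bounded wait (hypothetical helper).

    Returns (outcome, sock) where outcome is "connected", "refused" or
    "timeout"; sock is None unless outcome is "connected".  The `connect`
    parameter exists only so the logic can be exercised without a network.
    """
    try:
        sock = connect((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # ECONNREFUSED: nothing is listening for TCP on that port.
        return "refused", None
    except socket.timeout:
        # No answer within `timeout`: could be a dead host or a firewall
        # that silently drops TCP, so the caller may still want to try UDP.
        return "timeout", None
    # Any other OSError (e.g. no route to host) propagates to the caller.
    return "connected", sock
```

Since a timeout tells us nothing definite about the server, the caller would repeat this probe on later calls until it hears from the server one way or the other.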




>   If we want to completely rewrite RX
>   so multiple transports can be done more cleanly, then of course life
>   is simpler ...  but I don't see anyone stepping up to do that work,
>   and _I_ am certainly not volunteering.

Me neither, though it would help a lot.  We should at least try to think of 
the protocol spec in a modular way, though, so as to avoid imposing the 
broken non-modular architecture on people (David) doing new 
implementations.

> - If we get ECONNREFUSED, should we try UDP, or return an error message
>   back to the application?  I can think of different cases where each one
>   of those responses is the "right" answer.  What happens if we get a
>   timeout?  Should we revert to UDP always?  Some of the time?  Never?

If we get back ECONNREFUSED, we note that the server doesn't do TCP, and 
fall back to UDP.  If we get a timeout, see above - we can try falling back 
to UDP, or we can just return a timeout.

In any case, once we either successfully establish a TCP connection or 
determine that the server doesn't support TCP, we mark the connection as 
using a particular transport and never again try any other transport.
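
Roughly, the per-connection bookkeeping could look like the following Python sketch. The names (Transport, Connection, on_tcp_result, on_udp_reply, fallback_on_timeout) are hypothetical and none of this is existing RX code. Pinning UDP after the first UDP reply is just one reading of "until the first time we hear something from the server", so treat that part as an assumption.

```python
from enum import Enum

class Transport(Enum):
    UNDECIDED = "undecided"
    TCP = "tcp"
    UDP = "udp"

class Connection:
    """Tracks which transport a connection has been pinned to (sketch)."""

    def __init__(self, fallback_on_timeout=True):
        self.transport = Transport.UNDECIDED
        self.fallback_on_timeout = fallback_on_timeout

    def on_tcp_result(self, outcome):
        """Return the transport to use for this call.

        `outcome` is "connected", "refused" or "timeout".
        """
        if self.transport is not Transport.UNDECIDED:
            return self.transport            # already pinned: never retry
        if outcome == "connected":
            self.transport = Transport.TCP   # TCP works: pin it
            return self.transport
        if outcome == "refused":
            self.transport = Transport.UDP   # server doesn't do TCP: pin UDP
            return self.transport
        if outcome == "timeout":
            if self.fallback_on_timeout:
                return Transport.UDP         # UDP for this call, still undecided
            raise TimeoutError("TCP connect timed out")
        raise ValueError("unknown outcome: %r" % (outcome,))

    def on_udp_reply(self):
        """Assumption: the first UDP reply ends the probing and pins UDP."""
        if self.transport is Transport.UNDECIDED:
            self.transport = Transport.UDP
```

The point of the early return is that the decision is made once per connection; a timeout is the only outcome that leaves the question open for the next call.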

We could also provide an API that allows an application to specify use of a 
particular transport (with no fallback) or auto-detection using a fallback 
algorithm.  I'd expect most applications not to select TCP transport, and 
in fact the old APIs should default to UDP-only.
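
As a sketch of what such an API might look like (in Python for brevity; TransportPolicy, transports_to_try and new_connection are invented names, not a proposal for actual RX function names), the policy is fixed when the connection is created and the default matches what the old API would do:

```python
from enum import Enum

class TransportPolicy(Enum):
    UDP_ONLY = "udp-only"   # what the old API would imply
    TCP_ONLY = "tcp-only"   # TCP with no fallback
    AUTO = "auto"           # try TCP, fall back to UDP

def transports_to_try(policy):
    """Return the transports to attempt, in order, for a given policy."""
    if policy is TransportPolicy.UDP_ONLY:
        return ["udp"]
    if policy is TransportPolicy.TCP_ONLY:
        return ["tcp"]
    if policy is TransportPolicy.AUTO:
        return ["tcp", "udp"]
    raise ValueError("unknown policy: %r" % (policy,))

def new_connection(host, port, policy=TransportPolicy.UDP_ONLY):
    """Hypothetical constructor: callers that don't ask get UDP only."""
    return {
        "host": host,
        "port": port,
        "policy": policy,
        "try_order": transports_to_try(policy),
    }
```

An application that wants auto-detection would have to ask for it explicitly, e.g. new_connection(host, port, policy=TransportPolicy.AUTO); existing callers would see no change in behavior.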

-- Jeff