[OpenAFS-devel] N800, Mobile AFS, and changing IP addresses

Jeffrey Hutzelman jhutz@cmu.edu
Mon, 28 Apr 2008 18:23:48 -0400


--On Sunday, April 27, 2008 09:45:34 AM -0400 Jeffrey Altman 
<jaltman@secure-endpoints.com> wrote:

> Simon Wilkinson wrote:


>> Tune the AFS timeouts so that the user gets failure messages more
>> quickly?

If by "tune the AFS timeouts" you mean "make them smaller", don't do that.
It is tempting to lower timeouts to reduce the time you have to wait when
something isn't working, on the assumption that when it is working you'll
get a response quickly, because it seems that way every time you look.  But
most people haven't seen a wide enough variety of networks to make that
generalization; a "short" timeout that works for you may not work so well
for someone with a high-latency path or a heavily-loaded server.  Setting
timeouts even a bit too short means both more retransmissions and more
frequent ones, which can make a bad situation (heavy congestion) worse.
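
To put rough numbers on the retransmission point, here is a toy model in
Python.  It assumes a fixed retransmit interval and a single request whose
reply arrives after one round trip; that is a simplification for
illustration, not how Rx actually schedules retransmissions, and the
function name is made up.

```python
def transmissions(rtt_ms: int, timeout_ms: int) -> int:
    """Count copies of one request sent before its reply arrives.

    Toy model (an assumption, not Rx behaviour): the sender transmits at
    t = 0 and again every timeout_ms until the reply shows up at rtt_ms.
    """
    if rtt_ms <= 0 or timeout_ms <= 0:
        raise ValueError("rtt_ms and timeout_ms must be positive")
    # Sends happen at 0, timeout_ms, 2*timeout_ms, ... strictly before
    # rtt_ms, which works out to ceil(rtt_ms / timeout_ms) sends.
    return -(-rtt_ms // timeout_ms)


# Same 1-second timeout, two different paths.
fast_path = transmissions(rtt_ms=50, timeout_ms=1000)
slow_path = transmissions(rtt_ms=2500, timeout_ms=1000)
print(fast_path, slow_path)
```

On the 50 ms path the request goes out once.  On the 2.5 second path the
same timeout puts three copies on the wire, and the two extra ones land on
a path or server that was already slow.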


> Performing the check at the time the network configuration changed will
> ensure that the cache manager knows as soon as possible whether or not
> the desired servers are in fact accessible.   If they are not, the cache
> manager can fail requests immediately.

Performing the check at the time the network configuration changes will
guarantee that the cache manager talks to a large number of servers
(normally, every server it has held a reference to since starting), many
of which the user may not care about.

It seems the simplest thing to do here is to reset the state of all servers
to "up" whenever an interface is brought up.  This gives a fast response
from a server that really is up, at the expense of one extra timeout in
cases where the user touches a server that is down before the CM gets
around to its next 5-minute check.
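
A rough sketch of the difference between the two approaches, in Python.
The ServerTable class and its method names are hypothetical and exist only
to illustrate the trade-off; they are not the cache manager's actual data
structures or code.

```python
from dataclasses import dataclass, field


@dataclass
class ServerTable:
    # Maps server address -> True if the server is believed to be up.
    # Hypothetical structure, not the cache manager's real one.
    up: dict = field(default_factory=dict)

    def mark_down(self, addr: str) -> None:
        self.up[addr] = False

    def on_interface_up(self) -> int:
        """Optimistically mark every known server up; send no probes."""
        for addr in self.up:
            self.up[addr] = True
        return 0  # probe packets sent

    def probe_all(self, reachable: set) -> int:
        """Alternative: contact every known server right away."""
        for addr in self.up:
            self.up[addr] = addr in reachable
        return len(self.up)  # one probe per known server


table = ServerTable(up={"fs1": True, "fs2": True, "fs3": True})
table.mark_down("fs1")
table.mark_down("fs2")
probes_after_reset = table.on_interface_up()
probes_after_check = table.probe_all(reachable={"fs1", "fs3"})
```

The reset costs no network traffic at all, and a server that is really
down is only discovered when someone uses it or when the periodic check
runs.  The immediate check costs one probe per known server, whether or
not the user ever touches that server again.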

-- Jeff