[OpenAFS-devel] Suboptimal behavior of dynroot and (local) network outages

Derek Atkins warlord@MIT.EDU
30 Jul 2002 11:15:17 -0400


When running AFS with a dynroot, the timeout behavior of AFS is
suboptimal in the face of complete network lossage.  This is
particularly true if you have multiple AFS cells in your PATH.

Without dynroot you only needed to time out /afs.  Now you have to
time out every cell individually.  This makes my laptop behave much
worse with -dynroot than it did without it: -dynroot lets me boot
without a network, but then hangs the system MUCH longer if the
network goes away. :(

AFS needs a way to notice that there is no local network and time out
quickly.  On Linux (at least), we should be able to use ICMP Host
Unreachable messages to time out servers _quickly_.  If we receive N
ICMP messages in a row for a server (with no real reply from it in
between), then we mark that server as timed out and move on.  This will
speed up timeouts in the face of a real connectivity loss.  It will not
help when packets are silently dropped (e.g. congestion), but that's
ok -- you don't want to time out quickly in those situations.
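
As a rough sketch of the N-in-a-row rule (in Python rather than the
cache manager's C, just to show the logic; the class name, the method
names, and the default threshold of 3 are made up for illustration and
are not existing OpenAFS code):

```python
class ServerHealth:
    """Per-server counter of consecutive ICMP unreachable reports.

    Hypothetical helper: nothing here is an existing OpenAFS interface.
    """

    def __init__(self, threshold=3):
        # 'threshold' is the N from the text; 3 is an arbitrary choice.
        self.threshold = threshold
        self.consecutive_unreachable = 0
        self.down = False

    def on_icmp_unreachable(self):
        """Record one ICMP Host Unreachable; return True once the server is down."""
        self.consecutive_unreachable += 1
        if self.consecutive_unreachable >= self.threshold:
            self.down = True
        return self.down

    def on_reply(self):
        """A real packet from the server breaks the streak and revives it."""
        self.consecutive_unreachable = 0
        self.down = False
```

A silently dropped packet produces neither event, so the counter does
not move and the normal (slow) timeout still applies in the congestion
case.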

Suggestions?  Comments?

-derek
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available