[OpenAFS] connection timeouts

Juha Jäykkä juolja@utu.fi
Mon, 15 May 2006 18:11:16 +0300


I just hit a rather strange phenomenon on an OpenAFS client (Linux, version
1.3.81). Everything worked fine for a couple of months, then suddenly:

kernel: afs: Lost contact with file server in cell
tfy.utu.fi (all multi-homed ip addresses down for the server)

Usually, this corrects itself soon enough with a corresponding "is back
up" message. This time it never did, no matter what I did. The client
kept treating the server as down until I restarted the whole client.

The network was definitely never broken: there are no firewalls in
between, and no other client in the cell thought the server was lost.

I cannot reproduce this, but it is still somewhat disturbing: our $HOMEs
are on AFS, and if a client simply loses the connection to the
fileserver serving $HOME, the users of the machine are unable to get
much of anything done, let alone save their work. Restarting the client
requires closing all files open under /afs, which destroys any unsaved
changes the user has made. That is not desirable.

Any ideas on how to a) reproduce this, b) prevent it from happening again,
or c) fix it the next time it happens without restarting the client?


                | Juha Jäykkä, juolja@utu.fi                    |
                | Laboratory of Theoretical Physics             |
                | Department of Physics, University of Turku    |
                | home: http://www.utu.fi/~juolja/              |
