[OpenAFS] dot path times out after root.cell was moved to a different fileserver
Marc Schmitt
mschmitt@inf.ethz.ch
Tue, 04 Nov 2003 14:19:23 +0100
Hi all,
We're having problems with our Linux clients after the root.cell volume
was moved to a different fileserver and, at the same time, its former
fileserver became a DB server only.
Last Sunday, the root.cell volume was moved from server X to server Y.
Then server X was removed from the list of fileservers and is now a DB
server only. Today, I happened to access the dot path on several Linux
clients (RedHat 7.3, 2.4.20-20.7, OpenAFS 1.2.10) and see:
ls: /afs/.ethz.ch: Connection timed out
On the console of the clients, I then get:
afs: Lost contact with file server X in cell ethz.ch (all multi-homed ip
addresses down for the server)
Somehow, the Linux clients still expect root.cell to be on fileserver X,
which looks like a client bug. Doing a reboot of the clients solves the
problem, but... Interestingly, I do not see this problem on our Solaris
machines (running OpenAFS and TransarcAFS).
Is there a way to "purge" X from the clients' fileserver list w/o having
to reboot them? I tried to restart afsd, but the kernel module
appeared to be busy and the service could not be stopped (maybe because
it waits for X to come back?).
TIA
Marc