[OpenAFS] dot path times out after root.cell was moved to a different
fileserver
Marc Schmitt
mschmitt@inf.ethz.ch
Tue, 04 Nov 2003 15:08:56 +0100
Hi Horst,
Horst Birthelmer wrote:
>Hi,
>
>AFAIK you cannot remove the AFS kernel module on Linux. it always crashed
>on my machines.
>
Hmm, I have not seen that problem for a long time. It used to happen
regularly about two years ago, when OpenAFS was just born.
>
>I think your problem is some inconsistency of the VLDB.
>
We checked again; the VLDB appears to be consistent.
I've just received a call from one of our AFS admins. He had found a
Solaris machine running OpenAFS that showed the same problem I'm
experiencing. He suggested issuing 'fs checkvol' on the clients, and it
worked: I could access the dot path after that.
Looks like the following happens:
- client accessed dot path before Sunday -> client caches the volume's
location as fileserver X
- root.cell was moved to Y on Sunday
- clients that had accessed root.cell before Sunday time out on dot path
after Sunday
- clients that have not accessed root.cell since they were booted can
access root.cell, because they never learned that X had been its fileserver
- 'fs checkvol' helps clients that are stuck to get access to the dot
path again
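The sequence above can be written down as a toy model in Python. This is only a sketch of my guess at what happens, not how the OpenAFS cache manager is actually implemented; the class and method names (Vldb, Client, checkvol, simulate) are made up for illustration, and the effect attributed to 'fs checkvol' is an assumption.

```python
class Vldb:
    """Toy volume location database: volume name -> current fileserver."""

    def __init__(self):
        self.location = {}

    def move(self, volume, server):
        # Stands in for moving a volume to another fileserver.
        self.location[volume] = server


class Client:
    """Toy client that remembers where it last found each volume."""

    def __init__(self, vldb):
        self.vldb = vldb
        self.cached = {}

    def access(self, volume):
        # Only ask the VLDB when nothing is cached for this volume.
        if volume not in self.cached:
            self.cached[volume] = self.vldb.location[volume]
        # A stale entry points at a server that no longer has the volume.
        if self.cached[volume] == self.vldb.location[volume]:
            return "ok"
        return "timeout"

    def checkvol(self):
        # Assumed effect of 'fs checkvol': forget cached volume locations.
        self.cached.clear()


def simulate():
    vldb = Vldb()
    vldb.move("root.cell", "X")
    old = Client(vldb)
    results = [old.access("root.cell")]        # before Sunday
    vldb.move("root.cell", "Y")                # Sunday: volume moved
    results.append(old.access("root.cell"))    # client with stale cache
    fresh = Client(vldb)
    results.append(fresh.access("root.cell"))  # freshly booted client
    old.checkvol()
    results.append(old.access("root.cell"))    # after 'fs checkvol'
    return results
```

Calling simulate() walks through the four cases in the same order as the list: access before the move, the stuck client, a freshly booted client, and the stuck client after 'fs checkvol'.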
What I don't understand is why this didn't work "out of the box". Moving
volumes from one server to another is not really an atypical AFS
operation. :)
Or is root.cell special in that sense?
Marc