[OpenAFS] Problems on AFS Unix clients after AFS fileserver moves
Dexter 'Kim' Kimball
dhk@ccre.com
Tue, 9 Aug 2005 14:59:02 -0600
fs checkv (short for fs checkvolumes) will cause the client to discard what
it remembers about volumes. Did you try that?
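As a rough sketch of that suggestion, run on the affected client: the wrapper below refreshes the volume information and then re-probes the fileservers. Both fs subcommands are standard; the function name refresh_volume_info is made up for illustration, and this is not a guaranteed fix.

```shell
# Sketch only: ask the cache manager to discard and re-fetch its volume
# information, then re-probe the fileservers it knows about.
# Assumes the `fs` command is on PATH; the function name is hypothetical.
refresh_volume_info() {
    fs checkvolumes || return 1   # drop cached volume name/ID/location data
    fs checkservers               # list servers still marked unavailable
}
```

Calling refresh_volume_info and then retrying access to the affected files would show whether stale volume locations were the cause.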
Kim
-----Original Message-----
From: openafs-info-admin@openafs.org
[mailto:openafs-info-admin@openafs.org] On Behalf Of Rich Sudlow
Sent: Tuesday, August 09, 2005 9:58 AM
To: openafs
Subject: [OpenAFS] Problems on AFS Unix clients after AFS fileserver moves
We've been having problems in our cell for the last couple of
years with AFS clients after fileservers are taken out of service.
Before that, things seemed to work fine when moving and rebuilding
fileservers. All data has been moved off the fileserver, but the
clients still seem to need to talk to it. In the past the AFS
admins have left the fileservers up and empty for a number of
days to try to resolve this issue, but that doesn't help.
A recent example:
The fileserver reno.helios.nd.edu was shut down after all data
was moved off of it. However, the client still can't get to
a number of AFS files.
[root@xeon109 root]# fs checkservers
These servers unavailable due to network or server problems:
reno.helios.nd.edu.
[root@xeon109 root]# cmdebug reno.helios.nd.edu -long
cmdebug: error checking locks: server or network not responding
cmdebug: failed to get cache entry 0 (server or network not responding)
[root@xeon109 root]# cmdebug reno.helios.nd.edu
cmdebug: error checking locks: server or network not responding
cmdebug: failed to get cache entry 0 (server or network not responding)
[root@xeon109 root]#
[root@xeon109 root]# vos listvldb -server reno.helios.nd.edu
VLDB entries for server reno.helios.nd.edu
Total entries: 0
[root@xeon109 root]#
On the client:
rxdebug localhost 7001 -version
Trying 127.0.0.1 (port 7001):
AFS version: OpenAFS 1.2.11 built 2004-01-11
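The checks shown above (fs checkservers, then vos listvldb for the server it reports) could be combined into one helper, sketched here. It assumes fs and vos are on PATH and that the output looks like the transcript above; the function name check_retired_servers is made up for illustration.

```shell
# Sketch only: for each server the cache manager reports as unavailable,
# show how many VLDB entries still reference it.
# Assumes output formats like those in the transcript above.
check_retired_servers() {
    fs checkservers | awk '
        /^All servers are running/ { next }
        {
            # Drop the message text; server names may follow the colon
            # on the same line or appear on the next line.
            sub(/^These servers unavailable[^:]*:/, "")
            for (i = 1; i <= NF; i++) {
                sub(/\.$/, "", $i)          # strip the trailing period
                if ($i != "") print $i
            }
        }' |
    while read -r server; do
        entries=$(vos listvldb -server "$server" |
                  awk '/^Total entries:/ { print $3 }')
        printf '%s: %s VLDB entries\n' "$server" "${entries:-unknown}"
    done
}
```

A server that is reported as down yet has 0 VLDB entries matches the situation described here: the client still holds state for a fileserver that no longer hosts any volumes.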
This is a Linux 2.4 client and I don't have kdump. We have
also had these problems on sun4x_58 clients.
I should mention that we've seen some correlation between
this problem and machines with "busy" AFS caches, which makes it
even more frustrating, as it seems to affect the machines that
depend on AFS the most. We've tried lots of fs flush* * commands.
So far we've ended up rebooting, which does fix the problem.
Does anyone have any clues what the problem is or what a workaround
might be?
Thanks
Rich
--
Rich Sudlow
University of Notre Dame
Office of Information Technologies
321 Information Technologies Center
PO Box 539
Notre Dame, IN 46556-0539
(574) 631-7258 office phone
(574) 631-9283 office fax
_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info