[OpenAFS] Problems on AFS Unix clients after AFS fileserver moves

Dexter 'Kim' Kimball dhk@ccre.com
Tue, 9 Aug 2005 14:59:02 -0600


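fs checkv (short for fs checkvolumes) will cause the client to discard
what it has cached about volumes (the name, ID, and location mappings),
so it looks them up again in the VLDB the next time they are needed.
Did you try that?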
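
If it helps to check the result afterwards, here is a rough sketch of a
helper that pulls the host names out of fs checkservers output. It
assumes the output looks like what you quoted below, and that the
all-clear message is "All servers are running."; the function name
unavailable_servers is made up for illustration.

```shell
#!/bin/sh
# Sketch only: reads `fs checkservers` output on stdin and prints one
# unavailable host per line. The output format is assumed from the
# transcript quoted in this thread.
unavailable_servers() {
    # Strip the fixed banner, drop the all-clear line, split the rest on
    # whitespace, then remove the trailing period and any empty tokens.
    sed -e 's/^[[:space:]]*These servers unavailable due to network or server problems:[[:space:]]*//' \
        -e '/^[[:space:]]*All servers are running\.[[:space:]]*$/d' |
    tr -s ' \t' '\n\n' |
    sed -e 's/\.$//' -e '/^$/d'
}

# Possible use on a client (assumes fs is on PATH):
#   fs checkvolumes
#   fs checkservers | unavailable_servers
```

If reno.helios.nd.edu still shows up after the fs checkvolumes, the
client is holding on to something other than its volume mappings.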

Kim


     -----Original Message-----
     From: openafs-info-admin@openafs.org 
     [mailto:openafs-info-admin@openafs.org] On Behalf Of Rich Sudlow
     Sent: Tuesday, August 09, 2005 9:58 AM
     To: openafs
     Subject: [OpenAFS] Problems on AFS Unix clients after AFS fileserver moves
     
     
     We've been having problems in our cell for the last couple of
     years with AFS clients after fileservers are taken out of service.
     Before that, things seemed to work OK when moving and rebuilding
     fileservers. All data has been moved off the fileserver, but the
     clients still seem to have some need to talk to it.  In the past
     the AFS admins have left the fileservers up and empty for a number
     of days to try to resolve this, but that doesn't help.
     
     A recent example:
     
     The fileserver reno.helios.nd.edu was shut down after all data
     had been moved off of it.  However, the client still can't get to
     a number of AFS files.
     
     [root@xeon109 root]# fs checkservers
     These servers unavailable due to network or server problems: reno.helios.nd.edu.
     [root@xeon109 root]# cmdebug reno.helios.nd.edu -long
     cmdebug: error checking locks: server or network not responding
     cmdebug: failed to get cache entry 0 (server or network not responding)
     [root@xeon109 root]# cmdebug reno.helios.nd.edu
     cmdebug: error checking locks: server or network not responding
     cmdebug: failed to get cache entry 0 (server or network not responding)
     [root@xeon109 root]#
     
     [root@xeon109 root]#  vos listvldb -server reno.helios.nd.edu
     VLDB entries for server reno.helios.nd.edu
     
     Total entries: 0
     [root@xeon109 root]#
     
     on the client:
     rxdebug localhost 7001 -version
     Trying 127.0.0.1 (port 7001):
     AFS version:  OpenAFS 1.2.11 built  2004-01-11
     
     
     This is a Linux 2.4 client and I don't have kdump. We have also
     had these problems on sun4x_58 clients.
     
     I should mention that we've seen some correlation between this
     problem and machines with "busy" AFS caches, which makes it even
     more frustrating, as it seems to affect the machines that depend
     on AFS the most. We've tried lots of fs flush* * commands. So far
     we've ended up rebooting, which does fix the problem.
     
     Does anyone have any idea what the problem is or what a workaround
     might be?
     
     Thanks
     
     Rich
     
     -- 
     Rich Sudlow
     University of Notre Dame
     Office of Information Technologies
     321 Information Technologies Center
     PO Box 539
     Notre Dame, IN 46556-0539
     
     (574) 631-7258 office phone
     (574) 631-9283 office fax
     