[OpenAFS] Problems on AFS Unix clients after AFS fileserver moves

Derrick J Brashear shadow@dementia.org
Tue, 9 Aug 2005 19:14:25 -0400 (EDT)


On Tue, 9 Aug 2005, Todd DeSantis wrote:


> - a kdump snapshot would have been able to give us some
>   information on the state of the client and could have
>   helped us determine if any volume and/or vcache entry
>   was still pointing at this old fileserver
>
>   Did you just not build kdump for the client, or does
>   OpenAFS not build kdump by default?

OpenAFS 1.2.11 conceivably had a bug building kdump. Several times kdump
builds have broken due to kernel changes.

>   their CStatd bit cleared.  This tells the client to run
>   a FetchStatus call to determine if my cached version is
>   still the correct version of the file/dir.
>
>   This is the way that the IBM Transarc clients work.  It is
>   possible that the OpenAFS code has changed the callback timing
>   a bit; I am not sure of this.

It's still the same.
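
For anyone following along, the behaviour Todd describes (a cleared CStatd
bit forcing a FetchStatus call before the cached copy is trusted again) can
be sketched as a toy model. This is only an illustration under stated
assumptions, not the cache manager's code: the names CachedEntry, ToyServer,
break_callback and access are hypothetical, and the single boolean merely
plays the role of the CStatd bit.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    """Hypothetical stand-in for a client cache (vcache) entry."""
    path: str
    data_version: int
    stat_valid: bool = True  # plays the role of the CStatd bit


class ToyServer:
    """Hypothetical fileserver that knows the current version of each file."""

    def __init__(self):
        self.versions = {}
        self.fetch_status_calls = 0

    def fetch_status(self, path):
        # Stands in for the FetchStatus call; counts how often it is used.
        self.fetch_status_calls += 1
        return self.versions[path]


def break_callback(entry):
    """Model a broken or expired callback by clearing the status-valid bit."""
    entry.stat_valid = False


def access(entry, server):
    """Return True if the cached copy can be reused, False if it is stale."""
    if entry.stat_valid:
        return True  # bit still set: trust the cache, no server contact
    current = server.fetch_status(entry.path)
    reusable = current == entry.data_version
    entry.data_version = current
    entry.stat_valid = True  # status is fresh again after the call
    return reusable
```

In this sketch the server is contacted only while the bit is cleared, and
the cached copy is still reused if the version it reports is unchanged.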

> And one more thing to check is if OpenAFS changed any of the
> callback timing for volumes.

So's this.


>
>>
>> Kim
>>
>>
>>      -----Original Message-----
>>      From: openafs-info-admin@openafs.org
>>      [mailto:openafs-info-admin@openafs.org] On Behalf Of Rich Sudlow
>>      Sent: Tuesday, August 09, 2005 9:58 AM
>>      To: openafs
>>      Subject: [OpenAFS] Problems on AFS Unix clients after AFS
>>      fileserver moves
>>
>>
>>      We've been having problems with our cell for the last couple of
>>      years with AFS clients after fileservers are taken out of service.
>>      Before that, things seemed to work OK when moving and rebuilding
>>      fileservers. All data was moved off the fileserver, but the clients
>>      still seem to have some need to talk to it. In the past the AFS
>>      admins have left the fileservers up and empty for a number of
>>      days to try to resolve this issue, but that doesn't resolve it.
>>
>>      A recent example:
>>
>>      The fileserver reno.helios.nd.edu was shut down after all data
>>      was moved off of it. However, the client still can't get to
>>      a number of AFS files.
>>
>>      [root@xeon109 root]# fs checkservers
>>      These servers unavailable due to network or server problems:
>>      reno.helios.nd.edu.
>>      [root@xeon109 root]# cmdebug reno.helios.nd.edu -long
>>      cmdebug: error checking locks: server or network not responding
>>      cmdebug: failed to get cache entry 0 (server or network not responding)
>>      [root@xeon109 root]# cmdebug reno.helios.nd.edu
>>      cmdebug: error checking locks: server or network not responding
>>      cmdebug: failed to get cache entry 0 (server or network not responding)
>>      [root@xeon109 root]#
>>
>>      [root@xeon109 root]#  vos listvldb -server reno.helios.nd.edu
>>      VLDB entries for server reno.helios.nd.edu
>>
>>      Total entries: 0
>>      [root@xeon109 root]#
>>
>>      on the client:
>>      rxdebug localhost 7001 -version
>>      Trying 127.0.0.1 (port 7001):
>>      AFS version:  OpenAFS 1.2.11 built  2004-01-11
>>
>>
>>      This is a linux 2.4 client and I don't have kdump. We have also
>>      had these problems on sun4x_58 clients.
>>
>>      I should mention that we've seen some correlation to this
>>      happening on machines with "busy" AFS caches, which makes it
>>      even more frustrating as it seems to affect the machines which
>>      depend on AFS the most. We've tried lots of fs flush* *.
>>      So far we've ended up rebooting, which does fix the problem.
>>
>>      Does anyone have any clues what the problem is or what a workaround
>>      might be?
>>
>>      Thanks
>>
>>      Rich
>>
>>      --
>>      Rich Sudlow
>>      University of Notre Dame
>>      Office of Information Technologies
>>      321 Information Technologies Center
>>      PO Box 539
>>      Notre Dame, IN 46556-0539
>>
>>      (574) 631-7258 office phone
>>      (574) 631-9283 office fax
>>
>>      _______________________________________________
>>      OpenAFS-info mailing list
>>      OpenAFS-info@openafs.org
>>      https://lists.openafs.org/mailman/listinfo/openafs-info
>>