[OpenAFS] perpetual Connection timed out

Wesley Chow wchow@athenacr.com
Wed, 19 Mar 2008 16:28:24 -0400


Todd DeSantis wrote:
> Hi -
> 
> If the problem is happening when you are trying to cd into
> a volume, then this is probably a case where the linkData
> field of the vcache structure has somehow become corrupted.
> 
> The "fs flush" and "fs flushv" commands will NOT address this
> part of the flushing.

Right, I tried this on both the volume and the parent volume.

> If you have the
> 
> fs flushmount <path to volume>
> 
> command, this will fix that problem.

I'll try this next time. The problem disappeared after a reboot, so I no
longer have a broken system to debug.
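
For the next occurrence, the suggestions quoted in this thread could be collected into a small dry-run script. This is only a sketch: the function name `afs_recovery_plan`, the example paths, and the ordering of the steps are assumptions made for illustration, not something anyone in the thread prescribed. It prints the commands rather than running them, so they can be reviewed before being tried on an affected client.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands suggested in this thread
# instead of executing them. The function name and the example paths
# below are hypothetical.

afs_recovery_plan() {
    mount_path=$1     # path of the mount point that returns "Connection timed out"
    parent_path=$2    # a directory in the volume that contains the mount point
    client_host=$3    # client machine to inspect with cmdebug

    # Inspect the cache manager first, while the problem is still present.
    echo "cmdebug -servers $client_host -long"
    # Ask the cache manager to re-check volume location information.
    echo "fs checkvolumes"
    # Flush cached data for both volumes involved.
    echo "fs flushvolume -path $parent_path"
    echo "fs flushvolume -path $mount_path"
    # Flush the cached mount point information, if this client has flushmount.
    echo "fs flushmount -path $mount_path"
}

afs_recovery_plan /afs/example.com/proj/data /afs/example.com/proj localhost
```

Whether `fs flushmount` is available depends on the client version, as noted in the quoted reply, so that last step may have to be skipped on older clients.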


Thanks,
Wes


> 
> Thanks
> 
> Todd
> 
> 
> From: Jeffrey Altman <jaltman@secure-endpoints.com>
> Sent by: openafs-info-admin@openafs.org
> Date: 03/19/2008 02:24 PM
> Please respond to: jaltman@secure-endpoints.com
> To: "Christopher D. Clausen" <cclausen@acm.org>
> cc: Wesley Chow <wchow@athenacr.com>, openafs-info@openafs.org
> Subject: Re: [OpenAFS] perpetual Connection timed out
> 
> 
> Christopher D. Clausen wrote:
>> Wesley Chow <wchow@athenacr.com> wrote:
>>> Mike Garrison wrote:
>>>> On Mar 19, 2008, at 12:26 PM, Wesley Chow wrote:
>>>>> On a few of our clients (running 1.4.1), we sometimes get
>>>>> "Connection timed out" with a single volume. Other volumes on the
>>>>> same server are
>>>> 1.4.1 is almost 2 years old. Have you tried upgrading? 1.4.6 is
>>>> recent.
>>> Yep, I'll do that. I was just hoping there was a "bos restart"-like
>>> command for clients that I could use in the meantime. It's not a
>>> common problem anyway, so I'll just upgrade.
>>
>> fs checks; fs checkv
> 
> fs checkserver won't help because the server is already
> responding to queries for other volumes.
> 
> fs checkvolume might help if the problem is that the
> cache manager is confused on which server the volume
> is located on.
> 
> When the problem occurs I would execute "cmdebug <host> -long" and
> find the FID of the mount point and the volume and see what its
> status is.
> 
> Then I would try executing "fs flushvolume" against both the volume
> containing the mountpoint and the volume that is exhibiting the
> problem.
> 
> Jeffrey Altman
> 
>