[OpenAFS] OpenAFS hiccups for multiple users on Win 7 Enterprise

Tobias Vockerodt vockerodt@iqo.uni-hannover.de
Thu, 10 Feb 2011 01:23:38 +0100


Dear all,

Two weeks ago I wrote here about an OpenAFS problem on Windows 7 and
received instructions for diagnosing it. However, I could not reproduce
the problem until now. Let me summarize:

System: Windows 7 Enterprise 64-bit / OpenAFS 1.5.7800 / MIT Kerberos

At some point, access to our own cell fails while other cells remain
reachable. The outage occurs on a single client and appears to be
related to multiple users logging on to AFS, both locally and remotely.

Following Jeffrey's kind instructions, I obtained an afsd.log of one
failed attempt to access our root volume. I placed it here:
http://pastebin.de/14795

I reckon that "CM_ERROR_ALLDOWN (VL Server)" indicates that server/client
communication is down. The mailing list archives only mention a
bug in 1.4.something. Could you have a look at the log?
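
For anyone who wants to skim such a log before reading it in full, here
is a minimal sketch that counts the CM_ERROR_* codes appearing in a
trace dump. It only assumes that the codes show up as plain identifiers
such as CM_ERROR_ALLDOWN; the layout of the rest of each log line is not
assumed, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches identifiers such as CM_ERROR_ALLDOWN anywhere in a line.
# Nothing else about the log line layout is assumed.
CM_ERROR_RE = re.compile(r"CM_ERROR_[A-Z0-9_]+")


def count_cm_errors(log_text):
    """Return a Counter mapping each CM_ERROR_* code to its occurrence count."""
    return Counter(CM_ERROR_RE.findall(log_text))


def lines_with(log_text, code):
    """Return the log lines that mention the given error code."""
    return [line for line in log_text.splitlines() if code in line]


def scan_file(path):
    """Read a trace log from disk and count the error codes in it."""
    with open(path, errors="replace") as handle:
        return count_cm_errors(handle.read())
```

On the affected client, scan_file could be pointed at the afsd.log
under %windir%\temp; lines_with then pulls out the lines for one code
so the surrounding context can be read by hand.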

Many thanks for any advice,
Tobias Vockerodt


On 25.01.2011 11:12, Jeffrey Altman wrote:
> On 1/25/2011 4:10 AM, Tobias Vockerodt wrote:
>> Dear Jeffrey,
>>
>>> Given that you are using Windows 7 and establishing a new remote desktop
>>> connection at the time the outage occurs I suspect that you are
>>> experiencing the Windows 7 netbios name lookup bug that is described in
>>> the OpenAFS 1.5.78 release notes.  During such an outage I would expect
>>> attempts to access paths in \\AFS to fail with "network name not found".
>>>   That would be the Microsoft smb redirector failing to find the smb file
>>> server named "AFS".
>>
>> here is what we found: obtaining an AFS token is no problem. Access to
>> \\afs is also possible. As I found out just this morning, other cells
>> can be accessed without any problems (tested with cell openafs.org).
>> However, our cell here seems to be unavailable from that specific
>> machine for something like 8-9 hours.
>
> You need to determine which operation is failing and what the failure
> code is.  This can be done either with the afsd_service trace logging or
> with wireshark captures.
>
> To use afsd_service trace logging, when your cell becomes unavailable:
>
>    fs trace -on -reset
>    try to access the cell root volume
>    fs trace -dump -off
>    place the resulting %windir%\temp\afsd.log file somewhere accessible
>
> To use wireshark:
>
>    fs setcrypt off
>    start the wireshark capture
>    try to access the cell root volume
>    stop the wireshark capture
>    fs setcrypt on
>    save the capture file somewhere accessible
>
> Jeffrey Altman
>
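
The trace-logging steps quoted above could be wrapped in a small script
so they run back to back while the outage lasts. This is only a sketch:
the two fs commands are taken from the instructions, but using a
directory listing as the access attempt, the cell_path argument, and the
function name are assumptions for illustration.

```python
import os
import subprocess

# The two commands from the quoted instructions.
TRACE_ON = ["fs", "trace", "-on", "-reset"]
TRACE_DUMP_OFF = ["fs", "trace", "-dump", "-off"]


def capture_trace(cell_path, run=subprocess.run):
    """Enable tracing, try to list cell_path, then dump the trace.

    Returns the OSError raised by the access attempt, or None if the
    listing succeeded. The run parameter exists so that the command
    runner can be replaced, for example in a dry run.
    """
    run(TRACE_ON, check=True)
    failure = None
    try:
        # Assumption: listing the directory counts as "trying to access
        # the cell root volume".
        os.listdir(cell_path)
    except OSError as exc:
        failure = exc
    finally:
        # Dump and switch tracing off even if the access attempt failed.
        run(TRACE_DUMP_OFF, check=True)
    return failure
```

The path passed in would be the UNC path of the affected cell below
\\afs. The returned exception gives the error as Windows reports it,
which can then be compared with what the dumped afsd.log shows.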