[OpenAFS] Windows client complaints.

Anders Magnusson ragge@ltu.se
Thu, 07 Oct 2010 14:12:31 +0200


Hi Jeffrey,

Jeffrey Altman wrote:
> On 10/7/2010 5:20 AM, Anders Magnusson wrote:
>   
>> We have a few (sometimes heavily loaded) WTS machines that have started
>> to get annoying warnings in the application log after we upgraded the
>> client from 1.5.64 to 1.5.77.
>>
>> Every 5 seconds it gets lines like this in the application log:
>>
>> Server 130.240.42.222 reported volume 538176929 as temporarily
>> unaccessible.
>> All servers are offline when accessing cell ltu.se volume 538176929.
>>
>> It always complains about a specific volume, but after some time it may
>> start complaining about another volume instead.  Despite this, the
>> volume is accessible, though there are delays when walking around in it.
>>
>> We haven't seen this on any other machines, only on three WTS servers.
>> The WTS machines are running 2003R2 64-bit with the SMB AFS client.
>> No complaints at all on the file servers.  Tested with complaining
>> volumes on both 1.4.11 and 1.4.12.1 file servers.
>>
>> Any hints?  Because this only occurs if there is quite some load on the
>> terminal servers, it's not easy to debug (and unpopular :-)
>>
>> -- Ragge
>>     
>
> The errors were most likely present in the past.  There simply were no
> log messages for them before 1.5.75.  That warning message indicates
> that the file server returned a VIO error for the object being accessed.
> If the source is a read/write volume, there will be no other replicas
> and so all servers will be offline for that object.
>
> You will need to examine the file server logs to identify the reason the
> VIO error is being returned.
>   
There is nothing in the fileserver log concerning this volume at all.
Even more notable is that, for example, I got the app log messages at
:40, :45 and :50, but nothing in the fileserver log (turned up with one
-TSTP) until :52, where it said:

Thu Oct  7 14:01:52 2010 SRXAFS_FetchData, Fid = 538176929.7617.23824

This specific fileserver runs 1.4.12.1.
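
For lining up the fileserver log with the client's app log, a minimal
sketch of how such a line could be picked apart.  The line format is
assumed from the single line above (not from any FileLog specification),
and the names parse_filelog_line and lines_for_volume are made up for
illustration only:

```python
import re
from datetime import datetime

# Format inferred from the one log line quoted in this mail:
#   "<weekday> <month> <day> <HH:MM:SS> <year> <call>, Fid = <vol>.<vnode>.<unique>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<call>\w+), Fid = (?P<vol>\d+)\.(?P<vnode>\d+)\.(?P<uniq>\d+)"
)


def parse_filelog_line(line):
    """Return a dict with timestamp, call and Fid parts, or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Collapse the double space used for single-digit days before parsing.
    ts_text = " ".join(m.group("ts").split())
    return {
        "time": datetime.strptime(ts_text, "%a %b %d %H:%M:%S %Y"),
        "call": m.group("call"),
        "volume": int(m.group("vol")),
        "vnode": int(m.group("vnode")),
        "unique": int(m.group("uniq")),
    }


def lines_for_volume(lines, volume_id):
    """Yield parsed entries whose Fid refers to the given volume id."""
    for line in lines:
        entry = parse_filelog_line(line)
        if entry is not None and entry["volume"] == volume_id:
            yield entry


if __name__ == "__main__":
    sample = "Thu Oct  7 14:01:52 2010 SRXAFS_FetchData, Fid = 538176929.7617.23824"
    for entry in lines_for_volume([sample], 538176929):
        print(entry["time"].isoformat(), entry["call"], entry["vnode"])
```

Feeding it the whole log and filtering on 538176929 would give the times
to compare against the :40, :45 and :50 entries on the client side.
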

...and everything works without log messages from machines that are
less loaded.

-- Ragge