[OpenAFS-devel] "Lost contact with file server" problems

Roland Kuhn rkuhn@e18.physik.tu-muenchen.de
Mon, 29 Aug 2005 08:54:42 +0200 (CEST)


Hi Jeff!

On Sun, 28 Aug 2005, Jeffrey Hutzelman wrote:

>
>
> On Sunday, August 28, 2005 16:30:41 -0400 Derrick J Brashear 
> <shadow@dementia.org> wrote:
>
>> From jhutz, try this:
>> --- rx.c        30 May 2005 04:55:26 -0000      1.82
>> +++ rx.c        28 Aug 2005 20:30:00 -0000
>> @@ -1146,7 +1146,11 @@
>> 
>>       /* Client is initially in send mode */
>>       call->state = RX_STATE_ACTIVE;
>> -    call->mode = RX_MODE_SENDING;
>> +    call->error = conn->error;
>> +    if (call->error)
>> +       call->mode = RX_MODE_ERROR;
>> +    else
>> +       call->mode = RX_MODE_SENDING;
>> 
>>       /* remember start time for call in case we have hard dead time limit */
>>       call->queueTime = queueTime;
>
> Yeah; that's functionally equivalent to the patch I sent -- Derrick and I 
> reconstructed this over the phone when it became apparent my message was 
> stuck in a mail queue somewhere.
>
> Note that while I think this may fix the problem, I don't have an easy way to 
> reproduce the problem and test the fix.  We'll also need a certain amount of 
> testing to be convinced this doesn't have any unintended side-effects.
>
Thanks! Now I can try something ;-) I'll give it a run on my cluster, so 
we should know in a few days whether it's fixed. Does this patch affect 
the kernel module, afsd, or both?
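
To check that I read the patch correctly: as far as I can tell, a newly 
created client call now inherits the error already recorded on its 
connection and starts out in error mode instead of send mode. Below is a 
small standalone sketch of that logic. The struct layouts, the constant 
values and the function name init_client_call are stand-ins I made up for 
illustration; they are not copied from rx.h or rx.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in constants: hypothetical values, not taken from rx.h. */
enum { RX_STATE_ACTIVE = 2 };
enum { RX_MODE_SENDING = 1, RX_MODE_ERROR = 3 };

/* Stand-in types holding only the fields the patch touches. */
struct rx_connection {
    int error;              /* nonzero once the connection has failed */
};

struct rx_call {
    int state;
    int mode;
    int error;
};

/*
 * Hypothetical helper mirroring the patched lines: the new call copies
 * conn->error and starts in error mode if that error is already set;
 * otherwise it starts in send mode as before.
 */
static void init_client_call(struct rx_call *call,
                             const struct rx_connection *conn)
{
    assert(call != NULL && conn != NULL);

    call->state = RX_STATE_ACTIVE;
    call->error = conn->error;
    if (call->error)
        call->mode = RX_MODE_ERROR;
    else
        call->mode = RX_MODE_SENDING;
}
```

If that reading is right, a call set up on a connection with error == 0 
ends up in RX_MODE_SENDING, and one set up on a connection that already 
carries an error ends up in RX_MODE_ERROR with the same error code.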

Ciao,
 					Roland