[OpenAFS-devel] "Lost contact with file server" problems

Roland Kuhn rkuhn@e18.physik.tu-muenchen.de
Sat, 27 Aug 2005 11:12:43 +0200 (CEST)


Hi Derrick!

Thanks for looking into this!

On Fri, 26 Aug 2005, Derrick J Brashear wrote:

> On Mon, 22 Aug 2005, Roland Kuhn wrote:
>
>> Hi folks!
>> 
>> On Sun, 21 Aug 2005, Derrick J Brashear wrote:
>> 
>>> It needs to include the first error packet, i.e. the window where it loses 
>>> contact, to be useful.
>>> 
>> Okay, it happened again, and I have a full trace:
>> 
>> http://www.e18.physik.tu-muenchen.de/~rkuhn/openafs-fail-trace.cap
>> http://www.e18.physik.tu-muenchen.de/~rkuhn/openafs-fail-trace-end.cap
>> 
>> The latter contains only the last 81 frames and begins a few frames before 
>> the request which fails. The former is 10MB in size. If you need more 
>> history, I also have the last 1GB of the connection available. 192.168.18.2 
>> is the server, 192.168.18.39 the client. The accesses are typically to big 
>> files.
>
> Except you missed the abort from the server to the client 2 minutes earlier:
>
> 05:43:40.773551 IP (tos 0x0, ttl  64, id 6836, offset 0, flags [none], 
> length: 60) 192.168.18.2.7000 > 192.168.18.39.7001: [udp sum ok]  rx abort 
> cid 1dd424ec call# 0 seq 0 ser 13 (32)
>
Well, what does this mean? I'm no RX expert...
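
The fields in that tcpdump line can at least be pulled apart mechanically, 
even without knowing RX. Below is a minimal sketch in Python. The regular 
expression and the name parse_rx_line are made up for illustration and are 
not part of tcpdump or OpenAFS. Reading the trailing "(32)" as the RX packet 
length is an assumption, though it is consistent with the IP length of 60 
minus 20 bytes of IP header and 8 bytes of UDP header.

```python
import re

# The tcpdump line quoted above, with the mail client's line wraps removed.
LINE = (
    "05:43:40.773551 IP (tos 0x0, ttl  64, id 6836, offset 0, flags [none], "
    "length: 60) 192.168.18.2.7000 > 192.168.18.39.7001: [udp sum ok]  rx abort "
    "cid 1dd424ec call# 0 seq 0 ser 13 (32)"
)

# Hypothetical pattern written for this one output format; other tcpdump
# versions or verbosity levels may print the line differently.
PATTERN = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): .*?"
    r"rx (?P<type>\S+) cid (?P<cid>[0-9a-f]+) call# (?P<call>\d+) "
    r"seq (?P<seq>\d+) ser (?P<ser>\d+) \((?P<len>\d+)\)"
)


def parse_rx_line(line):
    """Return the RX-related fields of one tcpdump line, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; cid stays a hex string as printed.
    for key in ("sport", "dport", "call", "seq", "ser", "len"):
        fields[key] = int(fields[key])
    return fields


fields = parse_rx_line(LINE)
print(fields)
```

This only splits the line into source and destination address and port, 
packet type, connection id, call number, sequence number, serial number and 
length. It does not decode the abort code carried in the packet itself.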

Ciao,
 					Roland