[OpenAFS-devel] debugging connection loss

Horst Birthelmer horst@riback.net
Tue, 16 Aug 2005 14:38:02 +0200


On Aug 16, 2005, at 1:15 PM, Roland Kuhn wrote:

> Dear experts!
>
> Some of the clients in my cell sometimes lose the connection to one  
> of the fileservers. This usually happens while processing large  
> (1.5GB) data files which reside on that server. The symptom is that  
> 'fs checks' says no connection and ethereal shows that while RX  
> packets are sent to the server, the server always answers with RX  
> abort datagrams. The only immediate fix I have found is to
> stop/restart the openafs-client (Debian sarge, 1.3.81, on clients and
> servers). What can I do to further debug the cause of this? The  
> problem is that this phenomenon eats batch jobs by the hundreds.

If you could describe your system and AFS version in a little more
detail, maybe somebody can help ;-)

Horst