[OpenAFS] connection timeouts

ted creedon tcreedon@easystreet.com
Tue, 16 May 2006 10:36:44 -0700


I've seen this, and that problem was caused by the firewall.

Is there a firewall somewhere?

tedc

-----Original Message-----
From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org]
On Behalf Of Jeffrey Hutzelman
Sent: Tuesday, May 16, 2006 9:23 AM
To: Christof Hanke; Juha Jäykkä
Cc: openafs-info@openafs.org; Jeffrey Hutzelman
Subject: Re: [OpenAFS] connection timeouts



On Tuesday, May 16, 2006 09:20:28 AM +0200 Christof Hanke
<hanke@rzg.mpg.de> wrote:

> Juha Jäykkä wrote:
>>> did you try a "fs checkserver" on the client while it thinks the server
>>> is down? This should bring the connection up again without restarting
>>> the client.
>>> How long did you wait before restarting the client?
>>> The client should do that every 3 min. or so.
>>> Is it always the same client?
>>
>>
>> I waited a couple of *hours* and did everything I could think of:
>> checkserver, flush, flushmount, flushvolume etc. No effect. What I did
>> not do is look at netstat output to see if it thinks the connection
>> exists or not.

Note that netstat output will not help you here; it doesn't (can't) know
about Rx connections, because they exist above the transport level.  As far
as the network stack is concerned, someone has a socket on some UDP port
(for normal clients, port 7001), which handles all Rx traffic.
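
To illustrate the point, here is a minimal Python sketch of the idea: the
kernel only delivers datagrams to one UDP port, and it is the application
that sorts them into connections. The 4-byte header and the names
make_datagram and demultiplex are made up for illustration; this is not
the real Rx packet format.

```python
import struct
from collections import defaultdict

# UDP port normally used by the client, as stated above.
RX_CLIENT_PORT = 7001

# Hypothetical toy header: one 4-byte connection id in network byte order.
# This is an assumption for illustration, NOT the real Rx wire format.
TOY_HEADER = struct.Struct("!I")


def make_datagram(conn_id, payload):
    """Prefix a payload with the toy connection id."""
    return TOY_HEADER.pack(conn_id) + payload


def demultiplex(datagrams):
    """Group payloads that all arrived on one UDP socket by connection id.

    The network stack never sees this table, which is why a tool that
    inspects sockets can report the port but not the connections.
    """
    connections = defaultdict(list)
    for data in datagrams:
        (conn_id,) = TOY_HEADER.unpack_from(data)
        connections[conn_id].append(data[TOY_HEADER.size:])
    return dict(connections)


if __name__ == "__main__":
    arrived = [
        make_datagram(1, b"fetch-status"),
        make_datagram(2, b"callback"),
        make_datagram(1, b"fetch-data"),
    ]
    for conn_id, payloads in demultiplex(arrived).items():
        print(RX_CLIENT_PORT, conn_id, payloads)
```

From the outside, all three datagrams look the same: traffic to one UDP
port. Only the table built in user space knows there are two connections.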

>> This has only occurred once, but I really do not wish
>> to see it ever again.
>>
>> Updating the client would be my last resort if this starts happening too
>> often and cannot be solved otherwise.

The client you're running is quite old; even if this is an issue that has
not been fixed in a newer version, I doubt anyone is going to be interested
in trying to produce a patch against 1.3.81.


> Hmm, then I can't help you much at the minute.
> Please try next time to catch a tcpdump while doing a "fs checkserv" as
> well as catching the fstrace output. A snippet of the fileserver log with
> a high log-level (=125) while doing the "fs checkserver" could be helpful
> as well.
> With this information, it might become clear why that has happened.

To get the fileserver log level to 125, send it SIGTSTP four times.
To turn debugging back off, send the fileserver SIGHUP.
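
A rough Python sketch of the two steps above, in case scripting them is
easier than typing kill by hand. The function names, the one-second pause
and the assumption that the level starts at 0 and steps 1, 5, 25, 125 are
mine; the pause is there because repeated identical signals sent back to
back may be merged into one. You have to find the fileserver pid yourself.

```python
import os
import signal
import time


def level_after(tstp_count):
    """Expected log level after tstp_count SIGTSTPs.

    Assumption: the level starts at 0, the first signal sets it to 1,
    and each further signal multiplies it by 5.
    """
    level = 0
    for _ in range(tstp_count):
        level = level * 5 if level > 0 else 1
    return level


def raise_log_level(pid, tstp_count=4, pause=1.0):
    """Send SIGTSTP tstp_count times to the process with the given pid."""
    for _ in range(tstp_count):
        os.kill(pid, signal.SIGTSTP)
        # Assumed pause so each signal is handled before the next arrives.
        time.sleep(pause)
    return level_after(tstp_count)


def reset_log_level(pid):
    """Send SIGHUP to turn debugging back off."""
    os.kill(pid, signal.SIGHUP)


if __name__ == "__main__":
    # Only prints the expected progression; no signal is sent here.
    print([level_after(n) for n in range(5)])
```

Call raise_log_level(pid) before reproducing the problem and
reset_log_level(pid) afterwards, since level 125 logging is very verbose.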

Other things that might help in examining this...

- Output of 'cmdebug <client-ip-addr>'
- Output of 'rxdebug <client-ip-addr> 7001'
- Output of 'rxdebug <fileserver-ip-addr>'
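
If it helps, the three commands above can be collected in one go with a
small Python sketch like this. The helper names and the timeout are my own
choices, the arguments are exactly the ones listed above, and the
addresses in the example are placeholders to replace with your own.

```python
import subprocess


def diagnostic_commands(client_ip, fileserver_ip):
    """Return the three commands listed above as argument lists."""
    return [
        ["cmdebug", client_ip],
        ["rxdebug", client_ip, "7001"],
        ["rxdebug", fileserver_ip],
    ]


def collect(client_ip, fileserver_ip, timeout=30):
    """Run each command and return its combined output keyed by command line."""
    report = {}
    for argv in diagnostic_commands(client_ip, fileserver_ip):
        key = " ".join(argv)
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            report[key] = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Missing binary or a hung command; record it and carry on.
            report[key] = "failed: {}".format(exc)
    return report


if __name__ == "__main__":
    # Placeholder addresses from the documentation range.
    for command, output in collect("192.0.2.10", "192.0.2.20").items():
        print("==", command)
        print(output)
```

Running it while the client still believes the server is down is what
matters; output gathered after a restart will not show much.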

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA

_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info