[OpenAFS] Re: UDP timeouts

Andrew Deason adeason@sinenomine.net
Mon, 16 May 2011 16:47:35 -0500


On Mon, 16 May 2011 23:19:23 +0200
Jaap Winius <jwinius@umrk.nl> wrote:

> After almost a day of operations with your patch in place and both UDP
> timeout values set to 3600 seconds, there were just a few dropped
> packets. Not having noticed that immediately (it was on the one server
> I didn't check), I reduced the values further to 1800 seconds, but
> then saw many (more) dropped packets. I've now increased the timeout
> values to 4500 seconds and expect to see few if any more dropped
> packets. Your patch has indeed allowed me to reduce the UDP timeout
> values significantly, although 20 minutes doesn't seem to be within
> reach yet.

If you want to look into this further, a capture of the network traffic
to/from an idle client that triggers this would help explain why. My
guess is that you could trigger it by leaving a client idle for maybe 40
minutes to an hour and then having it contact the server again. (But
that is only a guess, since we don't know which specific traffic is
getting dropped.) But if you're happy with the timeouts as they are
now...
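
As a rough sketch of such a capture (not something from the setup
discussed here), the helper below builds a tcpdump command line for one
client. The function name, the example address and the output file are
made up for illustration. It assumes the fileserver is on the standard
UDP port 7000 and that the client address is the one the capturing host
actually sees, which may be a translated address if NAT is involved.

```python
import subprocess


def tcpdump_argv(client_ip, outfile, iface="any"):
    """Build a tcpdump command capturing fileserver traffic for one client.

    Assumes the fileserver listens on the standard UDP port 7000 and that
    client_ip is the address visible on the capturing host.
    """
    bpf = f"udp and host {client_ip} and port 7000"
    # -n: no name resolution, -i: interface, -w: write raw packets to a file
    return ["tcpdump", "-n", "-i", iface, "-w", outfile, bpf]


def capture(client_ip, outfile, iface="any"):
    """Run the capture until interrupted (usually needs root)."""
    return subprocess.call(tcpdump_argv(client_ip, outfile, iface))
```

Calling capture("192.0.2.10", "afs-idle.pcap") before leaving the client
idle, and stopping it after the client has contacted the server again,
should leave a file that can be inspected with tcpdump -r or Wireshark.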

Or, if you turn the fileserver debugging up to at least 2 (if you
haven't given the fileserver a debug level on startup, sending it a TSTP
signal will turn it up to 1, and another will turn it up to 5; sending
it a HUP will reset it to 0), you could check how often this message
shows up in the fileserver log:

Checking for dead venii & clients

It is possible that the client checks are just taking long enough to
stretch the 20-minute figure I mentioned earlier. Ideally, the message
appears every 5 minutes. But you also get a lot of other messages at
that debug level, so it's up to you whether you want to do this.
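
To avoid digging through the other messages by hand, something like the
following could pull out just that message and report the time between
consecutive occurrences. It is only a sketch: it assumes each log line
starts with a ctime-style timestamp such as "Mon May 16 16:47:35 2011",
so adjust the pattern if your log looks different. The function name is
made up for illustration.

```python
import re
from datetime import datetime

MESSAGE = "Checking for dead venii & clients"

# Assumed log line prefix: "Mon May 16 16:47:35 2011 " (ctime style).
STAMP = re.compile(r"^(\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) ")


def check_gaps(lines):
    """Return the seconds between consecutive occurrences of MESSAGE."""
    times = []
    for line in lines:
        if MESSAGE not in line:
            continue
        match = STAMP.match(line)
        if match:
            times.append(
                datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
            )
    # Pair each timestamp with the next one and take the difference.
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Passing it an open log file, e.g. check_gaps(open(path)), gives a list of
intervals. Values close to 300 would match the ideal 5-minute spacing,
and noticeably larger ones would suggest the checks are running long.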

This will be addressed differently in 1.6 servers and clients and
beyond, so I wouldn't worry so much about leaving an issue unresolved,
if you don't wanna look into it further.

-- 
Andrew Deason
adeason@sinenomine.net