[OpenAFS] AFS hangs, possible nat issues?

Mark Huijgen mark@nl.simpc.com
Thu, 20 May 2010 16:30:39 +0200


On 05/20/2010 03:56 PM, Derrick Brashear wrote:
>>
>>>> The number of entries returned by running 'rxdebug localhost 7001
>>>> -allconnections' on the client seems to grow with the number of packets
>>>> sent every 20s to each server (see attachment, IPs replaced with short
>>>> hostnames to match tcpdump output).
>>>>
>>>> vlserver pings do seem to stop when the connection to the vlserver is
>>>> destroyed, just not the fileserver ones.
>>>>
>>>>
>>> Connections to the fileserver get replaced. It's an artifact of how
>>> the cache manager tracks servers.
>>>
>>>
>> Does replace mean the old connection will (should?) be destroyed together
>> with the scheduled natping for it?
>>
> An old connection is destroyed only when a new one is created. Again,
> one per auth context.
>
> You can't use a destroyed connection for a nat ping, for obvious reasons.
>
> I could tune this slightly.
>
I'll be happy to test any changes.

I just checked an unmodified client that has been running for 70 days:
'rxdebug -allcons' shows 3411 connections to our 5 fileservers. If
every one of these causes a ping packet to be sent...
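
For a rough sense of scale, here is a back-of-the-envelope sketch. It
assumes every listed connection really does trigger one ping per 20s
interval (the interval seen in the tcpdump earlier in the thread); whether
all of them actually do is exactly the open question. The function names
are made up for illustration and are not part of OpenAFS.

```python
# Back-of-the-envelope estimate only. ASSUMPTION: each connection shown by
# rxdebug sends one NAT ping per interval. This is not confirmed behaviour.

def nat_pings_per_second(connections: int, interval_s: float) -> float:
    """Aggregate ping rate if every connection pings once per interval."""
    return connections / interval_s


def nat_pings_per_day(connections: int, interval_s: float) -> float:
    """Same estimate, scaled to a 24-hour day (86400 seconds)."""
    return nat_pings_per_second(connections, interval_s) * 86400


if __name__ == "__main__":
    conns = 3411      # connections reported on the 70-day client
    interval = 20.0   # seconds between pings, per the tcpdump
    print(f"{nat_pings_per_second(conns, interval):.2f} pings/s")
    print(f"{nat_pings_per_day(conns, interval):.0f} pings/day")
```

Under that assumption the client would be sending roughly 170 packets per
second spread over the 5 fileservers.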

Mark Huijgen