[OpenAFS] Re: Chronic blocked connections on fileserver

Will Maier willmaier@ml1.net
Tue, 9 Oct 2007 15:20:01 -0500


On Tue, Oct 09, 2007 at 04:07:06PM -0400, Jim Rees wrote:
> Will Maier wrote:
> > On a whim, I thought I'd take a look at the hosts hitting us
> > during a representative bad period.
> 
> That's not as interesting as which hosts are blocking the file server.  Some
> things to check for:
> 
> file server is hung trying to talk to pts
> file server is hung trying to recall callbacks from unresponsive clients
> 
> There may be clues in your FileLog.  I'd be suspicious of 128.104.3.108 and
> 128.104.3.109.  They are unreachable right now, at least from here.  If they
> are doing thousands of Fetchstatuses but can't be reached by the file server
> for callbacks, that could foul things up.  Is that subnet behind a firewall?

I'm not sure. I'll contact the folks responsible for those machines
and see what they know.

> Your message was delayed to the list by a couple of weeks.  Is the
> 1.4.4 file server any better now than the 1.4.1?

1.4.4 has been better than 1.4.1, but we still experienced a week or
so of badness about a week after the upgrade. Since then (i.e., the
last 10 days or so), we haven't had any problems.

As Derrick says, we'll probably want to upgrade to 1.4.5 when it's
available. For now, I can't find any evidence of misconfigured
clients, but I'll be sure to look harder for them next time.

Thanks!

-- 

[Will Maier]-----------------[willmaier@ml1.net|http://www.lfod.us/]