[OpenAFS] Re: Investigating 'calls waiting' from rxdebug

drosih@rpi.edu drosih@rpi.edu
Wed, 28 Aug 2013 14:08:31 -0400


On Sun, 18 Aug 2013 21:11:41 EDT drosih@rpi.edu wrote:
> 
> What I intended to do, once I get good timeframe to do it in:
> 
>   bos status $HNAME -long -localauth
>   bos stop   $HNAME -wait -localauth
>   bos delete $HNAME fs    -localauth
>   bos create $HNAME fs fs "/usr/afs/bin/fileserver -L -p 192" ...etc...
>   bos status $HNAME -long -localauth

In case anyone comes across this thread in the future, I did get
around to doing this on three of our file servers this morning.
One minor mistake in the above was that the 'bos stop' command
needs to say which service to stop, namely the 'fs' service.
Other than that it worked fine.
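
For anyone repeating this, here is the sequence with the 'bos stop' fix
applied, written as a dry-run sketch that only prints the commands.
The host name default (fs1.example.edu) and the function name are
hypothetical, and the rest of the 'bos create' arguments stay elided
exactly as in my earlier message.

```shell
#!/bin/sh
# Dry-run sketch: prints the corrected command sequence instead of running it.
# fs1.example.edu is a hypothetical placeholder; set HNAME to your file server.
HNAME="${HNAME:-fs1.example.edu}"
THREADS="${THREADS:-72}"

fs_restart_plan() {
    echo "bos status $HNAME -long -localauth"
    # The fix: name the 'fs' instance so bos knows which service to stop.
    echo "bos stop   $HNAME fs -wait -localauth"
    echo "bos delete $HNAME fs -localauth"
    # The remaining 'bos create' arguments were elided in the original
    # message and are left elided here.
    echo "bos create $HNAME fs fs \"/usr/afs/bin/fileserver -L -p $THREADS\" ...etc..."
    echo "bos status $HNAME -long -localauth"
}

fs_restart_plan
```

Review the printed lines, fill in the elided 'bos create' arguments for
your own site, and then run the commands by hand.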

I increased the thread count (-p) to 72, instead of 192.  Even when cell-wide
performance was horrible for us, the number of calls waiting for a
thread rarely got above 60.  And I've been monitoring the number
of threads-in-use for over a week now.  In all that time it was
pretty rare that we had 5 threads in use, and we never had more
than 5.
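
A minimal sketch of one way to pull these numbers out of rxdebug, not
necessarily how I collected mine.  It assumes the output contains lines
of the form "N calls waiting for a thread" and "M threads are idle";
the function name rx_summary is made up for illustration.

```shell
#!/bin/sh
# Summarize rxdebug output read from stdin.
# Assumes lines like "0 calls waiting for a thread" and "245 threads are idle";
# a missing line is reported as 0.
rx_summary() {
    awk '
        /calls waiting for a thread/ { waiting = $1 }
        /threads are idle/           { idle = $1 }
        END { printf "waiting=%d idle=%d\n", waiting, idle }
    '
}

# Intended usage (needs a reachable file server, so it is left commented out):
#   rxdebug "$HNAME" 7000 -noconns | rx_summary
```

Threads in use is then the configured thread count minus the idle
count, so logging this line every minute or so gives a history to look
back on.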

So we still have no idea what triggered the three extended periods
of serious performance problems earlier this month, but we should
be in better shape if it happens again.

Thanks for helping me with this minor crisis.

-- 
Garance Alistair Drosehn
Senior Systems Programmer
Rensselaer Polytechnic Institute;  Troy NY