[OpenAFS] Fileserver in semi-meltdown state

Jeffrey Hutzelman jhutz@cmu.edu
Thu, 22 Jul 2004 20:57:05 -0400


On Tuesday, July 20, 2004 09:27:44 -0700 Renata Maria Dart 
<renata@slac.stanford.edu> wrote:

> Hi, I am currently experiencing slow or non-existent response time
> from one of our fileservers, running OpenAFS 1.2.11 on solaris 9.
> Vos commands hang and an ls of directories on that server also hangs.

So, I don't know if this is the cause of your current problem, but there is 
a known problem you will run into sooner or later on Solaris 9 which will 
cause the fssync interface to become non-responsive.  The symptoms are that 
volume operations that require attaching a volume, including dump, move, 
release, and single-volume salvage, will all hang.

The fix is to apply the patch in DELTA rx-lwp-fdsetsize-20040708.
Alternatively, you can work around the problem by ensuring that the 
fileserver is run with a _hard_ file descriptor limit of no more than 1024.
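
For illustration only, here is a rough sketch of one way to apply that kind of
cap: a small Python wrapper that lowers the descriptor limits and then execs
whatever program it is given.  The names FD_CAP, capped_limits and
exec_with_capped_fds are made up for this example and are not part of OpenAFS;
the only real interfaces used are the standard getrlimit/setrlimit calls and
exec.  It assumes a Python interpreter is available on the server.

```python
import os
import resource

FD_CAP = 1024  # ceiling from the workaround above (hypothetical constant name)


def capped_limits(soft, hard, cap=FD_CAP, infinity=resource.RLIM_INFINITY):
    """Return a (soft, hard) pair in which neither value exceeds cap.

    An "unlimited" value is treated as being above the cap.
    """
    new_hard = cap if hard == infinity or hard > cap else hard
    new_soft = new_hard if soft == infinity or soft > new_hard else soft
    return new_soft, new_hard


def exec_with_capped_fds(argv):
    """Lower RLIMIT_NOFILE for this process, then replace it with argv.

    Lowering a hard limit needs no special privilege, and the new limits
    are inherited by the exec'd program and by anything it starts.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, capped_limits(soft, hard))
    os.execv(argv[0], argv)  # does not return on success
```

Since the fileserver is normally started by bosserver rather than by hand, the
cap would presumably have to be in effect for bosserver itself (for example in
whatever script starts it) so that the fileserver inherits it.  A plain
"ulimit -H -n 1024" in that startup script should have the same effect.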

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA