[OpenAFS] Debugging Linux AFS client when client hangs

John Perkins john@cs.wisc.edu
Thu, 29 Oct 2009 17:34:36 -0500


We've been dealing with an interesting situation at our site recently:
after rolling out RHEL 5 update 4 to our department's Linux computers,
we're finding that certain applications seem to cause AFS to stop
responding when fetching the contents of specific directories in AFS.
Access to local filesystems in this state appears to work just fine.

Simon was kind enough to provide useful instructions at
http://blob.inf.ed.ac.uk/sxw/2009/01/24/using-fstrace-to-debug-the-afs-cache-manager/
back in January. Unfortunately, the fstrace process itself gets stuck in
device wait and does not return any useful information.

If I could only get some useful debugging information, I would gladly 
submit it to RT...

Any suggestions from the gurus out there on what debugging information
would help narrow down the cause of this hang would be appreciated.  I do
have one crash dump of a system in this state, although I haven't been
able to glean much from it so far.

--
============================================================================
   John Perkins                   |   University of Wisconsin-Madison
   Researcher                     |   Department of Computer Science
   john@cs.wisc.edu               |   1210 W. Dayton St.
   608-262-0438/608-262-6626 FAX  |   Madison, WI  53706-1685
============================================================================