[OpenAFS] 1.4.0 Solaris 10 sparc client hang
Christopher D. Clausen
cclausen@acm.org
Mon, 7 Nov 2005 10:33:54 -0600
The AFS client has hung on one of my AFS servers (an E3000 running Solaris
10). It has the 1.4.0 binaries from the openafs.org website installed.
The client hung during a cp from AFS to the local disk.
rxdebug returns:
C:\>rxdebug afs2 7001
Trying 128.174.251.9 (port 7001):
Free packets: 130, packet reclaims: 5, calls: 422, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
0 threads are idle
67108864 calls have waited for a thread
Connection from host 128.174.251.9, port 7000, Cuid 90cc8e26/da49060
serial 19, natMTU 1444, security index 0, server conn
call 0: # 9, state active, mode: error
call 1: # 0, state not initialized
call 2: # 0, state not initialized
call 3: # 0, state not initialized
Done.
I assume that "mode: error" is indicative of bad things. It has been in
this state for several hours. Attempts to umount /afs have been
unsuccessful (they hang as well).
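For anyone who wants to check for the same state on other machines, here is a minimal sketch in Python that scans saved rxdebug output for calls whose mode is "error". The function name find_error_calls and the regular expression are illustrative assumptions inferred only from the output pasted above; they are not part of OpenAFS, and other rxdebug versions may format these lines differently.

```python
import re

# Matches per-call lines such as "call 0: # 9, state active, mode: error".
# The pattern is inferred from the sample output in this message only.
CALL_LINE = re.compile(
    r"^\s*call\s+(?P<slot>\d+):\s+#\s*(?P<number>\d+),"
    r"\s+state\s+(?P<state>[^,]+?)"
    r"(?:,\s+mode:\s+(?P<mode>.+?))?\s*$"
)


def find_error_calls(rxdebug_output):
    """Return (slot, call_number, state) for every call whose mode is 'error'."""
    hits = []
    for line in rxdebug_output.splitlines():
        match = CALL_LINE.match(line)
        if match and match.group("mode") == "error":
            hits.append(
                (int(match.group("slot")),
                 int(match.group("number")),
                 match.group("state"))
            )
    return hits


# The four call lines from the connection shown above.
SAMPLE = """\
call 0: # 9, state active, mode: error
call 1: # 0, state not initialized
call 2: # 0, state not initialized
call 3: # 0, state not initialized
"""

if __name__ == "__main__":
    for slot, number, state in find_error_calls(SAMPLE):
        print(f"call slot {slot} (call #{number}): state {state}, mode error")
```

This only reports which call slots are in error mode; it says nothing about why the call got there.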
I am vos moving volumes off of the server (the server processes seem
unaffected) so that I can eventually reboot it. My question is: what would
provide the most information to further debug this? Should I just panic the
system from firmware and create a dump? Or should I attempt to attach a
debugger and see where the process is stuck? Any advice, suggestions, or
pointers to idiot's guides on debugging Solaris would be appreciated.
<<CDC
--
Christopher D. Clausen
ACM@UIUC SysAdmin