[OpenAFS-devel] Regarding volserver hangs in 1.2.X
Jeffrey Hutzelman
jhutz@cmu.edu
Wed, 24 Aug 2005 16:45:35 -0400
On Wednesday, August 24, 2005 04:02:31 PM -0400 William Setzer
<William_Setzer@ncsu.edu> wrote:
> /usr/afs/logs # pstack 975
> 975: /usr/afs/bin/volserver
> ff19e89c read (3, 1e6663, 1)
> 0003c02c FSYNC_askfs (201a52c2, 1e68f8, 95400, 2, 201a52c2, 1) + 88
> 00038e50 VAttachVolumeByName_r (1e697c, 1e68f8, 4c, 2, 6570, 6) + 1f8
> 00038c48 VAttachVolumeByName (1e697c, 1e68f8, 1e68e4, 2, 0, 94c00) + 10
> 00026f7c XAttachVolume (1e697c, 201a52c2, 4, 2, ab000, 16) + 60
> 000284c8 VolTransCreate (3bc9d0, 201a52c2, 4, 2, 1e6aa0, c0400) + b0
A volserver blocked here is waiting for the fileserver to answer a
request it sent via the fssync protocol. The most common cause is
_returning_ a volume to the fileserver, because in 1.2.x the
fileserver won't respond to such a request until it has broken all of the
callbacks. That was indeed the problem described in the thread you
mentioned, and it was resolved in the development branch. If you are
seeing it, you should consider upgrading one or more servers to
1.4.0rc1, which should not be affected.
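To illustrate the blocking pattern only, here is a minimal sketch using a local socket pair. It is not the actual fssync code or wire format: the names fake_fileserver and ask_fileserver, the request bytes, and the delay are all made up for the example. It only shows why a caller sits in a one-byte read, like the read() at the top of the trace, for as long as the peer defers its answer.

```python
import socket
import threading
import time


def fake_fileserver(conn, delay):
    """Hypothetical stand-in for the peer that answers sync requests."""
    conn.recv(16)           # read the (made-up) request
    time.sleep(delay)       # stand-in for work done before replying
    conn.sendall(b"\x00")   # one-byte answer
    conn.close()


def ask_fileserver(conn, request):
    """Send a request, then block until a one-byte answer arrives."""
    conn.sendall(request)
    return conn.recv(1)     # the caller is stuck here until the peer replies


def demo(delay=0.2):
    client, server = socket.socketpair()
    peer = threading.Thread(target=fake_fileserver, args=(server, delay))
    peer.start()
    start = time.monotonic()
    reply = ask_fileserver(client, b"VOL_ON")  # request bytes are illustrative
    elapsed = time.monotonic() - start
    peer.join()
    client.close()
    return reply, elapsed


if __name__ == "__main__":
    reply, elapsed = demo()
    print("reply %r after %.2f seconds" % (reply, elapsed))
```

The caller's wait tracks the peer's delay, so anything that slows the reply shows up directly as time spent in the read.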
However, the backtrace you posted actually points to a different problem,
which is described in ticket #5615. This problem happens to affect only
newer Solaris systems, and is fixed in 1.2.12.
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA