[OpenAFS-devel] Solaris 10 6/06 (update 2) crash (not the gafs_rename() one)
William Setzer
William_Setzer@ncsu.edu
Fri, 21 Jul 2006 10:25:33 -0400
Jeffrey Hutzelman <jhutz@cmu.edu> writes:
: <William_Setzer@ncsu.edu> wrote:
:
: > I'm using the OpenAFS 1.4.1 distribution pre-compiled for Solaris 10
: > sparc. Under Solaris 10 update 2, I get the following panic if the
: > machine is running as an NFS server and some (unknown) NFS request
: > comes in:
:
: How long does it take this to happen?
It doesn't take long at all. I'm mounting a remote root image (boot
net:dhcp -s) and it happens fairly early in the boot sequence.
: Can you use tcpdump to capture the NFS traffic and figure out what
: request is triggering this?
Yep. It's available at http://www4.ncsu.edu/~wsetzer/soldump.out if
you want to look at it. As best I can tell, it seems to happen on the
first NFS lookup:
09:59:47.968645 IP 152.1.4.163.2049 > 152.1.4.165.3706724788: reply ok 168
09:59:47.969417 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 136 lookup fh 136,6/85559 "devices"
09:59:51.684025 IP 152.1.4.165.3706724790 > 152.1.4.163.2049: 136 lookup fh 136,6/85559 "platform"
09:59:52.783772 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 272 lookup fh 136,6/85559 "devices"
10:00:02.413816 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 272 lookup fh 136,6/85559 "devices"
10:00:21.663896 arp who-has 152.1.4.163 (ff:ff:ff:ff:ff:ff) tell 152.1.4.165
10:00:22.663741 arp who-has 152.1.4.163 (ff:ff:ff:ff:ff:ff) tell 152.1.4.165
The "reply ok" packet is the last packet sent by the crashing machine
(152.1.4.163, the NFS server), and the lookup of "devices" is the first
NFS lookup in the dump.
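For anyone who wants to check this against the full dump without reading
it by eye, here is a minimal sketch in Python. It assumes tcpdump's text
output in the form shown above and treats any packet whose source ends in
".2049" as coming from the NFS server. The names summarize, LINE_RE,
LOOKUP_RE and SAMPLE are made up for this illustration, and SAMPLE is only
an abbreviated copy of the excerpt, not the whole capture.

```python
import re

# One tcpdump text line: "<time> IP <src> > <dst>: <rest>".
# Assumption: for NFS traffic tcpdump prints the RPC transaction id (xid)
# after the client address, so only the ".2049" suffix is treated as a port.
LINE_RE = re.compile(
    r'^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>[^\s:]+): (?P<rest>.*)$'
)
LOOKUP_RE = re.compile(r'\blookup fh \S+ "(?P<name>[^"]*)"')

NFS_PORT = "2049"

# Abbreviated copy of the excerpt, for illustration only.
SAMPLE = [
    '09:59:47.968645 IP 152.1.4.163.2049 > 152.1.4.165.3706724788: reply ok 168',
    '09:59:47.969417 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 136 lookup fh 136,6/85559 "devices"',
    '09:59:51.684025 IP 152.1.4.165.3706724790 > 152.1.4.163.2049: 136 lookup fh 136,6/85559 "platform"',
    '09:59:52.783772 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 272 lookup fh 136,6/85559 "devices"',
    '10:00:02.413816 IP 152.1.4.165.3706724789 > 152.1.4.163.2049: 272 lookup fh 136,6/85559 "devices"',
    '10:00:21.663896 arp who-has 152.1.4.163 (ff:ff:ff:ff:ff:ff) tell 152.1.4.165',
]


def summarize(lines):
    """Return (time of last server packet, first lookup, lookup counts by name)."""
    last_server_packet = None
    first_lookup = None
    attempts = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not an IP line (for example the ARP requests)
        src_suffix = m.group("src").rsplit(".", 1)[-1]
        dst_suffix = m.group("dst").rsplit(".", 1)[-1]
        if src_suffix == NFS_PORT:
            # Anything sent from port 2049 came from the NFS server.
            last_server_packet = m.group("ts")
            continue
        lookup = LOOKUP_RE.search(m.group("rest"))
        if dst_suffix == NFS_PORT and lookup:
            name = lookup.group("name")
            if first_lookup is None:
                first_lookup = (m.group("ts"), name)
            # Repeated lookups of the same name suggest retransmissions.
            attempts[name] = attempts.get(name, 0) + 1
    return last_server_packet, first_lookup, attempts


if __name__ == "__main__":
    last, first, attempts = summarize(SAMPLE)
    print("last packet from server:", last)
    print("first lookup:", first)
    print("lookup attempts:", attempts)
```

To run it over the real capture, pass the lines of soldump.out to
summarize() instead of SAMPLE. A lookup name that is counted more than
once, with no server packet after it, is consistent with the client
retrying after the server has gone down.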
: Are you running the AFS/NFS translator, or is the NFS server unrelated to
: AFS?
It appears to be unrelated. I'm not running the AFS/NFS translator,
but I'm using the "libafs64.o" kernel module. When I do this with the
"libafs64.nonfs.o" module, the machine does not crash (as you would
probably expect).
William