[OpenAFS-devel] Got one of those interesting "bunch of servers
won't talk to a client" situations right at the moment...
Derrick J Brashear
shadow@dementia.org
Thu, 17 Mar 2005 21:52:00 -0500 (EST)
On Thu, 17 Mar 2005, Neulinger, Nathan wrote:
> Client and server are both running builds from within the past few
> months.
>
>
> -bash-2.05b# /usr/afsws/bin/fs checks
> These servers unavailable due to network or server problems:
> afs-fs1.cc.umr.edu afs-fs17.cc.umr.edu afs-fs7.cc.umr.edu
I would have suggested it was the bug Tom Keiser sent us a patch for in
1.3.79 but...
> In a network trace, a few of the servers are sending back rx abort
> packets.
This suggests otherwise. Looking at your tcpdump output, I think you may
have more than one problem, one of which is possibly this:
http://www.openafs.org/cgi-bin/wdelta/STABLE14-fix-multirx-checkservers-20050216
(explaining why valid replies are seemingly ignored)
> 14:42:01.771456 afs-fs1.cc.umr.edu.afs3-fileserver >
> sysinst.cc.umr.edu.afs3-callback: rx abort (32)
> 14:42:01.774793 afs-fs7.cc.umr.edu.afs3-fileserver >
> sysinst.cc.umr.edu.afs3-callback: rx abort (32)
> afs-fs7.cc.umr.edu.afs3-fileserver: rx data fs call get-time (32)
> 14:42:04.858423 sysinst.cc.umr.edu.afs3-callback >
> afs-fs1.cc.umr.edu.afs3-fileserver: rx data fs call get-time (32)
> 14:42:05.321829 afs-fs1.cc.umr.edu.afs3-fileserver >
> sysinst.cc.umr.edu.afs3-callback: rx abort (32)
> 14:42:05.835009 afs-fs7.cc.umr.edu.afs3-fileserver >
> sysinst.cc.umr.edu.afs3-callback: rx abort (32)
And this is something else. Can I see the raw tcpdump output? I want to
look more closely at the aborts.
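
A trace like the one quoted above can be tallied per sender and rx packet
type to see at a glance which servers are aborting. What follows is only a
minimal Python sketch under stated assumptions: the name tally_rx and the
regular expression are illustrative, each tcpdump summary is assumed to be
unwrapped onto a single line, and the one quoted line whose first half is
missing is left out rather than guessed at.

```python
import re
from collections import Counter

# Assumed one-line tcpdump summary layout:
#   <time> <src-endpoint> > <dst-endpoint>: rx <type> ...
LINE_RE = re.compile(
    r"^\S+\s+(?P<src>\S+)\s+>\s+(?P<dst>\S+):\s+rx\s+(?P<kind>\w+)"
)


def tally_rx(lines):
    """Count rx packets keyed by (source endpoint, packet type)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not an rx summary line
            counts[(match.group("src"), match.group("kind"))] += 1
    return counts


# The complete lines from the quoted trace, rejoined onto one line each.
SAMPLE = [
    "14:42:01.771456 afs-fs1.cc.umr.edu.afs3-fileserver > "
    "sysinst.cc.umr.edu.afs3-callback: rx abort (32)",
    "14:42:01.774793 afs-fs7.cc.umr.edu.afs3-fileserver > "
    "sysinst.cc.umr.edu.afs3-callback: rx abort (32)",
    "14:42:04.858423 sysinst.cc.umr.edu.afs3-callback > "
    "afs-fs1.cc.umr.edu.afs3-fileserver: rx data fs call get-time (32)",
    "14:42:05.321829 afs-fs1.cc.umr.edu.afs3-fileserver > "
    "sysinst.cc.umr.edu.afs3-callback: rx abort (32)",
    "14:42:05.835009 afs-fs7.cc.umr.edu.afs3-fileserver > "
    "sysinst.cc.umr.edu.afs3-callback: rx abort (32)",
]

if __name__ == "__main__":
    for (src, kind), n in sorted(tally_rx(SAMPLE).items()):
        print(f"{src:45s} {kind:6s} {n}")
```

This only summarizes the one-line decode; it says nothing about the abort
codes themselves, which is why the raw packet contents are still needed.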