[OpenAFS-devel] Got one of those interesting "bunch of servers won't talk to a client" situations right at the moment...

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 18 Mar 2005 17:22:34 -0500


On Thursday, March 17, 2005 09:52:00 PM -0500 Derrick J Brashear 
<shadow@dementia.org> wrote:

> On Thu, 17 Mar 2005, Neulinger, Nathan wrote:
>
>> Client and server are both running builds from within the past few
>> months.
>>
>>
>> -bash-2.05b# /usr/afsws/bin/fs checks
>> These servers unavailable due to network or server problems:
>> afs-fs1.cc.umr.edu afs-fs17.cc.umr.edu afs-fs7.cc.umr.edu
>
> I would have suggested it was the bug Tom Keiser sent us a patch for in
> 1.3.79 but...
>
>> In a network trace, a few of the servers are sending back rx abort
>> packets.
>
> This suggests otherwise. Looking at your tcpdump output I think you might
> have more than one problem, possibly one which is this:
> http://www.openafs.org/cgi-bin/wdelta/STABLE14-fix-multirx-checkservers-20050216
> (explaining why valid replies are seemingly ignored)
>
>> 14:42:01.771456 afs-fs1.cc.umr.edu.afs3-fileserver >
>> sysinst.cc.umr.edu.afs3-callback:  rx abort (32)
>> 14:42:01.774793 afs-fs7.cc.umr.edu.afs3-fileserver >
>> sysinst.cc.umr.edu.afs3-callback:  rx abort (32)
>
>> afs-fs7.cc.umr.edu.afs3-fileserver:  rx data fs call get-time (32)
>> 14:42:04.858423 sysinst.cc.umr.edu.afs3-callback >
>> afs-fs1.cc.umr.edu.afs3-fileserver:  rx data fs call get-time (32)
>> 14:42:05.321829 afs-fs1.cc.umr.edu.afs3-fileserver >
>> sysinst.cc.umr.edu.afs3-callback:  rx abort (32)
>> 14:42:05.835009 afs-fs7.cc.umr.edu.afs3-fileserver >
>> sysinst.cc.umr.edu.afs3-callback:  rx abort (32)
>
> And this is something else. Can I see raw tcpdump output? I want to look
> more closely at the aborts.

By "raw", I assume Derrick means the output of 'tcpdump -s 1500 -x',
which would indeed be interesting: the hex dump includes the payload of
each abort packet, not just the one-line summary.
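
For anyone who wants to read the abort code out of such a hex dump by hand, here is a minimal Python sketch. It rests on assumptions that are not stated in this thread: that the rx wire header is 28 bytes with the packet type in byte 20, that type 4 means abort, and that an abort packet carries a 4-byte error code in network byte order right after the header. That would be consistent with the "(32)" lengths in the trace above, but treat it as a reading aid rather than a reference decoder. The function names are made up for illustration, and the hex passed in must be digits only, starting at the IPv4 header as 'tcpdump -x' prints it, with any offset columns removed first.

```python
import struct

# Assumed rx wire header layout (28 bytes, network byte order):
# epoch, cid, callNumber, seq, serial, type, flags, userStatus,
# securityIndex, spare, serviceId
RX_HEADER = struct.Struct("!IIIIIBBBBHH")
RX_PACKET_TYPE_ABORT = 4  # assumed value of the abort packet type
UDP_HEADER_LEN = 8


def udp_payload_from_hex(hex_text):
    """Turn hex digits of an IPv4 packet into the UDP payload bytes.

    hex_text may contain spaces and newlines, but no offset columns.
    """
    raw = bytes.fromhex("".join(hex_text.split()))
    ip_header_len = (raw[0] & 0x0F) * 4  # IHL field, counted in 32-bit words
    return raw[ip_header_len + UDP_HEADER_LEN:]


def parse_rx_abort(payload):
    """Return header fields plus the abort code, or None if not an abort."""
    if len(payload) < RX_HEADER.size:
        raise ValueError("payload is shorter than an rx header")
    (epoch, cid, call_number, seq, serial, ptype, flags,
     user_status, security_index, spare, service_id) = RX_HEADER.unpack_from(payload)
    if ptype != RX_PACKET_TYPE_ABORT:
        return None
    if len(payload) < RX_HEADER.size + 4:
        raise ValueError("abort packet has no room for an error code")
    # The error code is read as a signed 32-bit integer.
    (code,) = struct.unpack_from("!i", payload, RX_HEADER.size)
    return {
        "cid": cid,
        "call_number": call_number,
        "serial": serial,
        "service_id": service_id,
        "abort_code": code,
    }
```

Usage would be along the lines of parse_rx_abort(udp_payload_from_hex(dump_text)), where dump_text is the cleaned-up hex for one packet. The resulting abort_code can then be compared against the error tables to see why the fileserver is refusing the call.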

-- Jeff