[OpenAFS] Re: 1.6.0pre2 - more vos issues, possible bug

Andrew Deason adeason@sinenomine.net
Wed, 2 Mar 2011 12:36:47 -0600


On Wed, 2 Mar 2011 00:24:15 -0500 (EST)
Andy Cobaugh <phalenor@gmail.com> wrote:

> > fssync-debug should detect a DAFS fileserver and execute
> > dafssync-debug for you.
> 
> If I just do fssync-debug, it tells me this:

Yeah, apparently that change just missed the branchpoint. It will be
able to do that in the future.

> >> If I look in FileLog.old (I restarted at some point to up the debug
> >> level), I see these lines:
> >
> > You can change that with SIGHUP/SIGTSTP (unless you're doing that
> > for a permanent change).
> 
> Is that to increase/decrease logging level, respectively?

The other way around: each SIGTSTP raises the level one step (1, then 5,
25, 125), and SIGHUP resets it to 0. See fileserver(8).
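
As a rough sketch of that progression: this is a toy model in Python, not
the fileserver's actual signal handling. The function name is made up, and
the "multiply by 5" rule is only my reading of the 1, 5, 25, 125 sequence;
it says nothing about what happens past 125.

```python
def next_debug_level(level: int, sig: str) -> int:
    """Toy model of the log level after one signal (hypothetical helper)."""
    if sig == "HUP":
        # SIGHUP resets the level to 0.
        return 0
    if sig == "TSTP":
        # SIGTSTP steps the level up: 0 -> 1, then 5, 25, 125 (assumed x5 rule).
        return 1 if level == 0 else level * 5
    # Any other signal leaves the level unchanged in this model.
    return level


level = 0
seen = []
for _ in range(4):
    level = next_debug_level(level, "TSTP")
    seen.append(level)

print(seen)                            # levels reached after each SIGTSTP
print(next_debug_level(level, "HUP"))  # level after a SIGHUP
```

So to get to level 25 from a fresh start you would send SIGTSTP three
times, and one SIGHUP puts you back at 0 from anywhere.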

> > That's RX_CALL_TIMEOUT, which I'm not used to seeing on volserver
> > RPCs... Do you know how long it took to error out with that? If it
> > takes a while, a core of the volserver/fileserver while it's hanging
> > would be ideal. It might just be the fileserver trying to salvage
> > the volume a bunch of times or something, though, and that takes too
> > long.
> 
> From the start of the vos backup command until it returned was 16s 
> according to our logs.

Hmmm. I think the only way that can happen that quickly is if we sent an
abort with code -3, or maybe an rx busy packet was sent. If you can get
this to happen predictably enough, a traffic dump may explain it.

-- 
Andrew Deason
adeason@sinenomine.net