[OpenAFS] Help: intermittent fileservice hangs

Derrick J Brashear shadow@dementia.org
Tue, 3 Feb 2004 22:51:01 -0500 (EST)


I read the list, you can stop CCing me on the reply any time now... ;-)

On Tue, 3 Feb 2004, Tracy Di Marco White wrote:

> >communication failures on vos releases aren't your network. they're the
> >fileserver breaking callbacks inline during the release, and the single
> >threaded interface between the volserver and the fileserver.
>
> What causes it?  vos moves failing within a server are the same?

The second question above fails to parse.

The fileserver breaks callbacks inline when the volserver "gives back" a
volume after moving it. Depending on which version you have, either:
1) the volserver times out its connections, including the one from the vos
client that called in the "move", because the fileserver takes too long to
answer while it waits for clients that went away to time out, or
2) the volserver gets a quick reply from the fileserver, which then
becomes busy breaking callbacks and won't answer the volserver if it comes
back shortly thereafter with another request, with basically the same
result as above but on the next transaction.
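
To make the two cases above concrete, here is a rough toy model in Python.
It is not OpenAFS code: the function names, the per-client timeout and the
volserver timeout are all made up for illustration, and it assumes callbacks
are broken strictly one client at a time and that, in case 2, the next
request arrives right after the quick reply.

```python
# Toy model only. Names and numbers are hypothetical, not OpenAFS code
# or OpenAFS defaults.

def inline_break_duration(clients, live_rtt, dead_timeout):
    """Seconds one thread spends breaking callbacks, one client at a time.

    clients is a list of booleans: True if the client still answers,
    False if it went away and the call has to wait out the timeout.
    """
    return sum(live_rtt if alive else dead_timeout for alive in clients)


def volserver_outcome(clients, live_rtt, dead_timeout,
                      volserver_timeout, reply_first):
    """Say which transaction, if any, times out in this model.

    reply_first=False is case 1: the reply waits for the callback breaks.
    reply_first=True is case 2: the reply is quick, but the next request
    (assumed to arrive immediately) finds the fileserver still busy.
    """
    busy = inline_break_duration(clients, live_rtt, dead_timeout)
    if busy <= volserver_timeout:
        return "ok"
    if reply_first:
        return "next transaction times out"
    return "current transaction times out"
```

With, say, ten live clients and three that went away, a 60 second per-client
timeout and a 120 second volserver timeout, the fileserver is busy for about
three minutes, so the transaction fails in either case. Only which
transaction fails differs.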

It's been discussed. It's in the archives. The CVS head version was
reworked to handle this differently.