[OpenAFS] Re: Need volume state / fileserver / salvage knowledge

Andrew Deason adeason@sinenomine.net
Mon, 31 Jan 2011 11:30:30 -0600

On Mon, 31 Jan 2011 11:54:24 -0500
Steve Simmons <scs@umich.edu> wrote:

> I haven't read the code, but by observing the logfiles during a
> shutdown time it appears that fs shutdown breaks callbacks in a
> single-threaded manner per partition. This could probably be
> parallelized; simple thought experiments say X parallel callback
> breaks would result in run time T reduced to T/X.

I have said this before and I will continue to say it: we do not break
callbacks on volume shutdown. We reset the client callback state on the
next client access after the server comes back up (for non-DAFS).
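
To make the ordering concrete, here is a toy model in Python. It is not
fileserver code, and every name in it (ToyFileserver, handle_request, the
log entries) is invented for illustration. The only point is that nothing
is sent to clients at shutdown; the reset happens lazily, the first time a
host talks to the new server instance.

```python
class ToyFileserver:
    """Toy model only; all names here are hypothetical, not OpenAFS APIs."""

    def __init__(self):
        # Hosts that have contacted this server instance since it started.
        self.known_hosts = set()
        # Record of what the server did, in order.
        self.log = []

    def handle_request(self, host, request):
        if host not in self.known_hosts:
            # First contact since startup: tell the client to discard any
            # callback state it holds from the previous server instance.
            self.log.append(("reset_callback_state", host))
            self.known_hosts.add(host)
        self.log.append(("serve", host, request))


old = ToyFileserver()
old.handle_request("client-a", "fetch")
# Shutdown: the old instance just goes away. No callbacks are broken here.
new = ToyFileserver()
new.handle_request("client-a", "fetch")
new.handle_request("client-a", "store")
```

In this model the new instance's log shows exactly one reset for client-a,
ahead of its first request, and none for the second request.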

What we _do_ do is wait for existing client connections and callback
breaks to complete before we can shut down. Several things can cause a
callback break to be initiated, but a fileserver restart/shutdown is
not one of them.

If you want to improve shutdown time, DAFS will help, but only for the
portion where disk is the bottleneck. If you want to "kick off" clients
during shutdown, so clients holding open a connection don't block a
shutdown, take a look at the code that adds the
-offline-shutdown-timeout parameter (which is on master, gerrit 2984).
That functionality is not implemented for callback-related calls, but it
could be with more work.
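
The general idea of "wait for in-flight calls, but give up after a
timeout" can be sketched as below. This is a generic Python illustration
and makes no claim about how the gerrit change is implemented; CallTracker,
begin, end, drain and slow_call are all made-up names.

```python
import threading
import time


class CallTracker:
    """Toy in-flight call counter; hypothetical, not OpenAFS code."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0

    def begin(self):
        # Called when a client call starts.
        with self._cond:
            self._active += 1

    def end(self):
        # Called when a client call finishes; wakes any waiting drain().
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout=None):
        """Block until no calls are active.

        Returns True if everything drained, False if the timeout expired
        first. With timeout=None this waits indefinitely.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)


def slow_call(tracker, seconds):
    # Stands in for a client holding a connection open.
    try:
        time.sleep(seconds)
    finally:
        tracker.end()


tracker = CallTracker()
tracker.begin()
worker = threading.Thread(target=slow_call, args=(tracker, 0.2))
worker.start()

# With a short timeout the shutdown path stops waiting on the slow client.
timed_out_first = not tracker.drain(timeout=0.05)
# Without a timeout the shutdown path blocks until the call completes.
drained = tracker.drain()
worker.join()
```

Without a timeout, one slow client holds up the whole shutdown; with one,
the server gets to decide what to do with whatever is still outstanding.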

Andrew Deason