[OpenAFS-devel] Re: idle dead timeout processing in clients

Thu, 8 Dec 2011 19:21:04 -0600

On Thu, 08 Dec 2011 14:41:18 -0800
Russ Allbery <rra@stanford.edu> wrote:

> > The current behavior is deliberate, and so is easy to change. The
> > client currently waits for a VRESTARTING error to clear up; it's a
> > simple matter of adding a client option to instead make it error out
> > immediately, if that's what you want. That makes server restarts
> > very visible to processes, though.
> 
> In the absence of demand-attach, I don't see how a server restart
> could ever not be visible to processes.  It takes over a half-hour.
> (Although I suppose this varies based on how many volumes you have per
> server and the like.)

Yeah, it's not like that for everybody. (That's one of the "selling
points" of DAFS, except when someone tells you their restart time is
already only a few seconds...)

And, well, "visible" in a different sense. If it takes 20 minutes for a
read() to return, it's not visible in the sense that the application
needs a code path to deal with it; AFS isn't "down" but arguably just
"slow". If it takes 5 seconds for read() to return, but it returns -1
with ETIMEDOUT, for some environments that's worse / more visible. I've
had someone seem completely baffled when told that not everyone runs
AFS with hardmount turned on; not only is that behavior optional, it
defaults to 'off'.
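
To make the second case concrete, here is a rough sketch of the extra
code path an application would need if read() can fail with ETIMEDOUT
instead of blocking. This is illustration only: the function name, the
retry count, the delay, and the injectable "reader" argument are all
made up here, and nothing in it is an OpenAFS interface. It just
assumes the error reaches the application as an OSError with errno set
to ETIMEDOUT.

```python
import errno
import os
import time


def read_with_timeout_handling(fd, size, retries=3, delay=1.0, reader=os.read):
    """Read up to `size` bytes from `fd`, retrying on ETIMEDOUT.

    Hypothetical helper: `retries`, `delay` and `reader` are assumptions
    made for this sketch, not part of any AFS client API.
    """
    for attempt in range(retries + 1):
        try:
            return reader(fd, size)
        except OSError as exc:
            # Anything other than a timeout is not ours to handle.
            if exc.errno != errno.ETIMEDOUT:
                raise
            # Out of retries: let the caller see the timeout.
            if attempt == retries:
                raise
            time.sleep(delay)
```

An application that is happy to just wait (the "slow, not down" case)
needs none of this; it simply calls read() and blocks until the server
comes back.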

Not that I'm arguing that that's "better" behavior or anything; just
different requirements at different places.

-- 
Andrew Deason
adeason@sinenomine.net