[OpenAFS] Fail over to replica sites
Nathan Neulinger
nneul@umr.edu
08 Aug 2002 21:31:19 -0500
On Thu, 2002-08-08 at 21:01, Russ Allbery wrote:
> Nathan Neulinger <nneul@umr.edu> writes:
>
> > Yes. It's not reproducible though. I have yet to be able to "do"
> > anything to the file/vol servers to trigger the symptom.
>
> > Note - I have not seen it when the server really cleanly goes down. In
> > those cases, it fairly reliably switches. I have however seen the
> > problem numerous times when a file server starts to not respond for some
> > reason. However, it must be responding to some stuff, cause it doesn't
> > ever completely go down. If I kill -STOP the fileserver, the clients see
> > it instantaneously. (Quicker in my case with the RX_DEADTIME being
> > small.) Immediate response on most clients to the -CONT as well.
>
> In this case, the server just went away completely without any warning.
> (Basically, the machine was powered off by accident.) Many of our clients
> didn't recover and see the replicated volumes located on that server until
> the server came back up (and they were pointing to the read-only path and
> should have been able to find one of the other two replicas).
I'll have to try that with one of our test servers and see if yanking
the ethernet cable results in a similar response. I figured that a -STOP
would yield that result, but apparently not.
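For anyone wanting to repeat the -STOP/-CONT test in a scripted way, here is a minimal sketch, assuming a POSIX system and Python 3. The function names (`pause_process`, `demo`) are made up for illustration, and the demo suspends a harmless stand-in child process rather than a real fileserver; on a test server you would pass the fileserver's pid instead.

```python
import os
import signal
import subprocess
import sys
import time


def pause_process(pid, seconds):
    """Suspend a process with SIGSTOP, wait, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)      # equivalent to `kill -STOP <pid>`
    try:
        time.sleep(seconds)           # window in which to watch client behaviour
    finally:
        os.kill(pid, signal.SIGCONT)  # equivalent to `kill -CONT <pid>`


def demo(pause_seconds=0.2):
    """Pause and resume a stand-in child; return True if it survived."""
    # Hypothetical stand-in for the fileserver: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        pause_process(child.pid, pause_seconds)
        # poll() returns None while the child is still running.
        return child.poll() is None
    finally:
        child.terminate()
        child.wait()
```

Note that this only freezes the process. The host still answers on the network, so it is not the same as a power-off or a pulled cable.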
What's your networking environment? All switched? Were all clients
affected, or just some of them?
> > In our cases though, it sometimes doesn't ever get to the 'connection
> > timed out' point... It just hangs forever.
>
> I've not seen that myself. This was more what I'd expect when a
> read/write server was down. When you tried to access something that was
> replicated on that server, the system would respond "connection timed out"
> immediately. There was no delay; it was obvious that it had cached that
> the system was down and wasn't retrying network access.
-- Nathan
------------------------------------------------------------
Nathan Neulinger EMail: nneul@umr.edu
University of Missouri - Rolla Phone: (573) 341-4841
Computing Services Fax: (573) 341-4216