[OpenAFS] Re: 1.4.x quorum election process?

Andrew Deason adeason@sinenomine.net
Fri, 23 Dec 2011 19:35:27 -0500


[Sorry for the old threads; I'm catching up on email I didn't have time
to read at the time.]

On Wed, 26 Oct 2011 14:25:56 -0400
Ken Hornstein <kenh@cmf.nrl.navy.mil> wrote:

> >I would object.  A quorum requirement is that all servers are in
> >agreement with the server configuration and the quorum algorithm.  Any
> >change to the quorum algorithm needs to be exposed as part of the
> >negotiation in order for servers to not get into a state where a
> >misconfigured server or a server executing with an alternate algorithm
> >does not result in a failure to achieve quorum.
> 
> While I agree with that in theory, we don't have that today;
> misconfigured servers can easily cause a quorum failure.  Also if the
> server times don't match up that can easily cause a quorum failure
> (I'd classify that as a misconfigured server as well).
> 
> As an aside: you start to see why this problem has never been fixed.
> Fixing the basic problem is easy, but if you start talking about some
> huge negotiation framework ... gaaah, it's too much.

Yes... when you start talking about large complex frameworks for ubik
configuration, you start talking about something that makes ubik
thousands of times more likely to break, when it _really_ needs to not
break. Simple runtime options, e.g. for altering which site is the "best",
seem much more favorable to me because of that.

However, with options to change the election algorithm, the other
potential failure mode is that you may get two sites that simultaneously
think that they are the coordinator. That is a failure mode we do not
have today (at least, not easily), and of course one needs to be very
careful that that doesn't happen. (Being able to change who gets the
extra half vote, for instance, can easily cause that.) But a new change
or feature where the only problem is "if you configure it wrong you
cannot achieve quorum"... I just fail to see that as a blocking issue.
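
To make the difference between those two failure modes concrete, here is a
rough Python sketch. It is a toy model and not ubik's actual code: the
function names, the "tie_break_view" mapping, and the rule that the candidate
in each partition is simply the lowest-named site are all assumptions made for
illustration. The only thing it tries to capture is that a candidate needs
more than half of the votes, and that one designated site's vote counts for an
extra half.

```python
def has_quorum(yes_voters, all_sites, tie_break_site):
    """Return True if yes_voters form a quorum (toy model).

    Votes are counted in half-vote units to avoid floats: each yes vote is
    worth 2, and the site believed to hold the extra half vote adds 1 if it
    is among the yes voters.
    """
    halves = 2 * len(yes_voters)
    if tie_break_site in yes_voters:
        halves += 1
    return halves > len(all_sites)


def coordinators(partitions, all_sites, tie_break_view):
    """List the sites that would consider themselves coordinator.

    partitions     -- groups of sites that can only reach each other
    tie_break_view -- hypothetical per-site setting: which site each site
                      believes holds the extra half vote
    """
    result = []
    for group in partitions:
        candidate = min(group)  # simplified: lowest name is the candidate
        if has_quorum(set(group), all_sites, tie_break_view[candidate]):
            result.append(candidate)
    return result


sites = ["a", "b", "c", "d"]
split = [["a", "b"], ["c", "d"]]  # network partition, two sites on each side

# Everyone agrees that "a" holds the extra half vote: one coordinator.
consistent = {s: "a" for s in sites}

# Misconfigured so that neither side counts the extra half vote:
# nobody reaches quorum. Annoying, but safe.
no_quorum = {"a": "c", "b": "c", "c": "a", "d": "a"}

# Misconfigured so that each side counts its own extra half vote:
# both sides reach quorum. This is the dangerous case.
split_brain = {"a": "a", "b": "a", "c": "c", "d": "c"}

print(coordinators(split, sites, consistent))   # ['a']
print(coordinators(split, sites, no_quorum))    # []
print(coordinators(split, sites, split_brain))  # ['a', 'c']
```

The second case is the "if you configure it wrong you cannot achieve quorum"
outcome; the third is the one where two sites simultaneously think they are
the coordinator.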

-- 
Andrew Deason
adeason@sinenomine.net