[OpenAFS-devel] Ubik voting idea
Ken Hornstein
kenh@cmf.nrl.navy.mil
Wed, 03 Oct 2001 15:21:04 -0400
>>yes, but wouldn't this need to be on all machines in case the designator
>>machine was down, and someone else got voted the designator? looks like
>>we still have the same problem.
>
>The problem being addressed is that if two db servers have different
>voting preferences they might vote in a way that causes deadlock, at
>worst, or hairy coding problems, at best. Yet there is a desire at
>some sites that the lowest IP address DB server not be the sync-site.
I have a modest proposal.
The few cases in AFS where the ordering is actually significant, as I
understand it, are:
vote.c:
    in uvote_ShouldIRun()
        if (ntohl((afs_uint32) lastYesHost) < ntohl((afs_uint32) ubik_host[0]))
            return 0; /* if someone is valid and better than us, don't run */
    in SVOTE_Beacon()
        if (ntohl((afs_uint32) otherHost) <= ntohl((afs_uint32) lowestHost)
            || lowestTime + BIGTIME < now) {
            lowestTime = now;
            lowestHost = otherHost;
        }
        [...]
        if (ntohl((afs_uint32) ubik_host[0]) <= ntohl((afs_uint32) lowestHost)
            || lowestTime + BIGTIME < now) {
            lowestTime = now;
            lowestHost = ubik_host[0];
        }
beacon.c:
    in ubeacon_InitServerList()
        if (ntohl((afs_uint32) servAddr) < (afs_uint32) magicHost) {
            magicHost = ntohl(servAddr);
            magicServer = ts;
        }
My work with Nubik has shown me that you can very easily rip out the IP
address as the comparison and substitute whatever you want, such as an
arbitrary ordering. So something perhaps a lot simpler would be to
create a table of server IP addresses that included their ordering, so
for example the first instance above in vote.c would be changed to:
    if (ubik_order(lastYesHost) < ubik_order(ubik_host[0]))
Like you said, if the same list weren't on each server, weird voting
(that might not converge) would be very likely. I'm not sure there's a
good way around that ... and for the people that really _need_ it, I
think that it would be acceptable. Using the current IP address ordering
as the default in the absence of any configuration information would
make the "out of the box" configuration simple enough for most people.
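
To make the idea concrete, here is a minimal sketch of what such a lookup
might look like. Everything in it is an assumption for illustration: the
struct, the fixed-size table, the ubik_order_add() helper, and the choice
to sort unlisted servers last are all hypothetical, and it uses uint32_t
in place of afs_uint32 so that it compiles on its own. Nothing like this
exists in the Ubik source today.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl(), htonl() */

#define UBIK_ORDER_MAX 16   /* hypothetical limit on table size */

/* Hypothetical table entry: one per database server. */
struct ubik_order_entry {
    uint32_t addr;   /* server address, network byte order */
    uint32_t rank;   /* lower rank = more preferred as sync site */
};

/* Hypothetical table; a real one would be loaded from configuration. */
static struct ubik_order_entry order_table[UBIK_ORDER_MAX];
static size_t order_count = 0;

/* Add one server to the table. Returns 0 on success, -1 if the table is full. */
static int ubik_order_add(uint32_t addr, uint32_t rank)
{
    if (order_count >= UBIK_ORDER_MAX)
        return -1;
    order_table[order_count].addr = addr;
    order_table[order_count].rank = rank;
    order_count++;
    return 0;
}

/*
 * Return the sort key for a server address (network byte order).
 * With no table configured, fall back to the host-order IP address,
 * which reproduces the existing "lowest IP wins" comparison.
 */
static uint32_t ubik_order(uint32_t addr)
{
    size_t i;

    if (order_count == 0)
        return ntohl(addr);

    for (i = 0; i < order_count; i++) {
        if (order_table[i].addr == addr)
            return order_table[i].rank;
    }

    /* Assumption: a server missing from the table sorts after all listed ones. */
    return UINT32_MAX;
}
```

With an empty table, ubik_order(a) < ubik_order(b) gives the same answer
as the ntohl() comparisons quoted above, so an unconfigured cell would
behave as it does now. Once every server carries the same table, the
ranks alone decide the ordering.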
I think that would be simpler than a designator scheme, because as I understood
your proposal, there's currently no way for the designator to notify other
servers who's the real sync site. So unless I'm missing something, you'd need
to change the Ubik protocol to include that information in the broadcast by
the designator.
--Ken