[OpenAFS] Re: DB servers "quorum" and OpenAFS tools
Peter Grandi
pg@afs.list.sabi.co.UK
Thu, 23 Jan 2014 15:39:03 +0000
[ ... ]
>> At some point during this slow incremental plan there were 4
>> entries in both 'CellServDB's and the new one had not been
>> started up yet, and would not be for a couple days.
> Oh also, I'm not sure why you're adding the new machines to
> the CellServDB before the new server is up. You could bring up
> e.g. dbserver 4, and only after you're sure it's up and
> available, then add it to the client CellServDB. Then remove
> dbserver #3 from the client CellServDB, and then turn off
> dbserver #3.
As mentioned at greater length in a message I just sent: to
minimize the number of DB daemon restarts/resets.
But even without that motivation, if I have a cell with 4 DB
servers, the fact that 1 is down should have little or no
noticeable impact, or else the redundancy they provide is not
that worthwhile...
> You would need to keep the server-side CellServDB accurate on
> the dbservers in order for them to work,
Well, "accurate" is perhaps a bit too strong: there need to be
enough servers listed to form a "quorum", there should be at
least one that is common to the client 'CellServDB' and the
'server/CellServDB', and ideally all the 'server/CellServDB'
members should also be in the client 'CellServDB' (here I guess
everybody understands that 'CellServDB' means that file or
equivalent means in DNS etc.).
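To make those three conditions concrete, here is a minimal Python
sketch of a consistency check. The function name and the
'quorum_size' parameter are hypothetical (how many servers count as
"enough" is not stated here), and this is not an OpenAFS tool or API.

```python
def check_cellservdb(client_db, server_db, quorum_size):
    """Return a list of problems; an empty list means the lists look usable.

    client_db / server_db: iterables of DB server names or addresses.
    quorum_size: assumed minimum number of servers needed for a "quorum".
    """
    client, server = set(client_db), set(server_db)
    problems = []
    # Enough servers must be listed to form a "quorum".
    if len(server) < quorum_size:
        problems.append("server list too short to form a quorum")
    # At least one server must be common to both lists.
    if not client & server:
        problems.append("no server common to client and server lists")
    # Ideally every server-side member is also known to clients.
    missing = sorted(server - client)
    if missing:
        problems.append("not in client list: " + ", ".join(missing))
    return problems
```

Note that a client list with extra entries (such as a server not yet
started) produces no complaint, which matches the situation described
earlier in the thread.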
These looser conditions are enough because the crucial properties are:
* DB servers for the wrong cell name or without the key don't matter.
* DB servers outside the "quorum" for a cell name don't matter.
* All "quorum" members know each other and which of them is
  the sync site.
and therefore I hope this happens:
* Each DB server knows which cell it belongs to and its key(s).
* Each DB server knows whether it is part of the "quorum", and
  the list of "quorum" members.
* Each DB server that is part of the "quorum" knows which one
  is the sync site.
Then whenever a DB server contacts or is contacted by another DB
server it should:
* Check that the cell name is the same.
* Verify the cell is the same using the shared key.
* Ask the other server for a list of quorum members.
* Check whether the other server is in its "quorum" list:
  - If missing, add it to the "quorum" list and trigger an election.
  - If present, check that the 'sync site' is the same:
    o If it is not the same, trigger an election.
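The checks above can be sketched in Python roughly as follows. This
is only an illustration of the hoped-for behaviour, not how the DB
servers are implemented: the 'DbServer' class and its fields are
hypothetical, the key check is a plain string comparison standing in
for real cryptographic verification, an election is reduced to a
counter, and the "ask for the list of quorum members" step is left
out because nothing above says how the returned list would be used.

```python
from dataclasses import dataclass, field


@dataclass
class DbServer:
    name: str
    cell: str
    key: str                      # stand-in for the shared cell key
    quorum: set = field(default_factory=set)
    sync_site: str = ""
    elections: int = 0

    def trigger_election(self):
        # Placeholder: a real election involves all members; we only count it.
        self.elections += 1

    def meet(self, other):
        """Run the checks when meeting `other`; True if the peer is accepted."""
        if other.cell != self.cell:
            return False          # wrong cell name: does not matter to us
        if other.key != self.key:
            return False          # cannot prove it is the same cell
        if other.name not in self.quorum:
            self.quorum.add(other.name)
            self.trigger_election()
        elif other.sync_site != self.sync_site:
            self.trigger_election()
        return True
```

For example, a server 'db1' whose list holds only 'db1' and 'db2'
would, on meeting a new 'db4' in the same cell, add it and call one
election, while a peer from another cell is simply ignored.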
> CellServDB files can be missing dbservers.
What clients (cache or tools) should probably do:
* Contact all 'CellServDB' servers quickly.
* For each DB server that replies, or just the first, check
  whether it has the same cell name.
* If it is the same cell name, try to use a token to get the
  list of "quorum" members:
  - If that fails, the DB server does not have the right key,
    so skip it.
  - If that succeeds:
    o Choose a "quorum" member at random for a query.
    o Choose the sync-site for an update.
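A minimal Python sketch of that client-side procedure, under stated
assumptions: the replies have already been collected into a dict, the
'token_ok' callable is a hypothetical stand-in for an authenticated
request, and none of the names below correspond to real OpenAFS
client library calls.

```python
import random


def usable_servers(replies, cell, token_ok):
    """Keep servers that report our cell name and accept our token.

    replies: dict of server name -> {'cell', 'quorum', 'sync_site'}.
    token_ok: callable(name) -> bool, a stand-in for the token check.
    """
    return {name: info for name, info in replies.items()
            if info["cell"] == cell and token_ok(name)}


def pick_server(replies, cell, token_ok, update=False, rng=random):
    """Return the server to contact, or None if no reply is usable."""
    good = usable_servers(replies, cell, token_ok)
    if not good:
        return None
    info = next(iter(good.values()))      # first usable reply
    if update:
        return info["sync_site"]          # updates go to the sync site
    return rng.choice(sorted(info["quorum"]))  # queries go to any member
```

With this shape the server chosen for an update comes from the
"quorum" information returned by whichever server answered, so it
does not itself have to be listed in the client 'CellServDB'.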
> This won't work if a client needs the sync-site, and the
> sync-site is missing from the CellServDB, but in all other
> situations, that should work fine.
Then current client libraries could be improved, because any
quorum member could be asked for the address of the sync-site.