[OpenAFS] Re: DB servers "quorum" and OpenAFS tools

Peter Grandi pg@afs.list.sabi.co.UK
Thu, 23 Jan 2014 15:39:03 +0000


[ ... ]

>> At some point during this slow incremental plan there were 4
>> entries in both 'CellServDB's and the new one had not been
>> started up yet, and would not be for a couple days.

> Oh also, I'm not sure why you're adding the new machines to
> the CellServDB before the new server is up. You could bring up
> e.g. dbserver 4, and only after you're sure it's up and
> available, then add it to the client CellServDB. Then remove
> dbserver #3 from the client CellServDB, and then turn off
> dbserver #3.

As mentioned at greater length in a message I just sent, this is
to minimize the number of DB daemon restarts/resets.

But even without that motivation, if I have a cell with 4 DB
servers, the fact that 1 is down should really have little or no
noticeable impact, or else the redundancy they provide is not
that worthwhile...
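
To put rough numbers on that, here is a minimal sketch, assuming
that "quorum" simply means a strict majority of the servers listed
in the server-side 'CellServDB' being reachable. The function name
and host names are made up for illustration, and any tie-breaking
rule a real implementation may apply is ignored.

```python
def has_quorum(listed, reachable):
    """Return True if a strict majority of the listed DB servers is reachable.

    Assumption: plain majority voting; tie-breaking rules are ignored.
    """
    listed = set(listed)
    up = listed & set(reachable)  # servers that are not listed do not count
    return 2 * len(up) > len(listed)


# Hypothetical cell with 4 listed DB servers, one of which is not up yet.
listed = ["db1", "db2", "db3", "db4"]
print(has_quorum(listed, ["db1", "db2", "db3"]))  # 3 of 4 reachable
print(has_quorum(listed, ["db1", "db2"]))         # 2 of 4 reachable
```

Under that assumption 3 of 4 is still a majority, while 2 of 4
is not.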

> You would need to keep the server-side CellServDB accurate on
> the dbservers in order for them to work,

Well, "accurate" perhaps is a bit too strong: there needs to be
enough listed to form a "quorum", and there should be at least
one that is common between the client 'CellServDB' and the
'server/CellServDB', and ideally all the 'server/CellServDB'
members should be also in the client 'CellServDB' (here I guess
everybody understands that 'CellServDB' means that file or
equivalent means in DNS etc.).

Because the crucial properties are:

  * DB servers for the wrong cell name or without the key don't matter.
  * DB servers outside the "quorum" for a cell name don't matter.
  * All "quorum" members know each other and which of them is
    the sync site.

and therefore I hope this happens:

  * Each DB server knows which cell it belongs to and its key(s).
  * Each DB server knows whether it is part of the "quorum", and
    the list of "quorum" members.
  * Each DB server that is part of the "quorum" knows which one
    is the sync site.

Then whenever a DB server contacts or is contacted by another DB
server, it should:

  * Check that the cell name is the same.
  * Verify the cell is the same using the shared key.
  * Ask the other server for a list of quorum members.
  * Check whether the other server is in its "quorum" list:
    - If missing, add to "quorum" list and trigger an election.
    - If present, check the 'sync site' is the same:
      o If not same, trigger an election.

> CellServDB files can be missing dbservers.

What clients (cache or tools) should probably do:

  * Contact all 'CellServDB' servers quickly.
  * For each or first DB server that replies check whether it
    has the same cell name.
  * If it is the same cell name, try to use a token to get the
    list of "quorum" members:
    - If that fails, the DB server does not have the right key,
      so skip it.
    - If that succeeds:
      o Choose a "quorum" member at random for a query.
      o Choose the sync-site for an update. 
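
As a minimal sketch of the client steps above: 'choose_server'
and the two callables it takes are hypothetical stand-ins for
real RPCs, not existing client library functions, and servers are
probed one after another here, whereas contacting them "quickly"
would presumably mean in parallel.

```python
import random


def choose_server(cellservdb, cell, token, get_cell_name, get_quorum,
                  for_update=False, rng=random):
    """Return the DB server to use, or None if no listed server qualifies.

    get_cell_name(host) -> cell name, or None if the host does not reply.
    get_quorum(host, token) -> (members, sync_site); raises PermissionError
    if the host cannot validate the token (it lacks the right key).
    Both callables are hypothetical stand-ins for real RPCs.
    """
    for host in cellservdb:
        if get_cell_name(host) != cell:
            continue  # no reply, or a DB server for a different cell
        try:
            members, sync_site = get_quorum(host, token)
        except PermissionError:
            continue  # wrong key: skip this server
        if for_update:
            return sync_site  # updates go to the sync site
        return rng.choice(sorted(members))  # queries go to any member
    return None
```

Note that the server returned need not be listed in the client
'CellServDB' at all, as long as one listed server can supply the
"quorum" list.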

> This won't work if a client needs the sync-site, and the
> sync-site is missing from the CellServDB, but in all other
> situations, that should work fine.

Then current client libraries could be improved, because any
quorum member could be asked for the address of the sync-site.
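
A rough sketch of that idea, assuming a DB server can be asked
which host it currently believes is the sync site.
'locate_sync_site' and 'ask_sync_site' are made-up names, not
existing client library calls.

```python
def locate_sync_site(known_hosts, ask_sync_site):
    """Ask each known DB server in turn which host is the sync site.

    ask_sync_site(host) is a hypothetical RPC stand-in that returns the
    sync site's address as seen by host, or None if host does not answer.
    """
    for host in known_hosts:
        answer = ask_sync_site(host)
        if answer is not None:
            return answer  # may be a host that is not in known_hosts
    return None


# Hypothetical case: the sync site (db3) is missing from the client list.
views = {"db1": None, "db2": "db3"}
print(locate_sync_site(["db1", "db2"], views.get))
```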