[OpenAFS] Procedure for adding new DBs
Jeffrey Hutzelman
jhutz@cmu.edu
Wed, 08 Feb 2006 17:59:30 -0500
On Thursday, February 09, 2006 10:45:27 AM +1300 Matthew Cocker
<matt@cs.auckland.ac.nz> wrote:
> Hi
>
> Our AFS cell has recently expanded into a university wide storage
> service. As part of this expansion I need to add some new DB servers
> that will have lower ip addresses than the existing DBs.
>
> Would the following work
>
> i) bring up new dbs but do not restart db processes on other DBs so
> existing coordinator maintains role (will this work?)
No. All dbservers must have the same server list, and you must not start
ptserver or vlserver processes on machines that are not in that list. It
doesn't matter what the clients think, but if the servers do not all agree
you will have election problems, and depending on the way in which they
disagree, bad things could happen.
If you are going to add a new dbserver, you should distribute the new
server-side CellServDB to all of your existing dbservers first, and make
sure the ptserver and vlserver processes have been restarted to pick up the
change. I recommend restarting one server at a time, and waiting for it to
begin voting again, so as to avoid losing quorum, but this is not required.
Once you have updated all of your existing servers, you can start the new
dbserver, and it will join the quorum, obtain an updated database, and
begin providing service. Note that this will not force the new server to
become coordinator -- that only happens if the active coordinator fails
(which is fine; you don't really care which server is coordinator).
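
To make the ordering above concrete, here is a rough sketch that only prints the steps for adding one dbserver; it does not run anything. The host names are made up, and the bos invocations are shown without the -cell or authentication options you may need, so treat it as an outline rather than a tested script.

```python
from typing import List

# Database processes that read the server-side CellServDB at startup.
# Assumption: add "kaserver" and "buserver" here if your cell runs them.
DB_INSTANCES = ["ptserver", "vlserver"]


def rollout_plan(existing: List[str], new_server: str) -> List[str]:
    """Return the ordered steps for adding one dbserver to a cell."""
    plan = []
    # Step 1: put the new server into every existing server's CellServDB.
    for host in existing:
        plan.append(f"bos addhost -server {host} -host {new_server}")
    # Step 2: restart the database processes one machine at a time, so
    # that the remaining servers can keep quorum in the meantime.
    for host in existing:
        plan.append(
            f"bos restart -server {host} -instance {' '.join(DB_INSTANCES)}"
        )
        plan.append(f"# wait until {host} is voting again (udebug can show this)")
    # Step 3: only now bring up the database processes on the new machine.
    plan.append(f"# now start {' and '.join(DB_INSTANCES)} on {new_server}")
    return plan


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for step in rollout_plan(["db1.example.org", "db2.example.org"],
                             "db3.example.org"):
        print(step)
```

The point of building the list in three separate passes is that no restart happens until every existing server has the new entry on disk, and the new server is not started until every existing server is running with it.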
If you have multiple new dbservers, repeat the process for each one. You
can add multiple new servers at once, but be aware that more than half of
the servers listed in the configuration must be up in order to elect a
coordinator. If you have three existing servers and are adding three new
ones, doing them all at once leaves only three of the six listed servers
running until the new ones come up, which is not more than half.
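
The arithmetic behind that warning can be sketched as follows. This models only the plain more-than-half rule described here and ignores any finer details of the real election; the function name is made up for illustration.

```python
def has_quorum(listed: int, up: int) -> bool:
    """True if strictly more than half of the listed dbservers are up."""
    if listed <= 0 or up < 0 or up > listed:
        raise ValueError("need listed > 0 and 0 <= up <= listed")
    # Compare 2*up with listed to avoid fractional halves for odd counts.
    return 2 * up > listed


if __name__ == "__main__":
    # Three existing servers running, three new ones listed but not yet up.
    print(has_quorum(listed=6, up=3))  # False
    # Same cell, but adding only one new server to the list at a time.
    print(has_quorum(listed=4, up=3))  # True
```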
Note that the configuration of fileservers and clients is more or less
irrelevant, except that any client whose configuration does not include the
_current_ coordinator will be unable to make changes. To avoid this, you
can distribute configuration to clients listing all of the new servers
before beginning any updates (client configurations don't affect elections,
so you can add as many servers to client configuration at once as you
want). If you are removing servers, leave them in the client config until
you've done all the changes, then distribute another client config change.
> Some questions,
>
> i) once I alter all the servers via bos addhost command do I need to
> restart the fs/db processes to get the servers to use the new settings?
You need to restart the ptserver and vlserver, and also the kaserver and
buserver, if you are using those, in order to pick up the change. Note
that it is necessary for the _running_ configuration on all dbservers to
agree for elections to work correctly. However, it is safe to have a
server that has only been added/removed on some machines, provided that
server is not actually up. Note that for elections to succeed, the servers
which are actually up must still be more than half of the servers listed,
with any partially-added servers counted as listed but down. For
example, if you have three servers and are adding a fourth, it is safe to
add the fourth server to the server-side CellServDB and restart the
existing servers one at a time, provided you do not bring up the new server
until you've updated all the existing ones.
Once you have added all of your new dbservers, you will want to get around
to restarting the fileservers, so they know about the changes. Exactly
when you do this is not critical, as long as you don't inadvertently retire
all of the dbservers that a fileserver knows about. Note that the
fileserver registers itself in the vldb on startup, and this process will
fail if the current coordinator is not listed in that fileserver's
CellServDB. Once startup is complete, the fileserver does not perform any
operations which require talking to the coordinator.
> ii) can the server processes use DNS to get database servers like the
> client can?
They probably can, if the server-side CellServDB is empty, but this is not a
good idea. The election process depends on all dbservers agreeing on the
set of available servers. Using the DNS as the source for this data makes
it too easy to have inconsistencies, and too difficult to control exactly
when each server picks up a change.
> iii) Is fs newcell sufficient to get linux client to use the new servers
Yes. Note that fs newcell _replaces_ the client's idea of the set of
dbservers for that cell, so you will need to list all of the existing
servers in addition to the one you are adding. A restart is not required.
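
Because the list is replaced rather than extended, a wrapper that builds the command needs to carry the old servers along. A minimal sketch, with invented cell and host names and a made-up helper name:

```python
from typing import List


def newcell_command(cell: str, current: List[str], added: List[str]) -> List[str]:
    """Build an 'fs newcell' argument list that keeps the current servers."""
    servers = list(current)
    for host in added:
        # Skip duplicates so a host is never listed twice.
        if host not in servers:
            servers.append(host)
    return ["fs", "newcell", "-name", cell, "-servers", *servers]


if __name__ == "__main__":
    # Hypothetical cell and host names, for illustration only.
    cmd = newcell_command(
        "example.org",
        ["db1.example.org", "db2.example.org"],
        ["db3.example.org"],
    )
    print(" ".join(cmd))
```

Passing only the added host to -servers would leave the client knowing about that one dbserver and nothing else.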
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA