[OpenAFS] Upgrade plan - any gotchas?

Christopher D. Clausen cclausen@acm.org
Mon, 12 Dec 2005 12:29:22 -0600


Steve Gaarder wrote:
> Here's my plan for upgrading both OS and OpenAFS on my primary AFS
> server. Please let me know if you see any potential problems,
> gotchas, etc.

What do you mean by "Primary"?  Do you only have one AFS DB server for 
your cell?  Most cells run at least 3 AFS DB servers (just look through 
the CellServDB file.)

> These items I have already done:
>
> 1.  Set up a second VLDB and file server.  Change all the CellServDB
> files to include both servers.  (Authentication is via Krb5; neither
> AFS server is a KDC)

When you say "second," what do you mean?

You probably want at least three VLDB, PTS (and possibly BackupDB, if 
you use that) servers total, so that two are still up whenever one is 
down.  Ubik (the synchronization protocol that AFS uses) grants an extra 
vote to the server with the lowest IP address, and if the server you 
take down has the lowest, you might not reach quorum (with only two DB 
servers, the survivor cannot win the election alone) and bad things can 
happen.
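
To make the quorum arithmetic concrete, here is a minimal Python sketch. 
It is a simplified model, not Ubik's actual code: it assumes every 
reachable DB server counts for one vote, the lowest-IP server counts for 
an extra half vote, and quorum needs strictly more than half of the 
configured servers.  The function name and the addresses are made up 
for illustration.

```python
import ipaddress


def has_quorum(all_servers, up_servers):
    """Return True if the servers in up_servers could form a quorum.

    Simplified model (an assumption, not taken from the Ubik source):
    each reachable server contributes one vote and the server with the
    lowest IP address contributes an extra half vote.  Votes are counted
    in half-vote units to avoid fractions.
    """
    # Compare as IP addresses, not strings ("10.0.0.9" vs "10.0.0.10").
    lowest = min(all_servers, key=ipaddress.ip_address)
    up = set(up_servers)
    half_votes = 2 * len(up)
    if lowest in up:
        half_votes += 1
    # More than half of N servers means more than N half-votes.
    return half_votes > len(all_servers)


# Hypothetical cells with two and three DB servers.
two = ["192.0.2.10", "192.0.2.20"]
three = two + ["192.0.2.30"]

print(has_quorum(two, ["192.0.2.10"]))                  # lowest IP still up
print(has_quorum(two, ["192.0.2.20"]))                  # lowest IP taken down
print(has_quorum(three, ["192.0.2.20", "192.0.2.30"]))  # one of three down
```

Under this model a two-server cell survives only if the lowest-IP 
server is the one left running, while a three-server cell keeps quorum 
no matter which single server is down.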

> 2.  Move all non-replicated volumes to the second server.  Have
> replicas of the others on both servers.

I'd say to leave the non-primary servers up for a day or two to make 
sure everything is synced and working before you take down the primary.

> Here is what I plan to do:
>
> 3.  Shut down the primary server.  I can do this during regular hours
> because the secondary server will carry all the load.

This should be fine, assuming you don't lose quorum among the DB 
servers.

I upgraded my cell to OpenAFS 1.4.0 by taking down a single server at a 
time, having previously vos moved the data, doing the upgrade, patching 
the systems, rebooting (just to make sure that the services start on 
reboot) and enabling the AFS server processes.
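
As a rough illustration of that one-server-at-a-time sequence, the 
Python sketch below only builds the command lines, it does not run 
them.  The host names, partition and volume names are placeholders, and 
the choice of bos shutdown/startup/status around the upgrade is my 
assumption about how one might script it, not a record of exactly what 
I typed.

```python
def upgrade_plan(server, partition, spare_server, spare_partition, volumes):
    """Build (but do not execute) the commands for upgrading one server."""
    plan = []
    # Move each read/write volume off the server that is about to go down.
    for vol in volumes:
        plan.append([
            "vos", "move", "-id", vol,
            "-fromserver", server, "-frompartition", partition,
            "-toserver", spare_server, "-topartition", spare_partition,
        ])
    # Stop the AFS server processes cleanly before the upgrade.
    plan.append(["bos", "shutdown", server, "-wait"])
    # The OS upgrade, patching and reboot happen here, outside this sketch.
    # Afterwards, start the AFS server processes and check on them.
    plan.append(["bos", "startup", server])
    plan.append(["bos", "status", server])
    return plan


# Hypothetical names, for illustration only.
for cmd in upgrade_plan("fs1.example.org", "a", "fs2.example.org", "a",
                        ["user.alice", "user.bob"]):
    print(" ".join(cmd))
```

Printing the plan first gives you something to review before any 
volume is actually moved.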

> 4.  Install the new OS (RHEL 4) on a new partition.  Install OpenAFS
> 1.4.0 but don't start it.
>
> 5.  Copy /usr/afs/db, /usr/afs/etc/, and /usr/afs/local from the old
> system partition to the new one. Mount /vicepa same as on the old
> system.

You should NOT copy /usr/afs/db.  These DBs will replicate automatically 
from the other server and there is no need to pre-populate that 
directory.  In fact, doing so may cause problems.  You can also have all 
kinds of issues if you copy the sysid file (in /usr/afs/local) from 
another server (this might be better now, but in general copying unique 
identifiers is NOT a good idea.)  Also be aware that different servers 
may have different NetRestrict or NetInfo files and you don't want to 
copy them.
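
A minimal Python sketch of a copy step that follows this advice: it 
copies only the etc and local directories, never db, and skips the 
per-server files.  The function name and the skip list (sysid plus the 
NetInfo and NetRestrict interface files) are my assumptions; adjust 
them to whatever your servers actually have.

```python
import shutil
from pathlib import Path

# Files that identify or describe one particular server (assumed list).
PER_SERVER_FILES = ("sysid", "NetInfo", "NetRestrict")


def copy_afs_config(old_root, new_root):
    """Copy etc/ and local/ from old_root to new_root; db/ is never copied.

    Returns the list of subdirectories that were actually copied.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    copied = []
    for sub in ("etc", "local"):
        src = old_root / sub
        if not src.is_dir():
            continue
        shutil.copytree(
            src,
            new_root / sub,
            ignore=shutil.ignore_patterns(*PER_SERVER_FILES),
            dirs_exist_ok=True,  # needs Python 3.8 or later
        )
        copied.append(sub)
    return copied
```

For example, copy_afs_config("/mnt/old/usr/afs", "/usr/afs") would 
bring over CellServDB, ThisCell, KeyFile, BosConfig and so on, where 
/mnt/old is a hypothetical mount point for the old system partition.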

There are details about most of these things on the wiki, which is 
unfortunately still down.

-----

Note that this is based upon what I read in various on-line sources 
several years ago when I was planning our AFS cell.  Things may have 
changed since then and I assume that someone will correct anything that 
is totally wrong.

I'm sure there are several people willing to answer questions in real 
time on #openafs on the freenode IRC network, if you like that sort of 
thing.

<<CDC
-- 
Christopher D. Clausen
ACM@UIUC SysAdmin