[OpenAFS] AFS DB server upgrade advice

Jeff White jaw171@pitt.edu
Thu, 14 Feb 2013 14:29:25 -0500

We just moved our DB servers from Solaris 8 to RHEL6, though we were 
running 1.2 AFS code.  We reused the same IPs, so here's what we did:

1. Stop the AFS processes on the old DB server with the highest IP, then 
wait a few minutes
2. Repeat on the next highest IP
3. Repeat on the final DB server
4. Grab the DB files, pull the network cables, and give the new systems 
those IPs
5. Put the DB files on the new server with the lowest IP
6. Test with a client only pointing to that one server
7. Start the AFS processes on the other two servers with higher IPs and 
wait for ubik to move the DBs over
8. Test some more
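
The ordering in the steps above depends on comparing the IPs numerically, 
not as strings.  A minimal sketch of working out the stop and start order, 
assuming IPv4 addresses; the function names and the example addresses are 
made up for illustration and are not from our cell:

```python
import ipaddress


def shutdown_order(db_servers):
    """Return DB server addresses sorted highest IP first (stop in this order)."""
    return sorted(db_servers, key=ipaddress.ip_address, reverse=True)


def startup_order(db_servers):
    """Return DB server addresses sorted lowest IP first (start in this order)."""
    return sorted(db_servers, key=ipaddress.ip_address)


if __name__ == "__main__":
    # Hypothetical addresses from the documentation range 192.0.2.0/24.
    servers = ["192.0.2.20", "192.0.2.3", "192.0.2.100"]
    print("stop order: ", shutdown_order(servers))
    print("start order:", startup_order(servers))
```

A plain string sort would put 192.0.2.100 ahead of 192.0.2.3, which is why 
the sketch converts each address with ipaddress.ip_address first.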

It was a painless process, though it took vlserver's DB a half hour or so 
to make it to the other servers.  I guess ubik doesn't like propagating 
the DB while it is being changed on the master.
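
One way to watch that propagation is to run udebug against each DB server 
and compare what it reports.  A rough sketch that only builds the command 
lines, assuming the usual ubik ports (7003 for vlserver, 7002 for 
ptserver); the host names are placeholders and the helper name is made up:

```python
import shlex

# Assumed ubik service ports; adjust if your cell runs other DB services.
UBIK_PORTS = {"ptserver": 7002, "vlserver": 7003}


def udebug_commands(hosts, ports=UBIK_PORTS):
    """Build one 'udebug -server HOST -port PORT' argv list per host and service."""
    commands = []
    for host in hosts:
        for _name, port in sorted(ports.items()):
            commands.append(["udebug", "-server", host, "-port", str(port)])
    return commands


if __name__ == "__main__":
    # Placeholder host names; substitute your own DB servers.
    for argv in udebug_commands(["afsdb1.example.edu", "afsdb2.example.edu"]):
        print(shlex.join(argv))
```

The sketch prints the commands rather than running them, so it can be 
checked on a machine without the OpenAFS tools installed.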

Yes, our method was a full outage, but it's the safest option.  I'm not 
sure about your cell where you have to change IPs; I've never done that 
to DB servers.  In any case, I would not mix DB server versions in the 
same cell.

Jeff White - GNU+Linux Systems Administrator
University of Pittsburgh - CSSD

On 02/14/2013 02:10 PM, Jack Neely wrote:
> Greetings,
> I'm planning out an upgrade path for our AFS DB servers.  A lot of
> changes need to be made, so I wanted to see if folks know of a
> safer/better way to do the below.
> We have 3 cells, each of which has 3 DB servers running Solaris 8 and
> OpenAFS 1.4.7.  We would like to upgrade to RHEL 6 based servers running
> OpenAFS 1.6.1.  Two DB servers in the EOS cell need new IPs, the other
> IPs will stay the same.  99% of our clients use --afsdb, except the few
> remaining Solaris 8 machines.
> Our plan for this is below.  One server at a time, we start with the
> highest IP to do the sync master last.
> 1 Shut down the AFS server processes
> 2 Take backups
> 3 Build/Install 1.6.1 on the server
> 4 Recreate an empty db/ directory unless we are upgrading the sync
>    master -- it gets a pre-populated db and config files (upclient
>    master)
> 5 Start the AFS server processes
> 6 Check for sanity
> Once this process has produced a stable DB environment, we had planned on
> running the process again, with step 3 replaced by shutting down the
> Solaris machine and replacing it with a RHEL box.
> In the EOS cell 2 servers must change IPs.  One of these new servers
> will end up being the sync master as it will have the lowest IP.  (The
> current sync master gets to keep its IP.)  So we were thinking about a
> similar upgrade path.
> 1 Upgrade all Solaris servers to OpenAFS 1.6.1
> 2 Create two new RHEL/1.6.1 DB servers on what will be the new IPs and
>    let one become the new sync master.  Update DNS records and CellServDB
>    files on the other EOS DB servers as appropriate.  5 DB servers at
>    this point.
> 3 Check for sanity.
> 4 Upgrade the old sync master to RHEL via the process above
> 5 Remove the Solaris DBs from afsdb DNS records and CellServDBs on the
>    EOS DB servers.
> 6 Monitor existing Solaris servers to see usage drop off
> 7 Shut down the 2 existing Solaris servers and surplus them
> Any recommendations for making this process any smoother?
> Jack