[OpenAFS] Procedure for changing database server IP addresses

Jason Edgecombe geekrampaging@gmail.com
Tue, 17 Jan 2017 23:15:49 -0500

On 01/17/2017 03:45 PM, Stephen Joyce wrote:
> I know the current best-practice for changing the IP addresses of AFS 
> database servers is don't do it.
> But assuming that I want/need to change IPs and have available 
> hardware, is the use of clone dbservers the preferred method? I can 
> tolerate short service interruptions of up to a few minutes as long as 
> they're planned for low-utilization times.
> Initial condition is 3 dbservers ("OLD") located via AFSDB & SRV, 
> running 1.6.x. Desired final condition is 3 dbservers ("NEW") with 
> different IP addresses, also running 1.6.x (for now).
> I'm roughing out a procedure, but my current thinking involves..
>  add 3 NEW dbservers as r/o clones (restarting db procs)
>  modify DNS to show all 6 IPs.
>  'fs newcell' or restart all afsd's (including on servers)
>  swap clone/non-clone roles so that NEW dbservers are r/w and OLD 
> dbservers are r/o clones (restarting db procs). At this point, sync 
> must be a non-clone, r/w "NEW" server. Verify with udebug. Any client 
> afsd's not restarted/newcell'ed won't be able to make pt/vl changes.
>  modify DNS to show only 3 NEW IPs
>  'fs newcell' or restart of all afsd's (including on servers)
>  remove 3 OLD dbservers which must be r/o clones (restarting db 
> procs). Any client afsd's not restarted/newcell'ed won't be able to 
> query pt/vlservers.
> Because it could take some time to restart/newcell all clients, I'm 
> thinking of doing the clone addition/dns steps then waiting some time 
> (week+) before doing the role swap and second dns change. Then waiting 
> another period of time (week+) before doing the last removal.
> I'm assuming that I can use -auditlog (or even a packet sniffer) to 
> see what clients might still be using the OLD dbservers prior to the 
> final decommissioning.
> Seems a bit too simple. What am I missing? 
I think that you're overthinking it. I don't think that there is a need 
to use DB clones. You should be able to simply run 6 DB servers, with the 
caveat that the DB servers don't see insane amounts of traffic or latency.

I suggest the following:
1. Add the 3 new servers as full DB servers (not clones).
2. Update CellServDB on all servers to have all 6 IPs. Restart AFS 
server processes.
2b. Use udebug to ensure that all sites are current after adding new IPs.
3. Change DNS/CellServDB to show only the 3 NEW IPs.
4. Restart all clients so that they use only the new IPs.
5. Wait a week or so to monitor clients.
6. Remove the old IPs from CellServDB on the servers.
7. Restart all AFS server processes.
8. Shut down the old AFS DB/cell servers.
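
To make the sequence above concrete, here is a minimal sketch that prints what the server-side and client-side server lists would contain at each stage, plus the udebug invocations to run by hand while all six servers are up. The cell name, hostnames, and addresses are hypothetical placeholders (documentation IP ranges), and the helper names are made up for illustration. It assumes the usual CellServDB layout (a ">cell" line followed by one "IP #hostname" line per server) and the standard ptserver/vlserver ports 7002 and 7003.

```python
# Sketch only: cell name, hostnames, and addresses below are hypothetical
# placeholders (documentation IP ranges), not values from this thread.
CELL = "example.org"
OLD = {
    "192.0.2.11": "olddb1.example.org",
    "192.0.2.12": "olddb2.example.org",
    "192.0.2.13": "olddb3.example.org",
}
NEW = {
    "198.51.100.11": "newdb1.example.org",
    "198.51.100.12": "newdb2.example.org",
    "198.51.100.13": "newdb3.example.org",
}

# Standard AFS database server ports that udebug is pointed at.
UBIK_PORTS = {"ptserver": 7002, "vlserver": 7003}


def render_cellservdb(cell, servers):
    """Render one cell entry: a '>cell' line, then 'IP  #hostname' per server."""
    lines = [f">{cell}\t#{cell}"]
    lines += [f"{ip}\t#{host}" for ip, host in servers.items()]
    return "\n".join(lines) + "\n"


def udebug_commands(servers):
    """List the udebug invocations to run by hand against every DB server."""
    return [
        f"udebug -server {ip} -port {port}"
        for ip in servers
        for port in UBIK_PORTS.values()
    ]


def plan(old, new):
    """Return (stage, server-side set, client-side set) for the migration."""
    both = {**old, **new}
    return [
        ("NEW added alongside OLD", both, old),
        ("clients moved to NEW", both, new),
        ("OLD removed from servers", new, new),
    ]


if __name__ == "__main__":
    for stage, server_side, client_side in plan(OLD, NEW):
        print(f"== {stage} ==")
        print("server CellServDB:")
        print(render_cellservdb(CELL, server_side), end="")
        print("client CellServDB / DNS:")
        print(render_cellservdb(CELL, client_side), end="")
    print("== checks to run while all six servers are up ==")
    for cmd in udebug_commands({**OLD, **NEW}):
        print(cmd)
```

On the servers, CellServDB is usually changed with bos addhost and bos removehost rather than edited by hand, so treat the rendered text as a checklist of what each file should end up containing, not as something to copy into place.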