[OpenAFS] Re: DB servers "quorum" and OpenAFS tools

Andrew Deason adeason@sinenomine.net
Thu, 23 Jan 2014 14:32:58 -0600


On Thu, 23 Jan 2014 15:39:03 +0000
pg@afs.list.sabi.co.UK (Peter Grandi) wrote:

> > Oh also, I'm not sure why you're adding the new machines to
> > the CellServDB before the new server is up. You could bring up
> > e.g. dbserver 4, and only after you're sure it's up and
> > available, then add it to the client CellServDB. Then remove
> > dbserver #3 from the client CellServDB, and then turn off
> > dbserver #3.
> 
> As mentioned at greater length in a just-sent message, to minimize the
> number of DB daemon restarts/resets.

I thought you were already migrating one server at a time, so my
suggestion doesn't affect the number of dbserver restarts. In fact, what
I was suggesting doesn't really touch the servers at all; the only
change is when you modify the client-side CellServDB relative to when
you perform the dbserver migrations.
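
As a rough sketch of the ordering I mean (Python, purely a model: the
sets and the migrate() function are hypothetical stand-ins, nothing here
runs real OpenAFS tools, and server-side CellServDB changes are not
modeled at all):

```python
# Illustrative model only: plain sets stand in for "dbservers that are
# running" and "dbservers listed in the client CellServDB".

def migrate(running, client_db, old, new):
    """Swap dbserver `old` for `new`; return a snapshot after each step."""
    running = set(running)
    client_db = set(client_db)
    steps = []

    def record(step):
        # Invariant: clients never list a dbserver that is not running.
        assert client_db <= running, step + ": clients list a down server"
        steps.append((step, sorted(running), sorted(client_db)))

    running.add(new)          # 1. bring the new dbserver up first
    record("start " + new)
    client_db.add(new)        # 2. only then tell clients about it
    record("add " + new + " to client CellServDB")
    client_db.discard(old)    # 3. stop pointing clients at the old one
    record("remove " + old + " from client CellServDB")
    running.discard(old)      # 4. finally turn the old dbserver off
    record("stop " + old)
    return steps
```

Calling migrate({"db1", "db2", "db3"}, {"db1", "db2", "db3"}, "db3", "db4")
walks the four steps without the invariant ever failing; swapping steps
1 and 2, or 3 and 4, would trip the assertion.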

> But even without that motivation, if I have a cell with 4 DB servers,
> the fact that 1 is down should really have no or little noticeable
> impact, or else the redundancy they provide is not that worthwhile...

I think it's quite worthwhile to have the database be available yet
"slow", as opposed to not being available at all.

But yes, I believe there is at least one way in which the relevant code
might be improved. In the meantime, there are procedures that existing
sites already follow with the current code to mostly avoid the problems
you are seeing. It's up to you whether you want to use them.

-- 
Andrew Deason
adeason@sinenomine.net