[OpenAFS] Re: about failover - 2 servers (one "master" one replicas) - a bit long

Mon, 22 Mar 2010 09:28:09 -0500

On Mon, 22 Mar 2010 14:01:00 +0000
Vladimir Konrad <v.konrad@lse.ac.uk> wrote:

> > > Ideally, it should work even if A goes down and the read-only
> > > volumes are converted to read/write.
> > 
> > OpenAFS is not designed for automatic failover.
> 
> Cheers, I forgot to say _by hand_.

You can do this with 'vos convertROtoRW', but it's intended more as a
tool for disaster recovery (when you've permanently lost the RW, and
all you have are ROs), not as a way of keeping up availability while a
server is temporarily down.

Note that if A goes down, you convertROtoRW on B, and A comes back up,
you'll now have 2 copies of the RW. The one on B will be the one used,
but A has another copy that may contain data you want. This can get
rather confusing if you try to sync the VLDB with the list of volumes
that are on each server.
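
To make the duplicate-RW situation concrete, here is a minimal sketch
of a check you could run after A comes back. It assumes you have
already collected, per server, the names of the RW volumes that
actually exist on disk (for example from 'vos listvol' output; the
parsing is not shown). The function name and the sample server and
volume names are hypothetical.

```python
def find_duplicate_rw(rw_by_server):
    """Map each volume that has an RW copy on more than one server
    to the sorted list of servers holding it."""
    sites = {}
    for server, volumes in rw_by_server.items():
        for vol in volumes:
            sites.setdefault(vol, []).append(server)
    # Keep only volumes seen on two or more servers.
    return {vol: sorted(srvs) for vol, srvs in sites.items() if len(srvs) > 1}


if __name__ == "__main__":
    # Hypothetical inventory: A still holds its old RW copies,
    # B holds the copy created by the conversion.
    inventory = {
        "A": {"home.alice", "proj.x"},
        "B": {"home.alice"},
    }
    for vol, servers in find_duplicate_rw(inventory).items():
        print(vol, "has RW copies on", ", ".join(servers))
```

Any volume this reports needs a decision by hand about which copy to
keep before you sync anything against the VLDB.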

> Out of curiosity - was the automatic fail-over contemplated (I know
> this kind of thing is not usually straight forward)?

Automatic failover has been done using multiple servers sharing the same
backend storage; I don't think anyone's done it with separate storage,
but we're not stopping you from doing so. You could in theory do
something like that with some other HA software and some scripts that
issue 'vos' commands to do the conversions.
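
As a rough sketch of what such a script might look like (not a tested
failover solution): the code below builds one 'vos convertROtoRW'
command per volume and, by default, only prints the commands instead
of running them. The helper names, the server name, the partition and
the volume list are all made up for illustration, and authentication
and any coordination with the HA software are left out entirely.

```python
import subprocess


def convert_cmd(server, partition, volume, force=False):
    """Build the argv for converting the RO copy of one volume to RW."""
    cmd = ["vos", "convertROtoRW",
           "-server", server, "-partition", partition, "-id", volume]
    if force:
        # -force skips the interactive confirmation prompt.
        cmd.append("-force")
    return cmd


def fail_over(server, partition, volumes, dry_run=True):
    """Convert each listed volume on the surviving server.

    With dry_run=True (the default) the commands are only printed.
    """
    cmds = [convert_cmd(server, partition, v, force=True) for v in volumes]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError if vos exits non-zero.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Hypothetical server, partition and volumes.
    fail_over("fs-b.example.org", "/vicepa", ["home.alice", "proj.x"])
```

The dry-run default is deliberate: the conversion is not something you
want triggered by accident, and you still have to deal with the old RW
copies by hand once the original server returns.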

But it's usually a lot easier if you can just treat RO volumes as
high-availability, and RW volumes not.

-- 
Andrew Deason
adeason@sinenomine.net