[OpenAFS] Re: about failover - 2 servers (one "master" one replicas) - a bit long

Andrew Deason adeason@sinenomine.net
Mon, 22 Mar 2010 10:45:21 -0500


On Mon, 22 Mar 2010 15:00:55 +0000
Vladimir Konrad <v.konrad@lse.ac.uk> wrote:

> > But it's usually a lot easier if you can just treat RO volumes as
> > high-availability, and RW volumes not.
> 
> Makes sense, it looks like having multiple RW volumes would not scale
> that well - writes would have to go to each volume, + synchronisation
> would get messy I guess...

I think the hardest part is conflict resolution, but I'm not too
familiar with it. Coda is able to do RW replication, but as I recall it
can require manual conflict resolution (if 2 writes happen at the same
time, you must manually specify which one wins).

I believe there have been at least one or two attempts to do this
in-band in AFS (you can read about one proposed way of doing it at
<http://www.student.nada.kth.se/~noora/exjobb/filer.html>). But nobody's
been able to do it yet; it is a hard problem to solve. It's also one of
the suggested OpenAFS GSOC projects: <http://www.openafs.org/gsoc.html>.

> Do I understand it correctly (observation), a read-only replica placed
> on the same partition as the read-write volume does not "cost" much in
> terms of disc-space?

Yes, as long as your RW does not differ much from your RO. That is one
reason why it's almost always a good idea to have an RO on the same
server/partition as the RW, if you have any ROs for that RW.

> I have released a few replicas and the disc usage did not go up. Is it
> along the principle of LVM snapshots?

Sort of, but arguably not as good. With LVM snapshots and similar
systems, you get charged space for each block that is changed. With
OpenAFS volume clones, you get charged for each whole file (vnode) that
is changed, no matter how small the change to it was.
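
To put rough numbers on that difference, here is a small Python sketch.
It is only a toy model, not OpenAFS or LVM code: the 4096-byte block
size, the function names, and the assumption that each change is one
contiguous, block-aligned run of bytes are all made up for illustration.

```python
import math

BLOCK_SIZE = 4096  # assumed block size in bytes, purely illustrative


def block_level_charge(changes, block_size=BLOCK_SIZE):
    """Extra bytes charged when only the changed blocks are copied.

    `changes` is a list of (file_size, bytes_changed) pairs. Each change
    is assumed to be one contiguous, block-aligned run of bytes.
    """
    total = 0
    for _file_size, bytes_changed in changes:
        if bytes_changed > 0:
            total += math.ceil(bytes_changed / block_size) * block_size
    return total


def file_level_charge(changes):
    """Extra bytes charged when every changed file is copied whole."""
    return sum(file_size for file_size, bytes_changed in changes
               if bytes_changed > 0)


MiB = 1024 * 1024
# One 100 MiB file with a single byte changed, one 50 MiB file untouched.
changes = [(100 * MiB, 1), (50 * MiB, 0)]
print(block_level_charge(changes))  # 4096
print(file_level_charge(changes))   # 104857600
```

In this model the untouched file costs nothing either way, which matches
the observation that releasing a replica next to an unchanged RW does
not make disk usage go up. The one-byte change is where the two
approaches part ways.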

-- 
Andrew Deason
adeason@sinenomine.net