[OpenAFS] user home directory replication

Simon Wilkinson sxw@inf.ed.ac.uk
Thu, 15 Jul 2010 13:02:34 +0100

On 15 Jul 2010, at 03:16, Todd Lewis wrote:

> On 07/14/2010 01:43 PM, Jonathan Nilsson wrote:
>> I would like to replicate home directories (and other AFS volumes that
>> are primarily accessed read/write) for the purpose of faster disaster
>> recovery in certain common cases, such as local hardware failure on an
>> AFS File Server.
> Others have offered good advice based on their years of experience
> that should be helpful in making this work. By "this" I mean attempting
> to take advantage of volume replication in a disaster recovery strategy
> as a proxy for high availability.
> My personal opinion here: Don't do this.

I think Todd is perhaps being overcautious here. We (School of
Informatics at Edinburgh University) have been running AFS in this
configuration ever since we first deployed it in production about 5
years ago. It works very well for us, and we started out as novice AFS
users at the start of that deployment process.

There are a few important things to consider before you go down this
route:

*) The problem you are trying to solve. I think you're pretty clear on
this, but just to restate: using read-only replicas is a perfectly good
disaster recovery solution. However, the recovery is a manual, human
process - you're not going to gain high availability by doing so.

*) Promoting a volume from read-only to read-write can require
intervention on clients which have recently accessed that volume
(running fs checkv on each of them).

*) You really need management tools to keep all of your various volume
copies in check. Our approach is that we have read-write partitions and
read-only partitions, so that a given read-write partition has a
corresponding read-only partition on a geographically remote fileserver.
Volume creation and moves are done so as to preserve these
relationships for all volumes (we also, obviously, have the local r/o to
speed up replication).

*) You need to consider whether you're going to offer backup volumes
too. We use backup volumes to offer a 'yesterday' snapshot
functionality, and to support our tape backup system.
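
As a rough illustration of the promotion step in the list above, the
recovery for a single volume might be scripted along the following
lines. This is only a sketch under stated assumptions: the server,
partition and volume names are hypothetical, the helper name
promotion_commands is made up for this example, and the code only
builds the command lines rather than running them. It assumes the
standard "vos convertROtoRW" and "fs checkvolumes" commands ("fs checkv"
is the abbreviated form of the latter).

```python
import shlex


def promotion_commands(server, partition, volume):
    """Build the commands an admin might run to promote a read-only
    replica to read-write after the original RW site has been lost.

    All arguments are illustrative; nothing is executed here.
    """
    return [
        # On an admin machine: convert the surviving RO copy on the
        # given server/partition into the read-write volume.
        ["vos", "convertROtoRW", "-server", server,
         "-partition", partition, "-id", volume],
        # On each client that recently accessed the volume: make the
        # cache manager re-check its volume location information.
        ["fs", "checkvolumes"],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in promotion_commands("fs2.example.org", "/vicepa", "user.jdoe"):
        print(shlex.join(cmd))
```

Building the commands as lists keeps the human in the loop: an operator
can review what would be run before anything touches the fileservers,
which fits the point that this recovery is a manual process.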

I think that's all. For what it's worth, the correct functioning of all
of this is pretty essential to us, and I'm a pretty active OpenAFS
contributor, so it's not likely to break any time soon.
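
The read-write/read-only partition pairing described above lends itself
to a simple automated check. The following is a minimal sketch and not
the tooling actually used at Edinburgh: the PAIRS table, the server
names and the Volume record are all invented for illustration, and in
practice the site data would have to be collected from the VLDB (for
example by parsing "vos listvldb" output, which is not shown here).

```python
from dataclasses import dataclass, field

# Hypothetical pairing table: each read-write (server, partition) maps to
# the remote (server, partition) that must hold a read-only copy.
PAIRS = {
    ("fs1.example.org", "/vicepa"): ("fs2.example.org", "/vicepb"),
    ("fs2.example.org", "/vicepa"): ("fs1.example.org", "/vicepb"),
}


@dataclass
class Volume:
    name: str
    rw_site: tuple                               # (server, partition) of the RW copy
    ro_sites: set = field(default_factory=set)   # set of (server, partition)


def audit(volumes, pairs=PAIRS):
    """Return a list of human-readable problems with replica placement."""
    problems = []
    for vol in volumes:
        expected = pairs.get(vol.rw_site)
        if expected is None:
            problems.append(f"{vol.name}: RW site has no paired RO partition")
            continue
        if expected not in vol.ro_sites:
            problems.append(
                f"{vol.name}: missing remote RO on {expected[0]} {expected[1]}")
        if vol.rw_site not in vol.ro_sites:
            # The local r/o sits alongside the RW copy.
            problems.append(
                f"{vol.name}: missing local RO on {vol.rw_site[0]} {vol.rw_site[1]}")
    return problems


if __name__ == "__main__":
    vols = [
        Volume("user.jdoe", ("fs1.example.org", "/vicepa"),
               {("fs1.example.org", "/vicepa")}),
    ]
    for line in audit(vols):
        print(line)
```

A check like this could be run after every volume creation or move, so
that a volume which has drifted away from its paired partition is
noticed before a failure rather than during one.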