[OpenAFS] AFS design question: implementing AFS over a highly-distributed, low-bandwidth network

Steven Jenkins steven.jenkins@gmail.com
Mon, 19 Jan 2009 22:28:25 -0500

On Mon, Jan 19, 2009 at 9:22 PM, Chaz Chandler <clc31@inbox.com> wrote:
> So, in your considered opinion, would it be wiser to go with one cell, put up with potential quorum snafus, ensure the clients set their preferred server to a local one, and move R/W volumes when users move locations?  Or to go with multiple cells, perhaps one as master, and resolving ambiguities on a per-volume basis depending on how that volume is intended to be used?

I see it as follows:

Option 1: single cell; leverage vos move, with some tools to handle
volume replication, but you'll probably be able to leverage existing
OpenAFS commands without needing a lot of further automation.

- simple in the read-only case
- will work even when clients access volumes at a remote site
- will require some development, but not as much as the other option
- disconnected operations may make this a nice option down the road
- there may be something you could do with vos shadow to help get
copies of data into each site

- quorum will probably be a problem; a long enough outage can mean that
none of your clients can function.  That's a pretty bad worst-case
scenario (i.e., this configuration does not degrade gracefully)
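
To make the "move R/W volumes when users move locations" part of Option 1
concrete, here is a minimal sketch that only builds the vos move command
line.  The volume name, host names and partition in the usage note are
made-up placeholders, and the helper name is my own, not an OpenAFS tool.

```python
def vos_move_cmd(volume, from_server, from_part, to_server, to_part, cell=None):
    """Build the argv for moving a read/write volume between file servers.

    Only the command line is assembled here; nothing is executed.
    """
    cmd = [
        "vos", "move",
        "-id", volume,
        "-fromserver", from_server,
        "-frompartition", from_part,
        "-toserver", to_server,
        "-topartition", to_part,
    ]
    if cell:
        # Optional: name the cell explicitly instead of using the local default.
        cmd += ["-cell", cell]
    return cmd
```

For example, vos_move_cmd("user.chaz", "fs1.site-a.example.com", "/vicepa",
"fs1.site-b.example.com", "/vicepa") returns a list that could be handed to
subprocess.run(..., check=True) by someone holding admin tokens.  Over a slow
link the move itself will take a while, so it is worth timing one on a
realistically sized volume first.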

Option 2: each site is a separate cell; use incremental vos dump + vos
restore to move volumes from cell to cell (i.e., site to site), with
regular dumping+restoring taking place to ensure no site is more than
a day out of date, plus some changes to higher level (e.g.,
'container' volumes) and some recovery tools to help in an outage.

- no ubik quorum problems
- loss of remote site won't impact local accesses (but will impact
remote accesses; however, the failure case will be to access something
at most 1 day old).
- each site will have a copy of the data, even if it is slightly out
of date, so a local operation could be done manually to help 'get back
in business'

- more complicated
- will require more development than the other option
- messing up a container volume and pushing it out will trash everyone
(implication: don't mess up a container volume; check container
volumes before pushing them out.)
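
As a rough illustration of the dump+restore cycle in Option 2, the sketch
below builds the two command lines for one volume: an incremental vos dump
in the source cell and a vos restore with incremental overwrite in the
destination cell.  The function names, cell names and paths are
hypothetical, and a real tool would also need to get the dump file between
sites and handle authentication in both cells.

```python
from datetime import datetime


def vos_dump_cmd(volume, since, dump_file, cell):
    """Build the argv for dumping a volume in the source cell.

    since=None asks for a full dump (-time 0); otherwise only changes
    made after that timestamp are dumped.
    """
    when = "0" if since is None else since.strftime("%m/%d/%Y %H:%M")
    return ["vos", "dump", "-id", volume, "-time", when,
            "-file", dump_file, "-cell", cell]


def vos_restore_cmd(volume, server, partition, dump_file, cell, incremental=True):
    """Build the argv for restoring that dump in the destination cell."""
    mode = "incremental" if incremental else "full"
    return ["vos", "restore", "-server", server, "-partition", partition,
            "-name", volume, "-file", dump_file,
            "-overwrite", mode, "-cell", cell]
```

A nightly job would call vos_dump_cmd with the time of the last successful
run, ship the file, then run the matching vos_restore_cmd at each other
site.  The first transfer of a volume has to be a full dump (since=None,
incremental=False).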

I would first test to see how having a single cell works with your
network (i.e., measure network utilization) -- a simpler environment
would be great if it will work.  But I would plan to go with multiple
cells and use incremental dump and restore to move volumes around.

But I also have experience with that type of operation, and I've
wanted an excuse to write something open source that handles some of
the common cases Morgan Stanley's system does, so I'm very biased in
that decision.  As someone new to OpenAFS, that may be more complexity
than you want to tackle.

Steven Jenkins
End Point Corporation