[OpenAFS] Re: Doubts about OpenAFS implementation in a company

Andrew Deason adeason@sinenomine.net
Thu, 19 May 2011 10:59:33 -0500


On Thu, 19 May 2011 08:05:20 +0200
Stanisław Kamiński <stasheck.fora@gmail.com> wrote:

> Since Linux users have their homedirs on AFS, when the AFS servers fail
> for any reason, the only thing they can do is reset their workstation
> and enter emergency mode (i.e. log in without their own homedir, but
> still able to work), losing any work in progress. Again, I don't know
> if it'd be enough to have a fileserver w/o a DB server in a given
> location. I guess I'll go with switching the servers abroad to "clone"
> mode (I didn't even know something like this exists).

A fileserver without access to a dbserver (and thus no VLDB) would at
best let you access files in volumes you've already seen, for up to 2
hours (roughly how long the client caches volume location information),
and maybe not even that. So, yes, it certainly makes sense to want a
dbserver available in each region in that scenario, and dbserver clones
are the way to do that.
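
For what it's worth, a clone is configured on the server side, in each
dbserver's server CellServDB, not in the client CellServDB. Something
roughly like the following is what it looks like, assuming your bos
supports the -clone flag to 'bos addhost'; the hostnames are made up and
I haven't tested this exact sequence, so treat it as a sketch only:

  # Every dbserver (including the clone itself) needs the full list of
  # db sites in its server CellServDB, with the clone marked as such.
  # Repeat for db2.example.com, db-clone.example.com, and so on.
  bos addhost -server db1.example.com -host db-clone.example.com \
      -clone -localauth

  # Check what each server now thinks the cell's dbservers are.
  bos listhosts -server db1.example.com

  # The database processes need a restart to notice the change.
  bos restart -server db1.example.com -instance vlserver ptserver \
      -localauth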

To maybe explain a little bit why you don't want 6 or 7 non-"clone"
sites: the dbservers keep their databases in sync with Ubik, a voting
(quorum) protocol, and more than half of the voting sites have to agree
before anything can be written. Voting recovery times and propagation
delays get worse as you add more voting sites. So, in the case of a
significant outage, it can take longer for the dbservers to become fully
functional again, and any change to the VLDB or PTDB takes longer to
propagate. You may not notice the propagation delays so much, though, if
you don't make a lot of VLDB or PTDB changes and your databases are
small.

But even rather large cells get by with 3 dbservers. The non-clone
dbservers are really just the dbservers that are eligible to become the
"master" (sync) site, which is where all of the changes go, so you
shouldn't need many of them.

> As for relocating volumes - I'm afraid vos dump/restore is not an
> option, because the user can't work in the meantime. I asked about the
> number of cells because it seems normal to me that an organization has
> one cell, but I read in the book (very old, I know) that it might be
> standard practice to create one or more cells per department.

Well, it's probably still possible to do this while the volume stays in
use, with some messing about. Most of a "vos move" is just a
dump/restore anyway, along with changing the volume's VLDB entry and
some volume status.

I haven't tried it, but I think it should be possible to dump/restore,
take the volume offline, dump an incremental and apply it, then remove
the old volume and change the mountpoint in /afs/; see the sketch below.
That's pretty much the equivalent of a normal "vos move", except for
changing the mountpoint; I'm not sure how well that part will actually
work, and of course it won't work if you don't know in advance where all
of the relevant mountpoints are.
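
Very roughly, the sequence might look like the following. This is only a
sketch: the volume, server, and path names are made up, I haven't run
it, and you'll want to check your vos man pages first, in particular for
"vos offline" and "-overwrite incremental", which may not be there in
older releases:

  # 1. Full dump at the old site, restore under a new name at the new
  #    site (the user keeps working against the old volume).
  vos dump -id user.jdoe -time 0 -file /tmp/user.jdoe.full
  vos restore -server newfs.example.com -partition /vicepa \
      -name user.jdoe.new -file /tmp/user.jdoe.full

  # 2. Take the old volume offline so no more writes land on it, then
  #    catch up with an incremental dump from around the time of the
  #    full dump and apply it on top of the restored copy.
  vos offline -server oldfs.example.com -partition /vicepa -id user.jdoe
  vos dump -id user.jdoe -time "05/19/2011 10:00" \
      -file /tmp/user.jdoe.incr
  vos restore -server newfs.example.com -partition /vicepa \
      -name user.jdoe.new -file /tmp/user.jdoe.incr \
      -overwrite incremental

  # 3. Point the mountpoint at the new volume and retire the old one.
  #    (If the mountpoint lives in a replicated volume, vos release its
  #    parent afterwards.)
  fs rmmount /afs/.example.com/user/jdoe
  fs mkmount /afs/.example.com/user/jdoe user.jdoe.new
  vos remove -server oldfs.example.com -partition /vicepa -id user.jdoe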

That's all very non-trivial, though. You (or someone) would need to
write additional tooling to get it to work, which you probably don't
want to do. I just wanted to say that it may be possible to accomplish.

-- 
Andrew Deason
adeason@sinenomine.net