[OpenAFS] Re: backup strategy
Mon, 10 Nov 2014 18:19:31 -0600
On Mon, 10 Nov 2014 11:53:33 -0800
Russell Button <email@example.com> wrote:
> I get the impression that AFS is this amorphous cloud of data storage.
> So when you backup stuff, it's not as if it's organized by machine and
> file system.
It's not really much different for AFS than for most other things. Your
files are stored on various fileservers, and clients access them over
the network. There is some AFS-specific
data that is stored in addition to the contents of your files themselves
(permissions, location data, etc), so if you want to back up that stuff,
you'd need a backup tool that is AFS-aware. But that's the same for a
lot of other things; e.g. software that integrates with backing up
VMWare VMs needs to be aware of and integrate with VMWare.
You _can_ just use a "regular" backup tool to back up AFS, but it's
usually not recommended. There are a couple of ways you could do that,
but they have serious downsides:
1. Point the backup tool at /afs/cell/, like you would for any
AFS-accessing application, and back up each individual file by just
reading it out of AFS. The problems with this approach are that it is
slow and causes a lot of unnecessary load on the fileservers (especially
when data has not changed), and that you lose AFS-specific metadata,
such as AFS directory permissions and AFS mountpoint data. So it would
be pretty annoying to put your cell back together after catastrophic
data loss if all you had was backup data in this form; but you would
still have the data.
2. Point the backup tool at the /vicep* directories on the fileservers
(which is where the fileservers store data for AFS files). This would
ensure that you back up all of the AFS metadata, and is probably more
efficient than the approach in '1.'. However, the files in those
directories are stored in a very particular format and structure, and
you'd need to know how to get the files you're used to back out; and you
are not guaranteed to get a consistent snapshot of the AFS data when you
do this. This approach is similar to backing up data on a local
filesystem by just 'dd'ing a raw image from e.g. /dev/sda4 (except
incrementals would be a bit easier).
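As a rough illustration of the file-level approach in '1.', here is a
sketch in Python (the function name and archive format are my own
choices, and the source path would be something under /afs/cell/; note
that AFS directory ACLs and mountpoints are simply not captured):

```python
import tarfile
from pathlib import Path

def backup_tree(src_dir, archive_path):
    """File-level backup: read every file out of the tree and store it
    in a compressed tar archive. AFS-specific metadata (directory ACLs,
    mountpoint data) is NOT captured this way."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=str(path.relative_to(src)))
    return archive_path
```

This also reads every file on every run, which is exactly the
"unnecessary load on the fileservers" problem described above.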
> With this much data, spread out over 3 geographically distant data
> centers, it's not as if you can do a full dump on the 1st of the month
> and then do daily incrementals for the month, and then start over
> again next month.
Well, you could do that, but yeah, certainly as the size of your data
set increases this gets pretty painful.
However, I thought you had your data organized into AFS volumes that
effectively never changed after a certain point (that is, at Telmate;
Timothy Balcer has posted here before). I may not be remembering that
correctly; it's been a while. If that is correct, though, then you don't
need to worry about incrementals and such for a large amount of data;
you just need to back it up once and then never touch it again (except
perhaps to verify that it indeed never changes).
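For that "back it up once, then just verify it never changes" case, one
lightweight way to do the verification (a sketch; the one-dump-file-per-
volume layout and the function names are assumptions of mine, not
anything AFS provides) is to record a checksum per dump and re-check it
later:

```python
import hashlib
from pathlib import Path

def checksum(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large dumps fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest, dump_dir):
    """Compare stored checksums against the dumps on disk; return the
    names of any dumps that no longer match their recorded checksum."""
    return [name for name, digest in manifest.items()
            if checksum(Path(dump_dir) / name) != digest]
```

You build the manifest once when the data is frozen, then run verify()
periodically from cron or similar.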
> Does anyone Out There have a similar problem, and if so, what strategy
> did you use?
Others can share their own experiences, but I'll at least mention some
of the options.
IIRC Teradactyl likes to trumpet their "synthetic" full dumps, which is
a feature where they use existing incremental and full dumps to generate
a new "full" dump every so often. This addresses what you were talking
about before, because it avoids needing to retain e.g. hundreds of daily
incrementals, but also avoids needing to periodically dump all data at
once.
There is also Stephen Joyce's BackupAFS:
<http://user.physics.unc.edu/~stephen/BackupAFS/>. I'm not sure how many
sites use this, but from what I remember at least Stephen Joyce says it
works well :)
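The "synthetic full" idea itself is simple enough to sketch: merge the
last full dump with the incrementals taken since, newest entry winning.
(Representing a dump as a dict of path -> entry, with None marking a
deletion, is purely illustrative; real dump formats are more involved.)

```python
def synthesize_full(full, incrementals):
    """Build a new 'full' dump from an existing full plus a series of
    incrementals (oldest to newest), without re-reading the live data.
    Later dumps override earlier ones; None marks a deleted entry."""
    merged = dict(full)
    for inc in incrementals:
        for path, entry in inc.items():
            if entry is None:
                merged.pop(path, None)
            else:
                merged[path] = entry
    return merged
```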
There is also the backup system that comes with OpenAFS (I sometimes
refer to it as the "native" OpenAFS backup system). It is a bit wonky to
set up and use, but it does still work, and some places use it
successfully.
Information about it can be found in chapter 6 of the admin guide
"Configuring the AFS Backup System":
<http://docs.openafs.org/AdminGuide/HDRWQ248.html>. That documentation
is very old, but this backup system has been almost unchanged for
probably over a decade, so it's still probably accurate. If the book you
mentioned talks about a backup system without giving it a specific name,
I assume it's talking about this one.
And finally, the option that I think quite a lot of sites use is to
just develop your own scripts that run "vos dump" on everything and
store the dump blobs somewhere. But you need at least a bit of knowledge
about AFS to do that, especially if you want to handle incremental dumps
and such.
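The core of such a script can be sketched like this (volume names and
the dump directory are placeholders; it assumes the usual "vos dump
-time" semantics, where -time 0 is a full dump and a date gives an
incremental of changes since that date -- check "vos help dump" for
your version):

```python
import subprocess
from datetime import datetime

def dump_command(volume, dump_dir, since=None):
    """Build a 'vos dump' command line for one volume. A 'since' date
    makes it an incremental dump; otherwise it's a full dump."""
    timestamp = since.strftime("%m/%d/%Y") if since else "0"
    dump_file = f"{dump_dir}/{volume}.dump"
    return ["vos", "dump", "-id", volume,
            "-time", timestamp, "-file", dump_file]

def dump_all(volumes, dump_dir, since=None, run=subprocess.check_call):
    """Dump every volume. 'run' is injectable so the logic can be
    exercised without a live AFS cell."""
    for vol in volumes:
        run(dump_command(vol, dump_dir, since))
```

A real script would also enumerate volumes (e.g. from "vos listvldb"),
track per-volume dump times, and handle errors and retries.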
There are/were also ways to integrate AFS into some other backup tools,
like TSM, AMANDA, Bacula, and some other commercial ones. I would not
recommend any of the integration tools except for the TSM ones; but I
assume you wouldn't want to run TSM just for backing up AFS.
It could also help, when thinking about this, to nail down a few more of
your requirements (whether or not you discuss them here; at least nail
them down for yourself). You've described a bit about the data you're backing up
(except for how it's organized within AFS, but you may not know that),
but I don't think you've mentioned the restore requirements. For
example, whether you want end-users to be able to restore data
themselves, or whether restoring is an administrative operation. Also, whether you need to
restore data based on /afs file path, or if restoring entire AFS volumes
is okay (I'm not aware of any publicly-available backup solution that
works per-file... except maybe the TSM integration? And the '1.'
approach listed way up top). Do you have any existing backup framework
that you might just want AFS to integrate into?
I'm also not sure if you have an idea of whether you want to be backing
up to tape, disk, or some other media; or maybe you're not sure and are
asking for advice on that as well? :)