[OpenAFS] TSM client for OpenAFS

James E. Blair jeblair@berkeley.edu
Thu, 02 Apr 2009 17:37:39 -0700


Anders Magnusson <ragge@ltu.se> writes:

> Hi,
>
> we are currently deploying OpenAFS here at the university and also
> are stuck with TSM for backups.  Currently there doesn't seems to
> exist any good AFS backup client for TSM, and just storing volume
> dumps is not too appealing, both due to backup storage space and
> simpleness in restoring single files.
>
> So I have written a client that uses the TSM API for backup.  It
> reads the data directly out of a volume and store all files in TSM
> as objects, while preserving ACLs, mountpoints etc.  Doing it this
> way will let AFS backups use the policies for objects, and also
> restores can be performed via dsmc if necessary.

We're in a similar situation here, and considered writing the program
that you did, though instead we went a slightly different route.  Our
service is in a pilot phase, so we have some flexibility for
experimentation.

We wrote a script that sets up a TSM environment and calls dsmc to
perform the backups.  The script looks for mountpoints, both to record
them for future restores and to add them to an exclude list that
prevents unwanted recursion.  It also writes directory ACLs to a
metadata file.
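
As a rough illustration only (this is not the script described above),
the mountpoint step might look like the Python sketch below.  The
names find_mountpoints and exclude_lines are hypothetical, and the
is_mountpoint test is passed in by the caller; on a real AFS client it
could wrap a call to "fs lsmount", which is an assumption here.  The
"exclude.dir" lines are meant for a TSM include-exclude file; check
the exact syntax against your client version.

```python
import os


def find_mountpoints(root, is_mountpoint):
    """Return directories under root for which is_mountpoint(path) is true.

    The walk never descends into a mountpoint, so a recursive mount
    cannot send it into a loop.  is_mountpoint is supplied by the
    caller (hypothetical hook; not part of any AFS or TSM API).
    """
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        keep = []
        for name in sorted(dirnames):
            path = os.path.join(dirpath, name)
            if is_mountpoint(path):
                found.append(path)   # record for future restores
            else:
                keep.append(name)
        dirnames[:] = keep           # prune: do not walk into mountpoints
    return found


def exclude_lines(mountpoints):
    """Format one exclude.dir line per mountpoint for the exclude list."""
    return ["exclude.dir " + path for path in mountpoints]
```

The same list of mountpoints serves both purposes mentioned above: it
can be saved with the volume's metadata and written to the exclude
list before dsmc runs.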

We back up each volume to one filespace, and the metadata (ACLs,
mounts) of each volume to another.  When we dump the metadata, we
compare the hash of the file against the hash from the last time we
generated it, and skip backing it up if it hasn't changed.
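
A minimal sketch of that hash check, under assumptions: SHA-256 as the
hash and a small state file holding the previous digest are choices
made for illustration, and the names file_digest and metadata_changed
are hypothetical.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of the file at path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def metadata_changed(meta_path, state_path):
    """Return True if the metadata file differs from the previous run.

    The previous digest is read from state_path (assumed location);
    when the digest differs or no state exists, the new digest is
    stored and True is returned.
    """
    new = file_digest(meta_path)
    try:
        with open(state_path) as f:
            old = f.read().strip()
    except FileNotFoundError:
        old = None               # first run: nothing to compare against
    if new == old:
        return False             # unchanged, so the backup can be skipped
    with open(state_path, "w") as f:
        f.write(new + "\n")
    return True
```

In a real script it would probably be safer to store the new digest
only after the metadata backup has succeeded, so that a failed run is
retried next time.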

We actually back up the .backup volumes (i.e., snapshots), so that
the data are consistent during the backup.  That way an errant
recursive mountpoint can't sneak in and ruin our day.
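
For illustration, refreshing a snapshot before a run could be driven
by something like the sketch below.  It only builds the names and the
command line; how the snapshot is then made visible to dsmc is not
covered here.  The helper names are hypothetical, and the assumption
is that "vos backup -id <volume>" is used to create or refresh the
<volume>.backup clone.

```python
def backup_volume_name(volume):
    """Return the name of the snapshot (backup clone) of a volume."""
    return volume + ".backup"


def snapshot_command(volume):
    """Build the argv for refreshing the volume's .backup clone.

    Assumes the standard "vos backup" command; the caller is expected
    to run it (for example with subprocess.run) under suitable
    credentials.
    """
    return ["vos", "backup", "-id", volume]
```

Running the command just before the backup of each volume keeps the
snapshot as fresh as possible without affecting the read/write volume.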

We have about 400 volumes with 1.5 terabytes, and it takes us about
2.5 hours to work through that.  It's not relevant to AFS, but just in
terms of how TSM scales, our email system has 10 filespaces with 1TB
and 20 million objects each, with each backup taking about 10 hours on
average.

> There are a few nits, though, that I haven't found a good way to
> handle.  Any suggestions are welcome :-)
>
> First is the storage of AFS data inside of TSM.  TSM has three
> identifiers for an object:
> - filespace (typically mount point)
> - High-level name (path inside mount point)
> - Low-level name (filename)
> Ideally the filespace should be the volume name, but TSM gets
> _really_ slow if there are too many (a few hundred) filespaces.
> Currently I just give it the cellname, and stores the volume name in
> the HL name (like /volume/path-in-volume).  Other ideas?

That's interesting.  Is that limit per-node or per-TSM-server?

> So, if someone beside us need to use TSM for AFS and are interested
> in using this client, feel free to give comments/ideas/whatever... :-)

This is very interesting.  Our script is doing well for the moment,
but a solid API client may be preferable in the long run.

I thought I remembered something in the API docs indicating that you
may not be able to use dsmc to restore something that was stored using
the API, but you mention that you could use dsmc for restores if
necessary.  Have you tried that, and were there any issues?

Thanks,

James E. Blair
UC Berkeley - IST