[OpenAFS] TSM client for OpenAFS

Harald Barth haba@kth.se
Thu, 02 Apr 2009 18:08:49 +0200 (CEST)


> we are currently deploying OpenAFS here at the university and also are stuck with TSM
> for backups.

There could be worse things. PDC uses TSM and we use homegrown logic
to do full and incremental dumps. These are then piped into a program
that uses the TSM API. The only problem is that we do not have a tool
for the users to request restores. But this has not been too pressing
a burden yet.

> Currently there doesn't seem to exist any good AFS backup client for TSM,

The company with the dinosaur? name (teradactyl.com) makes one, but
the last time I applied their pricing to my department size it was not
so attractive any more. In addition, we need TSM anyway, and for TiBS I
would have needed a separate infrastructure in front of the tape
library.

> and just storing volume dumps is not too appealing, both due to backup storage space and
> simpleness in restoring single files.

Why is the space bigger? Don't you do incremental dumps?

> So I have written a client that uses the TSM API for backup.  

I suspected some ongoing work for AFS backups (considering your
earlier questions).

> It reads the data directly out
> of a volume and stores all files in TSM as objects, while preserving ACLs, mountpoints etc.
> Doing it this way will let AFS backups use the policies for objects, and also restores can
> be performed via dsmc if necessary.

Nice.

> There are a few nits, though, that I haven't found a good way to handle.  Any suggestions
> are welcome :-)
> 
> First is the storage of AFS data inside of TSM.  TSM has three identifiers for an object:
> - filespace (typically mount point)
> - High-level name (path inside mount point)
> - Low-level name (filename)
> Ideally the filespace should be the volume name, but TSM gets _really_ slow if there are
> too many (a few hundred) filespaces.  Currently I just give it the cellname, and store
> the volume name in the HL name (like /volume/path-in-volume).  Other ideas?

Good that you have tested the filespace == volume thing. But the
question "what happens when you have some hundred filespaces" could be
a relevant one for the TSM 6 devel team. (TSM 6 is a major redesign, I
was told.)

Another numbers question: We currently have at least 98434720 files in
our cell. I don't know how TSM would react to that many in one
filespace either.
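
To make the naming scheme under discussion concrete, here is a minimal
sketch of the mapping described in the quoted text: the cell name as
the filespace, the volume name as the first component of the high-level
name, and the filename as the low-level name. The function and type
names (tsm_name, TsmObjectName) are made up for illustration, and the
sketch does not model whatever delimiter conventions the real TSM API
expects for these fields.

```python
import posixpath
from typing import NamedTuple


class TsmObjectName(NamedTuple):
    """Hypothetical holder for the three TSM object identifiers."""
    filespace: str   # here: the AFS cell name
    high_level: str  # here: /volume/directory-inside-volume
    low_level: str   # here: the bare filename


def tsm_name(cell: str, volume: str, path_in_volume: str) -> TsmObjectName:
    """Map an AFS (cell, volume, path) triple to a TSM-style name triple."""
    # Normalise the path and drop leading slashes so that "docs/a" and
    # "/docs/a" are treated the same way.
    rel = posixpath.normpath("/" + path_in_volume).lstrip("/")
    directory, filename = posixpath.split(rel)
    # The volume name becomes the first component of the high-level name.
    high_level = "/" + volume
    if directory:
        high_level += "/" + directory
    return TsmObjectName(cell, high_level, filename)


if __name__ == "__main__":
    # Example values only; the cell and volume names are invented.
    print(tsm_name("example.org", "user.alice", "docs/notes/todo.txt"))
```

With this layout every object in the cell lands in a single filespace,
which is exactly the case the file-count question above is about.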

> Second is the storage of attributes and ACLs.  There is a 255-byte space available for
> storing object attributes connected to each object.  This is not enough to store the ACLs
> as clear-text, so I have to do pts lookups to translate them to their internal numbers
> and store as such in the attribute block.  Any better ideas of how to do this?
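
As a rough illustration of the numeric approach described in the quoted
text, the following sketch packs ACL entries that have already been
translated to (pts id, rights bitmask) pairs into a fixed-size binary
blob and checks it against the 255-byte limit. The layout (two count
bytes followed by 8 bytes per entry) and the names pack_acl and
unpack_acl are assumptions made for this example, not the format the
client actually uses, and the pts lookup itself is left out.

```python
import struct

ATTR_LIMIT = 255  # size of the per-object attribute space named in the post

# Assumed layout: one byte each for the number of positive and negative
# entries, then per entry a signed 32-bit id and an unsigned 32-bit
# rights bitmask, all big-endian.
_HEADER = struct.Struct(">BB")
_ENTRY = struct.Struct(">iI")


def pack_acl(positive, negative, budget=ATTR_LIMIT):
    """Pack two lists of (pts_id, rights) pairs into a bytes blob."""
    entries = list(positive) + list(negative)
    size = _HEADER.size + _ENTRY.size * len(entries)
    # Refuse anything that would not fit into the available space.
    if size > min(budget, ATTR_LIMIT):
        raise ValueError("ACL needs %d bytes, only %d available" % (size, budget))
    blob = _HEADER.pack(len(positive), len(negative))
    for pts_id, rights in entries:
        blob += _ENTRY.pack(pts_id, rights)
    return blob


def unpack_acl(blob):
    """Inverse of pack_acl: return (positive, negative) entry lists."""
    n_pos, n_neg = _HEADER.unpack_from(blob, 0)
    entries = [
        _ENTRY.unpack_from(blob, _HEADER.size + i * _ENTRY.size)
        for i in range(n_pos + n_neg)
    ]
    return entries[:n_pos], entries[n_pos:]


if __name__ == "__main__":
    # Example ids and bitmasks are invented for the demonstration.
    blob = pack_acl([(1001, 127), (-204, 9)], [(1002, 1)])
    print(len(blob), unpack_acl(blob))
```

In practice the budget would have to be smaller than 255, since the
other object attributes share the same block.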

Another question is when to back up the contents of a directory, and
what to store along with the directory and what along with the file.
Plan for future file ACLs?

Harald.