[OpenAFS] TSM client for OpenAFS

Mattias Amnefelt mattiasa@it.su.se
Thu, 02 Apr 2009 19:05:43 +0200


Harald Barth wrote:
>> we are currently deploying OpenAFS here at the university and also are stuck with TSM
>> for backups.
>>     
>
> There could be worse things. PDC uses TSM and we use homegrown logic
> to do full and incremental dumps. These then are piped into a program
> that uses the API. The only problem is that we do not have a tool for
> the users to request restores. But this has not been a too pressing
> burden yet.
>
>   
At Stockholm University we use a scheme similar to the one PDC uses. The 
main difference is that we store dumps in a staging area and wait until 
we have enough dumps to make it worth archiving them in TSM (we use TSM 
archive, not backup). The threshold is currently 10G.
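
A minimal sketch of the staging check described above, assuming the 
dumps are plain files in a single directory and that "10G" means 
10 GiB. The function names and the directory layout are illustrative, 
not the scripts actually in use:

```python
from pathlib import Path

# Assumption: "10G" is read as 10 GiB; adjust if 10 * 10**9 is meant.
ARCHIVE_THRESHOLD = 10 * 1024**3


def staged_bytes(staging_dir):
    """Return the total size of the regular files directly in staging_dir."""
    return sum(p.stat().st_size for p in Path(staging_dir).iterdir() if p.is_file())


def ready_to_archive(staging_dir, threshold=ARCHIVE_THRESHOLD):
    """Return True once the staged dumps have reached the threshold."""
    return staged_bytes(staging_dir) >= threshold
```

A periodic job could call ready_to_archive() and only then hand the 
batch to TSM; how the batch is handed over is left out here.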

For part of the file space we use the normal dsmc client, since that 
part of our space is very well controlled and we don't need to restore 
ACLs or other AFS metadata.

>
>> So I have written a client that uses the TSM API for backup.  
>>     
>
> I suspected some ongoing work for AFS backups (considering your
> earlier questions).
>
>   
I've been thinking about how to do this myself. My latest idea was to 
modify the Arla client and have it write data to TSM using the API. It 
hasn't progressed much further than an initial idea, though.

>> First is the storage of AFS data inside of TSM.  TSM has three identifiers for an object:
>> - filespace (typically mount point)
>> - High-level name (path inside mount point)
>> - Low-level name (filename)
>> Ideally the filespace should be the volume name, but TSM gets _really_ slow if there are
>> too many (a few hundred) filespaces.  Currently I just give it the cellname, and store
>> the volume name in the HL name (like /volume/path-in-volume).  Other ideas?
>>     
> There are a few nits, though, that I haven't found a good way to 
> handle. Any suggestions

My idea was also to use a single filespace with 
HL=/volume/path-in-volume. I'm almost certain administrators who run 
QUERY FILESPACE don't want several hundred thousand filespaces listed.
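
The mapping could be sketched like this. It is only an illustration 
of the naming scheme: tsm_name() and TsmObjectName are made-up names, 
and any delimiter rules the TSM API itself imposes on the three parts 
are not taken into account:

```python
import posixpath
from typing import NamedTuple


class TsmObjectName(NamedTuple):
    filespace: str   # one filespace per cell
    high_level: str  # /volume/directory-inside-volume
    low_level: str   # file name


def tsm_name(cell, volume, path_in_volume):
    """Map a file in an AFS volume onto the three TSM identifiers."""
    # Normalise to an absolute path inside the volume, e.g. "/src/main.c".
    clean = posixpath.normpath("/" + path_in_volume.lstrip("/"))
    directory, filename = posixpath.split(clean)
    if not filename:
        raise ValueError("path_in_volume must name a file")
    # Files in the volume root get just "/volume" as the high-level name.
    high_level = "/" + volume if directory == "/" else "/" + volume + directory
    return TsmObjectName(cell, high_level, filename)
```

For example, tsm_name("example.org", "user.alice", "src/main.c") gives 
filespace "example.org", HL "/user.alice/src" and LL "main.c" (the 
cell and volume names here are invented).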

>> Second is the storage of attributes and ACLs.  There is a 255-byte space available for
>> storing object attributes connected to each object.  This is not enough to store the ACLs
>> as clear-text, so I have to do pts lookups to translate them to their internal numbers
>> and store as such in the attribute block.  Any better ideas of how to do this?
>>     
>
>   
The volume dumps that we store contain ACLs in their binary format and 
also require pts to be useful, so you wouldn't be much worse off than we 
are :)

Note, however, that an AFS ACL can be as large as 1024 bytes, so you 
cannot be certain it will fit in a 255-byte space.

The idea I had was to store one file in the root of the volume for the 
volume metadata and one per directory for the file metadata. I'm not 
sure whether to use a binary format or a text format (XML?) though.
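
To make the text variant concrete, here is a rough sketch of what a 
per-directory metadata file could look like as XML. The element and 
attribute names are invented for the example and nothing here is a 
settled format; the ACL is stored with names rather than pts ids, 
which is itself only one possible choice:

```python
import xml.etree.ElementTree as ET


def build_dir_metadata(acl, files):
    """Serialise a directory's ACL and per-file metadata to an XML string.

    acl:   list of (who, rights) pairs, e.g. ("system:anyuser", "rl")
    files: list of dicts with keys name, mode, owner, mtime
    """
    root = ET.Element("directory")
    acl_el = ET.SubElement(root, "acl")
    for who, rights in acl:
        ET.SubElement(acl_el, "entry", who=who, rights=rights)
    for f in files:
        ET.SubElement(
            root, "file",
            name=f["name"],
            mode=format(f["mode"], "o"),  # octal text, e.g. "644"
            owner=str(f["owner"]),
            mtime=str(f["mtime"]),
        )
    return ET.tostring(root, encoding="unicode")


def parse_dir_metadata(text):
    """Inverse of build_dir_metadata: return (acl, files)."""
    root = ET.fromstring(text)
    acl = [(e.get("who"), e.get("rights")) for e in root.find("acl")]
    files = [
        {
            "name": e.get("name"),
            "mode": int(e.get("mode"), 8),
            "owner": int(e.get("owner")),
            "mtime": int(e.get("mtime")),
        }
        for e in root.findall("file")
    ]
    return acl, files
```

Since the metadata lives in an ordinary object, its size is not bound 
by the 255-byte attribute block.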

/mattiasa