[OpenAFS] a noobs question and problems on a new cell

Christopher D. Clausen cclausen@acm.org
Mon, 7 May 2007 02:29:41 -0500


Adnoh <adnoh@users.sourceforge.net> wrote:
> By "Backup" I mean the typical type of backup - one snapshot of the
> data at the weekend and some
> incremental backups every day. i don't think this is posible with
> backup volumes - so i thought about volume dumps.

Backup volumes must reside on the same server and partition as the RW 
volume.  You could use backup volumes, dump them, and copy the dump 
files to a central location, even into a different AFS volume on another 
server.  That isn't the most efficient backup scheme, though.
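
A rough sketch of that weekend-full / daily-incremental idea follows.  It 
only builds the vos command lines and does not run anything.  The volume 
name, the dump directory and the Sunday schedule are assumptions made up 
for illustration.  "vos dump -time 0" asks for a full dump; a date in 
mm/dd/yyyy form asks for changes since that date.

```python
from datetime import date, timedelta


def dump_commands(volume, today, dump_dir="/backup/dumps"):
    """Build (but do not run) the vos commands for one nightly backup run.

    Assumed schedule: full dump on Sunday, otherwise an incremental that
    covers changes since the start of the previous day. The overlap with
    the previous dump is deliberate, so that no change falls into a gap.
    dump_dir is a hypothetical path.
    """
    backup_vol = volume + ".backup"
    # Refresh the backup clone first so the dump reads a consistent snapshot.
    cmds = [["vos", "backup", "-id", volume]]
    if today.weekday() == 6:  # Monday is 0, Sunday is 6
        since, kind = "0", "full"  # -time 0 means a full dump
    else:
        since = (today - timedelta(days=1)).strftime("%m/%d/%Y")
        kind = "incr"
    outfile = f"{dump_dir}/{volume}.{today.isoformat()}.{kind}.dump"
    cmds.append(["vos", "dump", "-id", backup_vol,
                 "-time", since, "-file", outfile])
    return cmds


if __name__ == "__main__":
    # "user.adnoh" is a made-up volume name.
    for cmd in dump_commands("user.adnoh", date(2007, 5, 7)):
        print(" ".join(cmd))
```

The resulting dump files can then be copied wherever you like with 
whatever tool you already use.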

> i don't want to have all the volumes in our headquarter. so every
> time a user openes his word-doc or similar it would be completly
> transfered over our VPN - and I can hear the people crying "our
> fileservers are too slow !" so seperate fileservers in every district
> would be a good choice, I think - would'nt they ?

That is an option.  There are of course problems with doing either. 
Remember that the AFS clients themselves cache read-only data.  So if 
most of your data is only being read and not written back that often, it 
might make sense to have only centrally located AFS servers.

It might also make sense to have AFS DB servers hosted locally at each 
district, although I'm not sure what would happen if the network goes 
out and quorum is lost.  Otherwise certain information will still need 
to come from the central DB server in order for AFS to function 
properly.

> i thought about every district his own fileserver with the special
> volume for them and a readonly volume in our headquarter released
> every night - so i could do the volume dump - i'm not very trusted
> with the "backup" command yet.

By default, the AFS client prefers to use readonly volumes, so once you 
create a replica of a volume, clients that reach it through a regular 
mount point will normally see the data as readonly.  You can however 
manually force the mount point to be RW (-rw option to fs mkm) and this 
way you can have an RW volume in each local district and still be able 
to clone the data to other servers using vos release.  All volume writes 
must go directly to the RW volume.  The AFS client does not detect that 
you want to make a write and then find the proper RW volume for you. 
You can modify the code to make it behave that way, but there are 
reasons for not doing that.
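
As a sketch of the steps just described, the snippet below builds the 
command lines for an explicit RW mount point plus a readonly site on 
another server.  Nothing is executed.  The server, partition, volume and 
path names in the example call are all hypothetical.

```python
def replicated_rw_setup(volume, mount_dir, ro_server, ro_partition):
    """Build (but do not run) commands for an RW mount point plus a replica.

    All argument values are supplied by the caller; the names used in the
    example below are invented for illustration.
    """
    return [
        # Force the mount point to the RW volume ("fs mkm" is short for mkmount).
        ["fs", "mkmount", "-dir", mount_dir, "-vol", volume, "-rw"],
        # Define a readonly site for the volume on another server.
        ["vos", "addsite", "-server", ro_server,
         "-partition", ro_partition, "-id", volume],
        # Push the current RW contents out to the readonly site(s).
        ["vos", "release", "-id", volume],
    ]


if __name__ == "__main__":
    for cmd in replicated_rw_setup("district1.data",
                                   "/afs/example.com/district1",
                                   "hq-fs1.example.com", "/vicepa"):
        print(" ".join(cmd))
```

Only the vos release step would need to be repeated, for example nightly.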

You could also run an AFS cell in each district and vos dump 
incrementals from the volumes and copy them to a central fileserver and 
restore them there.  Volumes do not have to be restored to the cell they 
were dumped from.
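
For the restore side, a minimal sketch, assuming the dump files have 
already been copied to the central site and are listed full dump first, 
then incrementals in order.  It only builds the vos restore command 
lines.  The cell, server and file names in the example are made up.

```python
def restore_commands(volume, dump_files, server, partition, cell):
    """Build (but do not run) vos restore commands for a chain of dumps.

    dump_files must start with the full dump, followed by the incremental
    dumps in the order they were taken.
    """
    cmds = []
    for index, path in enumerate(dump_files):
        # The first file replaces any existing volume contents; later
        # files are applied on top of what was restored before them.
        mode = "full" if index == 0 else "incremental"
        cmds.append(["vos", "restore", "-server", server,
                     "-partition", partition, "-name", volume,
                     "-file", path, "-overwrite", mode, "-cell", cell])
    return cmds


if __name__ == "__main__":
    files = ["/dumps/d1.full.dump", "/dumps/d1.mon.dump"]  # hypothetical
    for cmd in restore_commands("district1.data", files,
                                "hq-fs1.example.com", "/vicepa",
                                "hq.example.com"):
        print(" ".join(cmd))
```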

However, you might simply be better off using a more common network 
filesystem like NFS or Samba and using something like rsync to back up 
the data nightly.  You mentioned a VPN.  Since the network link is 
already encrypted, you don't require filesystem encryption?  Or do you?

I've not used the AFS "backup" command.  Ever.

> but I don't know how I could set up the "shared" part where every
> user in every district can read/write to it.

You just set ACLs with the fs setacl command.  Unless I again am 
misunderstanding what you mean by "shared."
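
For instance, a sketch assuming one protection group per district (the 
group names and the directory are invented).  It builds a single fs 
setacl command that gives each group the "write" shorthand, which covers 
all rights except administer.  Keep in mind that AFS ACLs are set per 
directory; a new subdirectory copies its parent's ACL when it is created.

```python
def shared_acl_command(directory, groups, rights="write"):
    """Build (but do not run) one fs setacl command for several groups.

    fs setacl accepts repeated "<user-or-group> <rights>" pairs after -acl.
    """
    cmd = ["fs", "setacl", "-dir", directory, "-acl"]
    for group in groups:
        cmd += [group, rights]
    return cmd


if __name__ == "__main__":
    # Hypothetical group and path names.
    print(" ".join(shared_acl_command("/afs/example.com/shared",
                                      ["admin:district1", "admin:district2"])))
```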

> so if you tell me which information you need to know I can provide

You might want to read through:
http://www.dementia.org/twiki/pub/AFSLore/FurtherReading/NFS_on_steroids
and
http://reports-archive.adm.cs.cmu.edu/anon/home/anon/itc/CMU-ITC-053.pdf

Those are old, but are short and explain how volumes work.  It seems as 
though you are trying to use AFS like NFS or Samba, creating a single 
large share point and allowing everyone to write in it.  This is not the 
best way to use AFS, although it mostly works.  Replicating single large 
volumes can take a long time, especially over slow links.
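
To put a number on "a long time", a back-of-the-envelope estimate.  The 
20 GB volume and the 2 Mbit/s link are assumed figures, not anything 
from your setup, and protocol overhead and other traffic are ignored, so 
real transfers will be slower.

```python
def transfer_hours(size_bytes, link_bits_per_second):
    """Ideal time in hours to move size_bytes over a link.

    Ignores protocol overhead, latency and competing traffic.
    """
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 3600


if __name__ == "__main__":
    # Assumed example: a 20 GB volume over a 2 Mbit/s VPN link.
    print(f"{transfer_hours(20e9, 2e6):.1f} hours")
```

With those assumed figures a full copy takes roughly 22 hours, which is 
why many smaller volumes are easier to handle than one large one.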

Can you describe a "district office" in more detail?  How many users? 
Is there technical staff there to diagnose problems with an AFS server, 
if they occur?  Are the offices always connected to the network?  What 
type of connection do they have?  Bandwidth?  Latency?  Do users work 
out of different offices at different times?  How much data do you need 
to store at each district?  Do you use Kerberos 5 currently within your 
organization?  A single realm?  Or a realm per district?  What kind of 
budget do you have for hardware and software for this project?  How 
reliable is the network link?  Do you have any off-site backup or 
disaster recovery requirements?  Any specific features that the project 
MUST do?  Any features that the project SHOULD do?  Anything else that 
would be nice to do?  How much data are we talking about here?  Total 
and at each district?  What is the "change rate" of your data?  How much 
data is modified per day or per week as a percentage of the total data? 
What are your projected storage requirements for 1 year?  2 years?  3 
years?  5 years?  10 years?  What are you using right now for file 
sharing?  What are the current problems that you are experiencing?

Why did you decide to look at AFS in the first place?

<<CDC