[OpenAFS] a noobs question and problems on a new cell
Christopher D. Clausen
cclausen@acm.org
Mon, 7 May 2007 02:29:41 -0500
Adnoh <adnoh@users.sourceforge.net> wrote:
> By "Backup" I mean the typical type of backup - one snapshot of the
> data at the weekend and some
> incremental backups every day. i don't think this is possible with
> backup volumes - so i thought about volume dumps.
Backup volumes must reside on the same server and partition as the RW
volume. You could use backup volumes, dump them, and copy the dump
files to a central location, even into a different AFS volume on another
server. That isn't the most efficient backup program though.
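A rough sketch of that weekly full plus daily incremental cycle, written as a small Python helper that only builds and prints the command lines (it does not run them). The volume name and dump paths are made up for illustration. It assumes the documented behaviour that "vos backup" creates a clone named <volume>.backup and that "vos dump -time 0" means a full dump, while a date means "changes since that date":

```python
import shlex
from datetime import date

def dump_command(volume, dump_file, since=None):
    """Build a 'vos dump' of the backup clone; since=None means a full dump."""
    when = "0" if since is None else since.strftime("%m/%d/%Y")
    return ["vos", "dump", "-id", volume + ".backup",
            "-time", when, "-file", dump_file]

def backup_cycle(volume, dump_file, since=None):
    """Refresh the backup clone, then dump it (full or incremental)."""
    return [["vos", "backup", "-id", volume],
            dump_command(volume, dump_file, since)]

def show(commands):
    # Print only; nothing is executed against a real cell.
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))

# Hypothetical volume and paths: weekend full dump, then a daily incremental.
show(backup_cycle("district1.docs", "/backup/district1.docs.full"))
show(backup_cycle("district1.docs", "/backup/district1.docs.incr",
                  since=date(2007, 5, 5)))
```

The dump files can then be copied wherever you like with whatever tool you already use.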
> i don't want to have all the volumes in our headquarters. so every
> time a user opens his word-doc or similar it would be completely
> transferred over our VPN - and I can hear the people crying "our
> fileservers are too slow !" so separate fileservers in every district
> would be a good choice, I think - wouldn't they ?
That is an option. There are of course problems with doing either.
Remember that the AFS clients themselves cache read-only data. So if
most of your data is only being read and not written back that often, it
might make sense to have only centrally located AFS servers.
It might also make sense to have AFS DB servers hosted locally at each
district, although I'm not sure what would happen if the network goes
out and quorum is lost. Otherwise certain information will still need
to come from the central DB server in order for AFS to function
properly.
> i thought about giving every district its own fileserver with a
> special volume for them and a readonly volume in our headquarters
> released every night - so i could do the volume dump - i'm not very
> familiar with the "backup" command yet.
By default, the AFS client prefers to use readonly volumes, so once you
create a replica of a volume, clients that reach it through a regular
mount point will see the readonly copy. You can however manually force
the mount point to be RW (-rw option to fs mkm), and this way you can
have an RW volume in each local district and still be able to clone the
data to other servers using vos release. All volume writes must go
directly to the RW volume. The AFS client does not detect when you want
to make a write and find the proper RW volume.
You can modify the code to make it behave that way, but there are
reasons for not doing that.
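To make that concrete, here is a hedged sketch of the commands involved, again as a Python helper that only prints them. The server name, partition, volume name and path are all hypothetical; the flags are the usual ones for "vos addsite", "vos release" and "fs mkmount" (of which "fs mkm" is the short form):

```python
import shlex

def replicate_commands(volume, ro_server, ro_partition):
    """Define a readonly site on another server, then push the RW contents."""
    return [
        ["vos", "addsite", "-server", ro_server,
         "-partition", ro_partition, "-id", volume],
        ["vos", "release", "-id", volume],
    ]

def rw_mount_command(directory, volume):
    """Mount point that always resolves to the RW volume (-rw)."""
    return ["fs", "mkmount", "-dir", directory, "-vol", volume, "-rw"]

# Hypothetical names: district volume replicated to a headquarters server.
commands = [rw_mount_command("/afs/example.com/district1/docs", "district1.docs")]
commands += replicate_commands("district1.docs", "afs-hq.example.com", "/vicepa")
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The "vos release" step is the part you would rerun nightly; the mount point and the addsite are one-time setup.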
You could also run an AFS cell in each district and vos dump
incrementals from the volumes and copy them to a central fileserver and
restore them there. Volumes do not have to be restored to the cell they
were dumped from.
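A minimal sketch of the restore side, assuming the dump files have already been copied to the central site. It only builds the "vos restore" command lines; the server, partition, cell and file names are invented. The one thing that matters is the order: the full dump goes in first, then each incremental in the order it was taken, using "-overwrite incremental":

```python
import shlex

def restore_command(server, partition, volume, dump_file,
                    incremental=False, cell=None):
    """Build one 'vos restore'; incremental dumps are layered onto the volume."""
    argv = ["vos", "restore", "-server", server, "-partition", partition,
            "-name", volume, "-file", dump_file,
            "-overwrite", "incremental" if incremental else "full"]
    if cell:
        argv += ["-cell", cell]  # target cell may differ from the source cell
    return argv

def restore_chain(server, partition, volume, full_dump, incrementals, cell=None):
    """Full dump first, then every incremental in the order given."""
    chain = [restore_command(server, partition, volume, full_dump, cell=cell)]
    for dump_file in incrementals:
        chain.append(restore_command(server, partition, volume, dump_file,
                                     incremental=True, cell=cell))
    return chain

# Hypothetical central server, cell and dump files.
for argv in restore_chain("afs-hq.example.com", "/vicepb", "district1.docs",
                          "/backup/district1.docs.full",
                          ["/backup/district1.docs.mon",
                           "/backup/district1.docs.tue"],
                          cell="hq.example.com"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```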
However, you might simply be better off using a more common network
filesystem like NFS or Samba and using something like rsync to back up
the data nightly. You mentioned a VPN. Since the network link is
already encrypted, you don't require filesystem encryption? Or do you?
I've not used the AFS "backup" command. Ever.
> but I don't know how I could set up the "shared" part where every
> user in every district can read/write to it.
You just set ACLs with the fs setacl command. Unless I again am
misunderstanding what you mean by "shared."
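For example, something along these lines, shown as a Python helper that just assembles the "fs setacl" command line. The directory is hypothetical; system:authuser and system:anyuser are the standard AFS system groups, and "write" and "none" are the standard shorthand rights. Keep in mind that AFS ACLs apply per directory, not per file:

```python
import shlex

def setacl_command(directory, entries):
    """Build 'fs setacl' from (user_or_group, rights) pairs."""
    argv = ["fs", "setacl", "-dir", directory, "-acl"]
    for principal, rights in entries:
        argv += [principal, rights]
    return argv

# Hypothetical shared directory: any authenticated user may read and write,
# unauthenticated users get nothing.
argv = setacl_command("/afs/example.com/shared",
                      [("system:authuser", "write"),
                       ("system:anyuser", "none")])
print(" ".join(shlex.quote(arg) for arg in argv))
```

If you want per-district access instead, you would use your own pts groups in place of system:authuser.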
> so if you tell me which information you need to know I can provide
You might want to read through:
http://www.dementia.org/twiki/pub/AFSLore/FurtherReading/NFS_on_steroids
and
http://reports-archive.adm.cs.cmu.edu/anon/home/anon/itc/CMU-ITC-053.pdf
Those are old, but are short and explain how volumes work. It seems as
though you are trying to use AFS like NFS or samba, creating a single
large share point and allowing everyone to write in it. This is not the
best way to use AFS, although it mostly works. Replicating single large
volumes can take a long time, especially over slow links.
Can you describe a "district office" in more detail? How many users?
Is there technical staff there to diagnose problems with an AFS server,
if they occur? Are the offices always connected to the network? What
type of connection do they have? Bandwidth? Latency? Do users work
out of different offices at different times? How much data do you need
to store at each district? Do you use Kerberos 5 currently within your
organization? A single realm? Or a realm per district? What kind of
budget do you have for hardware and software for this project? How
reliable is the network link? Do you have any off-site backup or
disaster recovery requirements? Any specific features that the project
MUST do? Any features that the project SHOULD do? Anything else that
would be nice to do? How much data are we talking about here? Total
and at each district? What is the "change rate" of your data? How much
data is modified per day or per week as a percentage of the total data?
What are your projected storage requirements for 1 year? 2 years? 3
years? 5 years? 10 years? What are you using right now for file
sharing? What are the current problems that you are experiencing?
Why did you decide to look at AFS in the first place?
<<CDC