[OpenAFS] AFS design question: implementing AFS over a highly-distributed, low-bandwidth network

anne salemme anne.salemme@dartmouth.edu
Thu, 15 Jan 2009 14:19:51 -0500


general advice:

- make sure the network connectivity between your three AFS "database 
servers" is always up...they depend on the network to communicate with 
each other, and if they are always up and always reliable, it will 
enhance the perceived performance of afs
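
(a rough sketch of why that matters, in python. it assumes a plain majority rule among the db servers and leaves out the tie-breaking details of the real election protocol, which should not come into play with three servers. the site names are made up.)

```python
def has_quorum(reachable, all_servers):
    """True if this group of db servers is a strict majority of the cell.

    Simplified model: every server has one vote and a group needs
    more than half of all votes to keep a working database.
    """
    votes = len(set(reachable) & set(all_servers))
    return votes * 2 > len(all_servers)


# hypothetical site names, for illustration only
servers = ["site-a", "site-b", "site-c"]

# the link to site-c drops: a and b still see each other, c is alone
for group in (["site-a", "site-b"], ["site-c"]):
    print(group, "quorum" if has_quorum(group, servers) else "no quorum")
```

with three servers, any one site cut off on its own loses quorum, and the other two keep it only while the link between them holds.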

- if most of the users do mostly reading, and mostly smallish files, you 
might get by using volume replication at each site, leaving the RW 
volume wherever it is, and relying on some tricks to keep the user 
from getting confused during writes...i think, in general, the idea of moving RW 
volumes all the time is a bad one...it will put big loads on your 
fileservers if you have lots of users
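
(to make the replication idea concrete, here is a small python sketch that only builds the vos command lines for adding a read-only site at each location and then releasing the volume. it doesn't run anything. the server names, the partition and the volume name are placeholders, not anything from your cell.)

```python
def replication_commands(volume, ro_sites):
    """Build (not run) the vos commands that replicate one volume.

    ro_sites is a list of (server, partition) pairs, one per site
    that should hold a read-only copy.
    """
    commands = []
    for server, partition in ro_sites:
        commands.append(
            ["vos", "addsite", "-server", server,
             "-partition", partition, "-id", volume]
        )
    # one release pushes the current RW contents to every RO site
    commands.append(["vos", "release", "-id", volume])
    return commands


# placeholder names, for illustration only
sites = [("afs1.example.org", "/vicepa"),
         ("afs2.example.org", "/vicepa"),
         ("afs3.example.org", "/vicepa")]
for cmd in replication_commands("software", sites):
    print(" ".join(cmd))
```

addsite is a one-time step per site; after that, only the release needs repeating.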

- check the top of your afs tree and make sure root.afs, root.cell and 
probably most other volumes at the top of the tree contain only 
mountpoints, and are all replicated, and don't change them much...in 
other words, you want stability at the top of the tree, to reduce the 
changes that clients have to pay attention to

- do your vos operations, backups, etc. in a coordinated way so you can 
predict when they are happening, and don't end up hosing your servers by 
doing multiple simultaneous operations on the same volumes
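
(one way to check a schedule before it bites you: a small python sketch that takes planned operations with start and end times and reports any two that overlap on the same volume. the job tuples, the times in minutes and the volume names are all invented for the example.)

```python
from collections import defaultdict


def find_conflicts(jobs):
    """Return (volume, first_job, second_job) for overlapping jobs.

    jobs is a list of (name, volume, start, end) tuples, with start
    and end as minutes since midnight. Jobs on different volumes
    never conflict in this simple model.
    """
    by_volume = defaultdict(list)
    for name, volume, start, end in jobs:
        by_volume[volume].append((start, end, name))

    conflicts = []
    for volume, windows in by_volume.items():
        windows.sort()  # earliest start first
        for i, (_, end1, name1) in enumerate(windows):
            for start2, _, name2 in windows[i + 1:]:
                if start2 >= end1:
                    break  # sorted by start, so later jobs can't overlap either
                conflicts.append((volume, name1, name2))
    return conflicts


# invented schedule, for illustration only
schedule = [
    ("release", "user.example", 0, 60),
    ("backup", "user.example", 30, 90),
    ("release", "software", 30, 45),
]
print(find_conflicts(schedule))
```

here the release and the backup of the same user volume overlap and get flagged, while the release of the other volume is left alone.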

how many users? what kinds of files and what kinds of usage? how much 
sharing? these can make a difference in how you set things up, too. (i 
worked on a cell once with clients in sweden, south america, and on the 
european continent, with one server at each site...whenever the south 
american site became the master db server and the transatlantic link 
then went down, the european users weren't happy...basically, as long as 
you have good network connectivity between client systems and server 
systems, there is no need for physical proximity...)

best of luck.

anne

Chaz Chandler wrote:
> Hello all!
>
> I am attempting to implement OpenAFS across a VPN with limited bandwidth between sites but relatively mobile users who expect to have their data available (and writable) at whichever site they are currently located.
>
> The issue I am running up against is how to organize the AFS volume structure so that things like user home dirs, collaboration/group dirs, and writable system items (like a Windows profile, for instance) are optimally maintained in AFS.  
>
> The set-up is:
> 1) Each site has an OpenAFS server (each with vl, pt, bu, vol, fileservers & salvager).  Currently, v1.4.8 on Linux 2.6.
> 2) Client computers are a mix of Windows XP, OpenBSD, and Linux.  1.5 clients for Windows, 1.4 clients for Linux, and native OpenBSD clients.
> 3) All sites are connected in a full mesh VPN (max of about 30KB/s for each link)
> 4) There's about 600GB of data at the moment.  Although most of it doesn't need to be writable most of the time, things that are frequently written are not currently segregated from static or infrequently-written files/dirs.  Perhaps only a few gigs change on a weekly basis.
> 5) Users move from site to site, but once there usually spend several weeks.  However, two sites are physically very close and users move between them more frequently (sometimes daily) although the link bandwidth is the same as the others.
> 6) We have a pretty standard AFS volume layout: separate volumes for each user, a few large volumes with relatively static content, a few volumes for groups to share.
> 7) Currently, volume releases are done manually.
> 8) When a user changes locations for a long stretch, we move their R/W user volume to the new location (electronically, not physically), a process which is labor- and time-intensive and usually has at least one snafu along the way.
> 9) We have been unable to come up with a working implementation of roaming windows profiles on AFS.
>
> I'm seeking recommendations on:
> 1) How others have set up a regular release schedule to keep a large amount of data synced over a slow network (custom scripts, I assume, but is there a repository of these things and what are the general mechanics and best practices here?)
> 2) What sort of volume layout would one recommend, and how should frequently-updated data be stored?  Take, for instance, three examples:
> - A software repository: large volume with relatively static contents, occasionally has large additions or subtractions when a new piece of software is added or an old one removed.  Ideally, these updates should be able to be accomplished from any location.  Users don't need to write to it, but may need to read from it frequently at LAN speeds.
> - A collaboration dir: several users read and write a small amount (10s of MBs) on a daily basis from different locations simultaneously, but they expect LAN-type performance.
> - A user dir: large amounts of data updated from a single location, but user may move to any other site at any time, potentially with up to a day of transit time in which a volume could be moved to the destination site.
> 3) Any concrete recommendations on how to properly implement windows integration with AFS (especially folder redirection and roaming profiles on AFS).  Yes, I've read the '04 and '05 best practices, however they are now quite old and did not work for me.
>
> I've been lurking on this list for a while now and have come to the conclusion that while there are a few very knowledgeable and experienced folks in the AFS community, there are not any good, current, and comprehensive AFS information repositories out there.  The list archives are the best option, but I find them almost impossible to use unless I know the exact phrase I'm looking for.  Is there something I'm missing?
>
> Cheers,
> -Chaz