[OpenAFS] mount points and replication problems

Todd M. Lewis Todd_Lewis@unc.edu
Tue, 12 Jul 2005 09:06:26 -0400

Cédric CACHAT wrote:
> Hello,
> this is the first time I write and I am pretty new to AFS. I have a 
> question regarding mount points in AFS.
> Here is what I'm trying to achieve:
> I want all my users to have their home directory in AFS, the plan is to 
> set an AFS tree looking like:
> /afs/cell/usr/homes/<user1>
> I created the following volumes on my primary server:
> root.afs
> root.cell
> common.usr
> common.homes
> user.user1
> and then I mounted them using the fs command:
> # fs mkm /afs/cell/usr common.usr
> and so on... (I didn't use
> # fs mkm /afs/.cell/usr common.usr
> maybe my problem comes from here?)
> So far everything is under control.
> Since I have many sites, I have set up one AFS server on each site. 
> Because all users don't work on the same site I decided to create user.* 
> volumes on their closest server, so I created volume user.user1 on the 
> primary server and user.user2 on the secondary server.

Up to this point, you've done everything wonderfully. Congratulations.

> Without any replication it works perfectly if BOTH servers are running. 
> If one is down, say the master, then access to a user's home-dir is 
> impossible.

Now you've got to the point where you have to distinguish between 
distributed file service (which AFS provides) and high availability 
(which AFS does not provide).

> That's where it's getting complicated for me: I then set up replication 
> to the second site so that I have:
> primary server      secondary server
> root.afs (RW)       root.afs (RO)
> root.cell (RW)      root.cell (RO)
> common.usr (RW)     common.usr (RO)
> common.homes (RW)   common.homes (RO)
> user.user1 (RW)     user.user1 (RO)
> user.user2 (RO)     user.user2 (RW)
> Looking at the table above, if the primary server is down, user1 should 
> be able to access his home dir but Read Only whereas user2 should be able 
> to read/write to his home directory. That's exactly what I want.
> The problem is user2 can only read and not write (if I try ls 
> /afs/.cell, it hangs then says timeout). Is it normal or did I miss a thing?

This is normal. That's exactly what you would want if you had a static 
volume (containing data archives, a software package, etc.) that you 
wanted to be efficiently accessed from either site.  AFS lets you 
distribute and serve such volumes efficiently.

Home directories are not static, and are not good candidates for 
replication. The only real answer here is the same for almost any other 
file system housing a user's home directory: if you want to keep the 
directory accessible, keep the server running.

In other words, this is not the problem that AFS solves.

> Second question, I don't know what to set their home directory to (read 
> from LDAP at login), do I have to use /afs/cell/usr/homes/user1 or 
> /afs/.cell/usr/homes/user1?

The former. If you set it to the latter, and the server containing the 
RW volume (the one you get when you use the '.') is down, you're still 
in the same boat: the user's home directory is still not available.

[Completely aesthetic aside: I'd make it "home", not "homes". But hey, 
it's your cell!]

> If I use the former, when both servers are running they can't write to 
> their directory, they have to cd to /afs/.cell/usr/homes/user1 in order 
> to write which is not practical; if I use the latter, it works all right 
> when both servers are running but when the primary is down, it fails to 
> access the home directory (server timeout, the branch /afs/.cell is down).

This has to do with the servers not being able to reach a quorum when 
one goes down. You need three or more servers for that to work.

But that isn't going to solve the home directory accessibility problem.
   * Any RW volumes on a down server are going to be inaccessible.
   * Don't replicate volumes with dynamic data (like home directories).
   * The highest availability of AFS volumes is achieved through the use 
     of reliable servers, not through replication.

> Did someone ever try to set up such a network, or is it impossible?

Many have tried to solve the availability problem with replication. None 
have succeeded.

> Could you tell me then how should I mount my tree?

You had it right to start with. The problem is you don't have enough 
servers to reach a quorum when a server goes down.
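
To make the quorum arithmetic concrete, here is a minimal Python sketch. 
It assumes a plain strict-majority rule (more than half of the servers 
must be reachable); the real AFS database election has tie-breaking 
details that this ignores, and the function name has_quorum is made up 
for illustration only.

```python
def has_quorum(total_servers: int, servers_up: int) -> bool:
    """Strict-majority model: more than half of the servers must be up.

    This is a simplification for illustration, not the actual election
    algorithm used by the AFS database servers.
    """
    return servers_up > total_servers / 2


if __name__ == "__main__":
    # Compare a two-server cell with a three-server cell, one server down.
    for total in (2, 3):
        print(total, "servers, one down ->", has_quorum(total, total - 1))
```

Under that assumption, one surviving server out of two is not a 
majority, while two survivors out of three are.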

> I think my problems come from the .cell and cell, I don't quite 
> understand the impact it has on the rest of the tree.

As you descend your tree, you're looking at RO volumes unless/until you 
hit a RW volume that isn't replicated. From that point downward, you're 
looking in the RW volumes.  Using the .cell forces your topmost access 
in your tree to be RW, so you're looking at RWs from there on down. 
Generally, you want to use RO wherever possible because you can 
distribute them among servers (and, thus, distribute the load among 
servers). But some "leaf volumes" in your tree just don't lend 
themselves to replication because the files and directories are just too 
dynamic, so replication becomes impractical.
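
The descent rule above can be modelled in a few lines of Python. This is 
only a sketch of the rule as described in this message, not of the cache 
manager's actual code; the Mount class, the rw_mount flag (standing in 
for the '.cell' style mount) and resolve_path are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Mount:
    volume: str
    replicated: bool        # does the volume have RO copies?
    rw_mount: bool = False  # mounted so as to force RW, like '.cell'


def resolve_path(mounts):
    """Walk the mounts from the top of the tree down.

    Stay on RO copies until a forced-RW mount or an unreplicated volume
    is reached; from that point downward everything is RW.
    """
    in_rw = False
    chosen = []
    for m in mounts:
        if in_rw or m.rw_mount or not m.replicated:
            in_rw = True
            chosen.append((m.volume, "RW"))
        else:
            chosen.append((m.volume, "RO"))
    return chosen


if __name__ == "__main__":
    names = ["root.afs", "root.cell", "common.usr", "common.homes"]

    # /afs/cell/usr/homes/user1 with the home volume replicated: RO all the way.
    replicated_home = [Mount(n, True) for n in names] + [Mount("user.user1", True)]
    print(resolve_path(replicated_home))

    # Same path with an unreplicated home volume: RO down to the RW leaf.
    plain_home = [Mount(n, True) for n in names] + [Mount("user.user1", False)]
    print(resolve_path(plain_home))

    # /afs/.cell/...: RW from root.cell on down.
    dot_path = [Mount("root.afs", True), Mount("root.cell", True, rw_mount=True),
                Mount("common.usr", True), Mount("common.homes", True),
                Mount("user.user1", True)]
    print(resolve_path(dot_path))
```

In this model, a replicated user.user1 reached through /afs/cell comes 
out RO, which matches the "can't write" symptom in the question; leaving 
the home volume unreplicated makes it the RW leaf.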

The art in setting up your cell is determining what chunks of your 
directory tree to keep in what volumes, and which volumes to replicate, 
where, and when. Your first shot at it is pretty good. Most sites do 
something similar. Some sites (like ours, with over 100,000 home 
directories) put in extra levels. My id is 'utoddl', so my home 
directory is '/afs/isis.unc.edu/home/u/t/utoddl'. That helps us keep 
from overloading the 'home' directory. You might want to consider 
adding one or more extra levels if you plan on having lots of users. 
But you'll need to grow your server farm to get quorums to work. With 
only two servers, you've actually increased your chances of downtime but 
bought yourself localized access to home volumes. That could be a good 
decision, but you'll want to have reliable servers and network between them.
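
For the extra directory levels, a minimal Python sketch of building such 
a path follows. It assumes the levels are simply the leading characters 
of the login name, which is inferred from the single 'utoddl' example 
above; the function name hashed_home and its parameters are illustrative.

```python
def hashed_home(user: str, cell: str = "isis.unc.edu", levels: int = 2) -> str:
    """Build a home path with one directory level per leading character.

    Assumption: each extra level is one leading character of the login
    name, as in /afs/isis.unc.edu/home/u/t/utoddl.
    """
    parts = ["/afs", cell, "home", *user[:levels], user]
    return "/".join(parts)


if __name__ == "__main__":
    print(hashed_home("utoddl"))
    print(hashed_home("user1", cell="cell", levels=1))
```

The same function could be used to generate the home directory value 
stored in LDAP, so that the directory layout and the account data stay 
in step.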

> Thanks for your help
> Cédric
   / Todd_Lewis@unc.edu  919-962-5273  http://www.unc.edu/~utoddl /
  /            Corduroy pillows are making headlines.            /