[OpenAFS] mount points and replication problems
Todd M. Lewis
Todd_Lewis@unc.edu
Tue, 12 Jul 2005 09:06:26 -0400
Cédric CACHAT wrote:
> Hello,
>
> this is the first time I've written, and I am pretty new to AFS. I have a
> question regarding mount points in AFS.
> Here is what I'm trying to achieve:
> I want all my users to have their home directory in AFS; the plan is to
> set up an AFS tree looking like:
> /afs/cell/usr/homes/<user1>
> I created the following volumes on my primary server:
> root.afs
> root.cell
> common.usr
> common.homes
> user.user1
> and then I mounted them using the fs command:
> # fs mkm /afs/cell/usr common.usr and so on... (I didn't use
> # fs mkm /afs/.cell/usr common.usr; maybe my problem comes from here?)
> So far everything is under control.
>
> Since I have many sites, I have set up one AFS server on each site.
> Because not all users work on the same site, I decided to create user.*
> volumes on their closest server, so I created volume user.user1 on the
> primary server and user.user2 on the secondary server.
Up to this point, you've done everything wonderfully. Congratulations.
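(For what it's worth, here's a quick sanity check for a layout like that.
The paths below assume your cell is literally named 'cell'; substitute
your real cell name.)

   fs lsmount /afs/cell/usr                # reports the volume behind the mount point (#common.usr)
   fs lsmount /afs/cell/usr/homes/user1    # likewise, should show #user.user1
   fs examine /afs/cell/usr/homes/user1    # shows which volume (RW or RO) you actually landed in
   fs whereis /afs/cell/usr/homes/user1    # shows which file server is answering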
> Without any replication it works perfectly if BOTH servers are running.
> If one is down, say the master, then access to a user's home dir is
> impossible.
Now you've got to the point where you have to distinguish between
distributed file servers (which AFS provides) vs. high availability
(which AFS does not provide).
> That's where it's getting complicated for me: I then set up replication
> to the second site so that I have:
>
> primary server        secondary server
> root.afs (RW)         root.afs (RO)
> root.cell (RW)        root.cell (RO)
> common.usr (RW)       common.usr (RO)
> common.homes (RW)     common.homes (RO)
> user.user1 (RW)       user.user1 (RO)
> user.user2 (RO)       user.user2 (RW)
>
>
> Looking at the array above, if the primary server is down, user1 should
> be able to access his home dir, but read-only, whereas user2 should be
> able to read/write to his home directory. That's exactly what I want.
> The problem is user2 can only read and not write (if I try ls
> /afs/.cell, it hangs then says timeout). Is it normal or did I miss something?
This is normal. That's exactly what you would want if you had a static
volume (containing data archives, a software package, etc.) that you
wanted to be efficiently accessed from either site. AFS lets you
distribute and serve such volumes efficiently.
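For a volume like that, the usual drill looks something like this (the
server names srv1/srv2 and partition a are placeholders; adjust to your
site):

   vos addsite srv1 a common.usr    # add a read-only site on the server holding the RW
   vos addsite srv2 a common.usr    # add a read-only site at the remote location
   vos release common.usr           # push the current RW contents out to the RO copies
   fs checkvolumes                  # have the cache manager pick up the new volume locations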
Home directories are not static, and are not good candidates for
replication. The only real answer here is the same for almost any other
file system housing a user's home directory: if you want to keep the
directory accessible, keep the server running.
In other words, this is not the problem that AFS solves.
> Second question: I don't know what to set their home directory to (read
> from LDAP at login). Do I have to use /afs/cell/usr/homes/user1 or
> /afs/.cell/usr/homes/user1?
The former. If you set it to the latter, and the server containing the
RW volume (the one you get when you use the '.') is down, you're still
in the same boat: the user's home directory is still not available.
[Completely aesthetic aside: I'd make it "home", not "homes". But hey,
it's your cell!]
> If I use the former, when both servers are running they can't write to
> their directory; they have to cd to /afs/.cell/usr/homes/user1 in order
> to write, which is not practical. If I use the latter, it works all right
> when both servers are running, but when the primary is down, it fails to
> access the home directory (server timeout; the branch /afs/.cell is down).
This has to do with the database servers not being able to reach a
quorum when one goes down. You need three or more database servers for
that to work.
But that isn't going to solve the home directory accessibility problem.
* Any RW volumes on a down server are going to be inaccessible.
* Don't replicate volumes with dynamic data (like home directories).
* The highest availability of AFS volumes is achieved through the use
of reliable servers, not through replication.
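If you're not sure which machines are actually acting as database servers
(the quorum is about the vlserver/ptserver machines, not the file
servers), something like this will show you; 'srv1' is just a placeholder:

   bos listhosts srv1    # lists the cell's database servers as that machine sees them
   udebug srv1 7003      # asks the vlserver's ubik whether it currently has a sync site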
> Did someone ever try to set up such a network, or is it impossible?
Many have tried to solve the availability problem with replication. None
have succeeded.
> Could you tell me then how should I mount my tree?
You had it right to start with. The problem is you don't have enough
database servers to reach a quorum when one goes down.
> I think my problems come from the .cell and cell; I don't quite
> understand the impact it has on the rest of the tree.
As you descend your tree, you're looking at RO volumes unless/until you
hit a RW volume that isn't replicated. From that point downward, you're
looking in the RW volumes. Using the .cell forces your topmost access
in your tree to be RW, so you're looking at RWs from there on down.
Generally, you want to use RO wherever possible because you can
distribute them among servers (and, thus, distribute the load among
servers). But some "leaf volumes" in your tree just don't lend
themselves to replication because their files and directories are too
dynamic.
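The practical upshot of the cell-vs-.cell distinction is that you make
structural changes through the RW path and then push them out to the RO
copies. A sketch ('cell' and 'user3' are stand-ins):

   fs mkm /afs/.cell/usr/homes/user3 user.user3   # new mount point, created via the RW path
   vos release common.homes                       # the mount point lives in common.homes, so release it
   fs checkvolumes                                # refresh the cache manager's volume information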
The art in setting up your cell is determining which chunks of your
directory tree to keep in which volumes, which volumes to replicate,
and where and when. Your first shot at it was pretty good. Most sites do
something similar. Some sites (like ours, with over 100,000 home
directories) put in extra levels. My id is 'utoddl', so my home
directory is '/afs/isis.unc.edu/home/u/t/utoddl'. That helps us keep
from overloading the 'home' directory. You might want to consider
adding one or more extra levels if you plan on having lots of users.
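A sketch of what one extra level might look like (homes.u, user.user4,
srv2, and partition a are all made-up names):

   vos create srv2 a homes.u                        # small "bucket" volume for users starting with 'u'
   fs mkm /afs/.cell/usr/homes/u homes.u
   vos release common.homes                         # publish the new mount point in the replicated parent
   vos create srv2 a user.user4                     # put the RW volume near that user's site
   fs mkm /afs/.cell/usr/homes/u/user4 user.user4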
But you'll need to grow your server farm to get quorums to work. With
only two servers, you've actually increased your chances of downtime but
bought yourself localized access to home volumes. That could be a good
decision, but you'll want reliable servers and a reliable network between them.
> Thanks for your help
>
> Cédric
--
+--------------------------------------------------------------+
/ Todd_Lewis@unc.edu 919-962-5273 http://www.unc.edu/~utoddl /
/ Corduroy pillows are making headlines. /
+--------------------------------------------------------------+