[OpenAFS-devel] Re: OpenAFS on 2.4.26 ? OpenMosix ?

Terry Gliedt tpg@umich.edu
Tue, 21 Dec 2004 08:30:38 -0500


Atro Tossavainen wrote:
> On Wed, 15 Dec 2004 14:48:32 -0500, Jeffrey Hutzelman wrote:
> 
> 
>>FWIW, I have not heard of anyone getting OpenAFS and OpenMosix to work 
>>together, even to the extent that you've reported so far.  We have had 
>>several reports of failures in the past, though...
> 
> 
> We run three MOSIX (not OpenMosix, though) clusters where all users'
> home directories are on AFS and all logins are handled via AFS.
> 
> Our MOSIX clusters consist of a head node that the users log on to and
> a number of diskless slave nodes that act merely as spare computing
> resources for the head node.  The MOSIX traffic is always on a private
> internal network with no visibility to/from the real world, so the
> slave nodes have no idea of AFS.  Users start all jobs on the head node
> and MOSIX takes care of distributing them across the nodes.  When the
> jobs need to do I/O they migrate back to the head node.  (Yes, this is
> a bottleneck.)
> 
> We haven't had any problems that I could pinpoint to interactions
> between MOSIX and AFS.  What kind of failures have been reported?
> 

Your configuration is exactly like ours (2 gateway nodes, 24 slave 
machines on a private internal network).

Login using AFS works fine. Keeping $HOME out of AFS makes for a more 
stable world.

Reading using AFS seems to always work.  Writing works most times, but 
will on occasion result in a segmentation fault (in cp, for instance). I 
suspect this happens when the file must be fetched from the AFS server, 
but I can't be sure.

At one point I had the gateway machine in my office and I tried to log in 
via xdm to my AFS account. This generated various errors, starting with 
the X11 lock. Login never worked reasonably, even though the exact same 
setup works fine on more conventional Linux machines.

Which MOSIX version are you running? And which OpenAFS version?

When you launch the AFS daemon, do you make any attempt to pin the 
daemon to the gateway node? Seems to me that if AFS daemons get 
migrated, it'd not be a good thing.

-- 
=============================================================
Terry Gliedt     tpg@umich.edu       http://www.hps.com/~tpg/
Biostatistics, Univ of Michigan  Personal Email:  tpg@hps.com