[OpenAFS-devel] OpenAFS on 2.4.26 ? OpenMosix ?

Terry Gliedt tpg@umich.edu
Tue, 09 Nov 2004 09:15:44 -0500


Terry Gliedt wrote:
> Harald Barth wrote:
> 
>>> I'm attempting to get OpenAFS support in an OpenMosix kernel. The 
>>> only stable OpenMosix kernel for our situation is 2.4.26. Has anyone 
>>> managed to get an OpenAFS client working on a 2.4.26 kernel?
>>
>>
>>
>> I tried Mosix+Arla a long time ago but back then the Mosix stuff was so
>> i386 (endian?) dependent that it did not even compile on sparc which
>> was what I had available back then. A lot has happened since then,
>> but I don't have any newer information for you.
>>
>>> I realize the cluster nodes cannot use AFS, but it would be very
>>> useful to have AFS client available on the gateway machine.
>>
>>
>>
>> Why would the cluster nodes be limited in such a way?
> 
> 
> Oh right. Of course. I was thinking about our particular cluster 
> configuration in which the nodes are dedicated on a private network. 
> They do not have internet access as there is no NAT IP forwarding 
> provided. Other configurations could use regular machines on the 
> network. Sorry for the confusion.
> 
> I guess this is as good as any time to tell you all that
> 
> (1) OpenAFS 1.3.73 compiles just fine with a 2.4.26 kernel. All I needed 
> was a symlink to the hpc directory to get it to compile.
> 
> (2) So far OpenAFS and OpenMosix get along OK. I've not tested this very 
> much (I'm running right now on this configuration with X11+KDE and a few 
> other things), but so far it's not panicked. Hardly an endorsement, but I 
> plan to test this more completely today.
> 
> In any case this is further than I had thought I was going to get. I'll 
> report back in a day or so after I have a more firm conclusion.

I can now confirm that the combination of a 2.4.26 kernel + OpenAFS 1.3.73 
works just fine. Adding OpenMosix immediately results in this symptom:

   SSH with X11 forwarding to the OpenMosix+OpenAFS machine
   Observe messages about a failure to lock the .Xauthority file

What apparently is happening is that as X11 attempts to add a new entry 
to .Xauthority, it creates .Xauthority-n and presumably does a move 
which fails. This results in the user's .Xauthority "disappearing". A 
simple 'mv .Xauthority-n .Xauthority' allows X11 to work properly again.
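
For anyone who wants to check this outside of ssh/X11, here is a minimal 
sketch that walks through the kind of file sequence I believe is involved: 
create a lock file exclusively, hard-link it, write a "-n" temporary file, 
and rename it over the target. This is not xauth's actual code. The 
function name, the test file name, and the "-c"/"-l" lock suffixes are my 
assumptions for illustration. Run it against a directory in AFS on the 
suspect machine and see which step, if any, reports an error.

```python
import os
import tempfile


def check_replace_sequence(directory, name=".Xauthority-test"):
    """Try a lock / write-temp / rename sequence in `directory`.

    Returns a dict mapping each step name to None on success or to the
    error text on failure. All file names here are illustrative
    assumptions, not names taken from any real tool's source.
    """
    target = os.path.join(directory, name)
    temp = target + "-n"        # temporary copy, as described in the post
    lock = target + "-c"        # assumed lock file, created exclusively
    lock_link = target + "-l"   # assumed hard link to the lock file
    results = {}

    def attempt(step, func):
        # Record the outcome of one step without aborting the others.
        try:
            func()
            results[step] = None
        except OSError as exc:
            results[step] = exc.strerror or str(exc)

    def create_lock():
        fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        os.close(fd)

    def write_temp():
        with open(temp, "w") as handle:
            handle.write("new entry\n")

    attempt("create_lock", create_lock)
    attempt("hard_link", lambda: os.link(lock, lock_link))
    attempt("write_temp", write_temp)
    attempt("rename", lambda: os.rename(temp, target))

    # Remove whatever was left behind, ignoring files that never appeared.
    for path in (lock, lock_link, temp, target):
        try:
            os.unlink(path)
        except OSError:
            pass
    return results


if __name__ == "__main__":
    # Defaults to a scratch directory; point it at an AFS path to test AFS.
    scratch = tempfile.mkdtemp()
    for step, error in check_replace_sequence(scratch).items():
        print(step, "ok" if error is None else "FAILED: " + error)
    os.rmdir(scratch)
```

If the "rename" step fails and the "-n" file is left behind on the 
OpenMosix kernel but not on the plain 2.4.26 kernel, that would match the 
symptom above. A failure at "create_lock" or "hard_link" would instead 
point at the locking path.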

I presume this has something to do with locking, but that's just my 
guess. I've also seen other strangeness in AFS behavior which may be 
related (or not); however, the ssh scenario I mention above has been my 
litmus test.

If anyone has advice or a patch, I'd be happy to hear about it. It seems 
OpenAFS is very close to working, but not close enough. Until then I'm 
abandoning the effort to make OpenMosix and OpenAFS on 2.4.26 work together.

-- 
=============================================================
Terry Gliedt     tpg@umich.edu       http://www.hps.com/~tpg/
Biostatistics, Univ of Michigan  Personal Email:  tpg@hps.com