[OpenAFS-devel] Re: OpenAFS on 2.4.26 ? OpenMosix ?

Terry Gliedt tpg@umich.edu
Tue, 21 Dec 2004 09:14:40 -0500


Atro Tossavainen wrote:
> 
>>Reading using AFS seems to always work.  Writing works most of the time, but 
>>will on occasion result in a segmentation fault (in cp, for instance). I 
>>suspect this happens when the file must be fetched from the AFS server, 
>>but can't be sure.
> 
> 
> We have never seen this.  Have you confirmed that it's not a token
> lifetime issue?  (A segmentation fault should not occur even then,
> though.)

Definitely not a token expiring. All the errors I've seen happen within 
a short time of logging in. And you are correct, an expired token 
normally results only in a permission denied message.

>>What MOSIX are you running? Version etc? OpenAFS version?
> 
> 
> The latest we've been running is MOSIX 1.12.0 for Linux 2.4.27, though
> now that you mention it, I notice that 1.12.1 for 2.4.28 became
> available a while ago.  We're on OpenAFS 1.2.11 on the clients.  The
> servers are Transarc AFS, if it makes any difference; I doubt that it
> would.  We've been doing this for two years now, starting from Linux
> kernel 2.4.20 and whatever MOSIX was the proper release for it, 1.8.0
> probably.

We are on a 2.4.27 kernel too, although all 2.4 kernels behave the same 
for us. We too are using Transarc servers, but I agree, this should make 
no difference. Our cluster has been active for about 2 years also, but 
just recently we added the second gateway and added AFS access.

>>When you launch the AFS daemon, do you make any attempt to pin the 
>>daemon to the gateway node? Seems to me that if AFS daemons get 
>>migrated, it'd not be a good thing.
> 
> 
> Yes, there is an "echo 1 > /proc/$$/lock" in my /etc/init.d/afs for the
> clusters.  However, it does not seem to matter - the processes do not
> get locked.  On the other hand, they do not appear to migrate either.
> 
> The afsd processes appear as [name] in ps -f output... other processes
> that look the same appear to be kernel threads which wouldn't be migrated
> by MOSIX anyway, if I'm not totally mistaken.

I was wondering whether I should use runhome when launching the AFS 
daemon. Using mtop I don't see any indication that afsd (or anything 
else for that matter) has been migrated. Still, I wonder whether afsd 
being migrated could explain the erratic behavior I see.
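
One way to check whether the lock from the init script actually took 
effect might be to read the per-process lock file back. This is only a 
rough sketch: it assumes /proc/PID/lock is readable as well as writable, 
that the daemon's command name in /proc/PID/stat is "afsd", and the 
function name report_lock_flags is made up for illustration. It prints 
whatever the lock file contains rather than interpreting it.

```shell
#!/bin/sh
# Sketch: print the MOSIX/openMosix lock flag for every process whose
# command name matches $1. The second argument (default /proc) lets the
# function be pointed at a different proc-style tree.
report_lock_flags() {
    name=$1
    proc_root=${2:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/stat" ] || continue
        # The second field of /proc/PID/stat is the command name in parentheses.
        comm=$(sed -e 's/^[0-9]* (\(.*\)) .*$/\1/' "$dir/stat")
        [ "$comm" = "$name" ] || continue
        pid=${dir##*/}
        if [ -r "$dir/lock" ]; then
            # Assumption: the lock file can be read back; print it verbatim.
            echo "$pid lock=$(cat "$dir/lock")"
        else
            echo "$pid lock=unavailable"
        fi
    done
}
```

Running "report_lock_flags afsd" on the gateway would then list one line 
per matching process, which could be compared against what mtop shows.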

> If you have the opportunity to set up a test cluster with MOSIX instead
> of OpenMosix, I would like to hear your experiences w.r.t. AFS.

Me too. I need to find enough extra hardware to set this up. The only 
good news is that I might be able to steal enough resources during 
Christmas week to sort this out... I'll let you know what I find out.


Thanks for your response - I appreciate hearing that at least SOME 
configuration can be made to work.

-- 
=============================================================
Terry Gliedt     tpg@umich.edu       http://www.hps.com/~tpg/
Biostatistics, Univ of Michigan  Personal Email:  tpg@hps.com