[OpenAFS] 1.3.85 Still Crashing w/ Fedora 3 (Linux 2.6.11)

Jason McCormick jasonmc@cert.org
Mon, 18 Jul 2005 22:50:08 -0400


--On Monday, July 18, 2005 10:17:32 PM -0400 chas@cmf.nrl.navy.mil wrote:

> In message <782E00341DB4AC458007D08A@rowan.wv.cc.cmu.edu>, Jason McCormick
> writes:
>> before the kernel blows up. It prints the first hex address of what looks
>> like a memory location and then dies hard.  Looks like:
>> 
>> [<c01bd35c>]
> 
> it's useful if you manage to grab a symbol table before the crash.
> after you load afs, save the output from ksyms -a and you should
> be able to convert the eip to something useful.

I'll do this tomorrow to see if I can get a better result.
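
For reference, the lookup described above (mapping a raw address from the
oops onto a saved symbol table) can be sketched roughly as follows.  This is
only an illustration, not a tested tool: it assumes the saved table has one
symbol per line with a hex address in the first column, either as
"address name [module]" or as "address type name [module]" (the layout that
/proc/kallsyms normally uses on 2.6 kernels).  The function names and the
sample symbols are hypothetical; only the address c01bd35c comes from the
trace quoted above.

```python
import bisect


def load_symbols(lines):
    """Parse symbol-table lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            addr = int(fields[0], 16)
        except ValueError:
            continue  # header or other non-symbol line
        # Assumption: a one-character second column is a symbol type
        # (kallsyms-style layout); otherwise the second column is the name.
        if len(fields) >= 3 and len(fields[1]) == 1:
            name = fields[2]
        else:
            name = fields[1]
        symbols.append((addr, name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) of the closest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    base, name = symbols[i]
    return name, addr - base


if __name__ == "__main__":
    # Hypothetical table; a real one would be read from the saved file.
    sample = [
        "c01bd300 T example_func_a",
        "c01bd400 T example_func_b",
    ]
    table = load_symbols(sample)
    hit = resolve(table, 0xC01BD35C)
    if hit is not None:
        print("%s+0x%x" % hit)
```

With a real table saved after loading afs, the same resolve() call would name
the function containing the faulting address, along with the offset into it.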


> 
>> I'm nursing a theory that the bug only triggers in combination with
>> VMware.  I tried for a few days to crash one of the test servers in our
> 
> i don't know how vmware works with the linux kernel.  do you run a 
> special version of the linux kernel?

VMware has its own kernel modules that you compile for each kernel.
Basically they provide hooks into system-level operations like networking,
etc.  I've used VMware for years and we use VMware with AFS 1.2-based
systems on AS3 and haven't had problems with it.
 
>> There's also a filesystem unmount bug that I've seen sporadically (about
>> 50% of the time), sometimes with a message of:
> 
> on the vmware system or both?  i have seen this same bug but very rarely.
> i am unable to duplicate with regularity.

This occurs on all hosts -- at least the non-oops error.  I've seen the
oops on 3 different machines (2 of which don't run VMware at all).  I don't
have anything running in VMware with this version of AFS at the present
time.  This oops was taken from one of the build/development hosts.  I
cannot reproduce it reliably, but it's not infrequent either.

> i could send along a patch that would give a little more info about
> the inode if you would try it.

Sure.

-- Jason