[OpenAFS] 1.3.85 Still Crashing w/ Fedora 3 (Linux 2.6.11)

chas@cmf.nrl.navy.mil chas@cmf.nrl.navy.mil
Mon, 18 Jul 2005 22:17:32 -0400

In message <782E00341DB4AC458007D08A@rowan.wv.cc.cmu.edu>,Jason McCormick writes:
>before the kernel blows up. It prints the first hex address of what looks
>like a memory location and then dies hard.  Looks like:

it's useful if you manage to grab a symbol table before the crash.
after you load afs, save the output from ksyms -a and you should
be able to convert the eip to a symbol name and offset.
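
A minimal sketch of that lookup in Python, for illustration only. It assumes
the saved symbol table has one symbol per line in the form
`address name [module]`, optionally with a one-letter type column after the
address (as in /proc/kallsyms). The helper names `load_symbols` and `resolve`
are hypothetical, and the sample base addresses are back-computed from the
`symbol+offset` entries in the unmount trace quoted in this message, not
taken from a real symbol dump.

```python
import bisect

# Sample table. Base addresses are derived from the quoted trace, e.g.
# 0xf94b4ec8 - 0xd = 0xf94b4ebb for afs_destroy_inodecache.
SAMPLE = """\
c01469b5 kmem_cache_destroy
f94b4ebb afs_destroy_inodecache [libafs]
f94c6c00 cleanup_module [libafs]
"""


def load_symbols(text):
    """Parse symbol-table lines into a sorted list of (addr, name, module)."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            addr = int(fields[0], 16)
        except ValueError:
            continue  # header or other non-symbol line
        # Assumption: a one-character second column is a type letter.
        name_idx = 2 if len(fields) >= 3 and len(fields[1]) == 1 else 1
        name = fields[name_idx]
        module = fields[name_idx + 1] if len(fields) > name_idx + 1 else ""
        symbols.append((addr, name, module))
    symbols.sort()
    return symbols


def resolve(symbols, eip):
    """Return (name, offset, module) for the nearest symbol at or below eip."""
    addrs = [s[0] for s in symbols]
    i = bisect.bisect_right(addrs, eip) - 1
    if i < 0:
        return None  # eip is below every known symbol
    addr, name, module = symbols[i]
    return name, eip - addr, module


if __name__ == "__main__":
    syms = load_symbols(SAMPLE)
    name, offset, module = resolve(syms, 0xF94B4EC8)
    print(f"{name}+{offset:#x} {module}")
```

The lookup picks the nearest symbol at or below the address, so an address
past the end of the last known function still resolves to that function. A
large offset is a hint that the table is incomplete or was saved before the
module was loaded.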

> I'm nursing a theory that the bug only triggers in combination with
>VMware.  I tried for a few days to crash one of the test servers in our

i don't know how vmware works with the linux kernel.  do you run a
special version of the linux kernel?

> There's also a filesystem unmount bug that I've seen sporadically (about
>50% of the time), sometimes with a message of:

on the vmware system or both?  i have seen this same bug, but only very
rarely, and i am unable to reproduce it reliably.

>Unmounting file systems:  Failed to invalidate all pages on inode 0xf5f85800
>This is sometimes (but not always) accompanied by an oops. That oops is:
>slab error in kmem_cache_destroy(): cache `afs_inode_cache': Can't free all
>[<c0146a91>] kmem_cache_destroy+0xdc/0x132
>[<f94b4ec8>] afs_destroy_inodecache+0xd/0x25 [libafs]
>[<f94c6c19>] cleanup_module+0x19/0x25 [libafs]
>[<c0136a25>] sys_delete_module+0x148/0x166
>[<c0151080>] unmap_vma_list+0xe/0x17
>[<c01513e1>] do_munmap+0xff/0x143
>[<c0103f0f>] syscall_call+0x7/0xb

if all the inodes don't go away, then the kmem_cache cannot be destroyed.
the only reason an inode wouldn't go away is that there is an inode with
a refcount > 1.  that should be unlikely, since shutting down the
filesystem should not be possible while there is an outstanding reference.
i could send along a patch that would give a little more info about
the inode if you would try it.
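
The reasoning above can be illustrated with a toy model in Python. This is
not kernel code and not the patch offered above. The classes `ToyInode` and
`ToyInodeCache` are hypothetical, and the model simplifies reference counting
to "an inode is freed when its count reaches zero". It only shows why one
leaked reference is enough to make destroying the cache fail, and what kind of
per-inode detail would help track it down.

```python
class ToyInode:
    """Stand-in for an inode: an id plus a reference count."""

    def __init__(self, ino):
        self.ino = ino
        self.count = 1  # reference held by whoever allocated it


class ToyInodeCache:
    """Stand-in for a slab cache that tracks its live objects."""

    def __init__(self, name):
        self.name = name
        self.live = {}

    def alloc(self, ino):
        inode = ToyInode(ino)
        self.live[ino] = inode
        return inode

    def get(self, inode):
        """Take an extra reference."""
        inode.count += 1

    def put(self, inode):
        """Drop a reference; free the object when the last one goes away."""
        inode.count -= 1
        if inode.count == 0:
            del self.live[inode.ino]

    def destroy(self):
        """Refuse to destroy the cache while any object is still live."""
        if self.live:
            leaked = ", ".join(
                f"{ino:#x} (count={inode.count})"
                for ino, inode in sorted(self.live.items())
            )
            raise RuntimeError(
                f"cache {self.name}: can't free all objects: {leaked}"
            )
        return True


if __name__ == "__main__":
    cache = ToyInodeCache("afs_inode_cache")
    inode = cache.alloc(0xF5F85800)
    cache.get(inode)   # someone takes a second reference ...
    cache.put(inode)   # ... but only one reference is dropped at unmount
    try:
        cache.destroy()
    except RuntimeError as err:
        print(err)
```

In the toy model the fix is simply the missing `put`. In the real module the
interesting question is which code path still holds the reference when the
module is unloaded.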