[OpenAFS] FC3 + AFS 1.3.7x Problems (was: rough timeline for 1.4.x)

Jason McCormick jasonmc@cert.org
Mon, 13 Dec 2004 10:21:00 -0500


Note: I've changed the subject to "FC3 + AFS 1.3.7x Problems" to more
accurately reflect the discussion.

--On Friday, December 10, 2004 05:11:04 PM -0500 Derrick J Brashear
<shadow@dementia.org> wrote:

> On Fri, 10 Dec 2004, Jason McCormick wrote:
> 
>>  * Inability to unmount /usr/vice/cache (or / if it's not a separate
>> partition).
> 
> this implies one of the "special" file opens is somehow being leaked.
> (inside the kernel)
> 
> Is this e.g.
> umount /afs
> afsd -shutdown
> rmmod libafs
> ?

We're using the default afsd init script provided by the releases.  The
only difference is that we patch it (as shown in my last post) to remove
the calls to the module loading scripts.  Doing it by hand (as shown
above) produces the same "device is busy" error when you try to unmount
the cache.
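
For reference, the by-hand sequence can be sketched roughly as below.  This
is only an illustration: the function names, the DRY_RUN switch and the
CACHE_DIR variable are made up for this sketch and are not part of the afsd
init script, and placing the cache unmount last is an assumption about the
order.  It defaults to printing the steps rather than running them.

```shell
#!/bin/sh
# Sketch of the manual AFS client shutdown discussed above.
# DRY_RUN, CACHE_DIR, run_step and afs_manual_shutdown are hypothetical
# names used only for illustration.
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print steps only (default)
CACHE_DIR="${CACHE_DIR:-/usr/vice/cache}"  # cache location named in this thread

# Print the command in dry-run mode, otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
        return 0
    fi
    echo "running: $*"
    "$@"
}

# Run the steps in order and stop at the first one that fails.
afs_manual_shutdown() {
    run_step umount /afs          || { echo "failed: umount /afs" >&2; return 1; }
    run_step afsd -shutdown       || { echo "failed: afsd -shutdown" >&2; return 1; }
    run_step rmmod libafs         || { echo "failed: rmmod libafs" >&2; return 1; }
    # Assumed last step; this is where "device is busy" shows up for us.
    run_step umount "$CACHE_DIR"  || { echo "failed: umount $CACHE_DIR" >&2; return 1; }
}
```

Calling afs_manual_shutdown with the default settings only lists the four
steps; running it as root with DRY_RUN=0 executes them and reports the first
step that fails.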

>>  * Accessing an AFS volume over our VPN results in an immediate kernel
>> panic.  The panic message reports many "Unable to handle kernel NULL
>> pointer dereference at virtual address" errors followed by "Recursive die()
>> failure, output suppressed" and "<0>Kernel panic - not syncing: Fatal
>> exception in interrupt".  This is present only on 1 of 2 laptops running
>> FC3, but is 100% repeatable on the failing laptop.
> 
> No oops, I assume.

No, that's all that prints out.  The above errors were from an unusually
verbose crash.  Generally the crash prints one line that just says
"Warning: kfree_skb on hard IRQ <address>" and then completely locks the
machine.

>>  * Copying large files (~450MB) into AFS from non-AFS partitions results
>> in a kernel oops.  
> 
> Screams stack overflow, but the backtrace is nonsensical. Recompile
> module with -fomit-frame-pointer?

I'll try to work on this today.  Is there a quick way to do this?

>>  * Random cache consistency problems.  
> 
> Ok. We fixed only one thing which might affect this, and I doubt it's it.

Is there further debugging we can do?

-- 
Jason McCormick
CERT Infrastructure Team
jasonmc@cert.org ** 412-268-7961