[OpenAFS] 1.4.4 client on EL3: panic in afs_HashOutDcache
Derrick J Brashear
shadow@dementia.org
Wed, 11 Apr 2007 04:56:59 -0400 (EDT)
On Wed, 11 Apr 2007, Stephan Wiesand wrote:
> One of our systems panicked two times within 2 hours yesterday, at the same
> location in the OpenAFS client. I attached the kernel's last words below.
>
> This is an SL3 system, kernel 2.4.21-47.0.1.ELsmp, i686. The client build has
> two patches on top of 1.4.4: linux-task-pointer-safety-20070320 from CVS, and
> the one from
> https://lists.openafs.org/pipermail/openafs-devel/2007-March/014985.html
afs_HashOutDCache has
    /* if this guy is in the hash table, pull him out */
    if (adc->f.fid.Fid.Volume != 0) {
        i = DCHash(&adc->f.fid, adc->f.chunk);
        us = afs_dchashTbl[i];
        if (us == adc->index) {
            ..
        } else {
            /* somewhere on the chain */
            while (us != NULLIDX) {
                if (afs_dcnextTbl[us] == adc->index) {
                    /* found item pointing at the one to delete */
                    afs_dcnextTbl[us] = afs_dcnextTbl[adc->index];
                    break;
                }
                us = afs_dcnextTbl[us];
            }
            if (us == NULLIDX)
                osi_Panic("dcache hc");
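So basically the walk reached the end of the chain without finding the
entry: you appear to have a dcache entry with a nonzero volume that is not
on its hash chain, i.e. an unhashed dcache entry. Either there's a locking
bug or something is becoming erroneously unhashed.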
How reproducible is it?
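To make the failure mode concrete, here is a minimal standalone sketch of the same index-linked chain removal. It is not the OpenAFS code: hashTbl, nextTbl, tables_init, hash_in, hash_out, the table sizes and the NULLIDX value are all made up for illustration, and a -1 return stands in for the place where the real code calls osi_Panic("dcache hc").

```c
#include <assert.h>

#define NULLIDX  (-1)  /* end-of-chain marker (value assumed for this sketch) */
#define NBUCKETS 4     /* hypothetical table sizes */
#define NENTRIES 8

static int hashTbl[NBUCKETS];  /* bucket -> index of first entry on its chain */
static int nextTbl[NENTRIES];  /* entry index -> next entry on the same chain */

/* Mark every bucket and every next pointer as empty. */
static void tables_init(void)
{
    int i;
    for (i = 0; i < NBUCKETS; i++)
        hashTbl[i] = NULLIDX;
    for (i = 0; i < NENTRIES; i++)
        nextTbl[i] = NULLIDX;
}

/* Push entry idx onto the front of the chain for bucket. */
static void hash_in(int bucket, int idx)
{
    assert(bucket >= 0 && bucket < NBUCKETS);
    assert(idx >= 0 && idx < NENTRIES);
    nextTbl[idx] = hashTbl[bucket];
    hashTbl[bucket] = idx;
}

/*
 * Unlink entry idx from the chain for bucket.
 * Returns 0 on success, or -1 if idx is not on the chain, which is the
 * condition where the quoted code panics.
 */
static int hash_out(int bucket, int idx)
{
    int us = hashTbl[bucket];

    if (us == idx) {
        /* first entry on the chain */
        hashTbl[bucket] = nextTbl[idx];
        return 0;
    }
    /* somewhere on the chain */
    while (us != NULLIDX) {
        if (nextTbl[us] == idx) {
            /* found the entry pointing at the one to delete */
            nextTbl[us] = nextTbl[idx];
            return 0;
        }
        us = nextTbl[us];
    }
    return -1;  /* walked off the end: entry was never hashed, or already unhashed */
}
```

In this sketch, hashing entries 3, 5 and 6 into bucket 1 and then calling hash_out(1, 5) twice succeeds the first time and returns -1 the second time. That second call is the "already unhashed" case; two threads racing through the removal without proper locking could produce the same result.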
> dcache hc<1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
> f8a6da50
> *pde = 13ad0001
> *pte = 00000000
> Oops: 0002
> panfs nfs lockd sunrpc openafs netconsole 3c59x mii microcode ohci1394 ieee1394 loop keybdev mousedev hid input usb-uhci usbcore ext3 jbd lvm-mod aic7xxx disk
> CPU:    3
> EIP:    0060:[<f8a6da50>]    Tainted: P
> EFLAGS: 00210282
>
> EIP is at osi_Panic [openafs] 0x20 (2.4.21-47.0.1.ELsmp/i686)
> eax: 00000009   ebx: f8b74000   ecx: 00200046   edx: c0388e98
> esi: f8c43080   edi: 00027b31   ebp: 00000002   esp: f2a39e04
> ds: 0068   es: 0068   ss: 0068
> Process afs_cachetrim (pid: 980, stackpage=f2a39000)
> Stack: f8a9365b 00000001 00000000 f8c43080 f8c43080 00027b31 00000002 f8a2d9ef
>        f8a9365b 00000001 00000000 f8c43080 f8c43080 ed689680 00027b31 f8a2d6a8
>        f8c43080 00000000 00000000 00000937 f2a39e94 c0123410 00000000 116c94c6
> Call Trace:
> [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e04)
> [<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf2a39e20)
> [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e24)
> [<f8a2d6a8>] afs_GetDownD [openafs] 0x528 (0xf2a39e40)
> [<c0123410>] load_balance [kernel] 0x30 (0xf2a39e58)
> [<f8a2cd2e>] afs_CacheTruncateDaemon [openafs] 0x12e (0xf2a39fa0)
> [<f8a7f9f0>] afsd_thread [openafs] 0x3e0 (0xf2a39fe0)
> [<f8a7f610>] afsd_thread [openafs] 0x0 (0xf2a39fe4)
> [<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf2a39ff0)