[OpenAFS] 1.4.4 client on EL3: panic in afs_HashOutDCache

Stephan Wiesand Stephan.Wiesand@desy.de
Wed, 11 Apr 2007 10:44:53 +0200 (CEST)


One of our systems panicked twice within two hours yesterday, at the 
same location in the OpenAFS client. The kernel's last words from both 
panics are included below.

This is an SL3 system, kernel 2.4.21-47.0.1.ELsmp, i686. The client build 
has two patches on top of 1.4.4: linux-task-pointer-safety-20070320 from 
CVS, and the one from
https://lists.openafs.org/pipermail/openafs-devel/2007-March/014985.html

The cache is on an ext3 filesystem:
# df -H /afs_cache
Filesystem             Size   Used  Avail Use% Mounted on
/dev/vg00/afs_cache   1.1GB  443MB  561MB  45% /afs_cache
# df -i /afs_cache
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/vg00/afs_cache   131072   21868  109204   17% /afs_cache
# fs getcach
AFS using 413392 of the cache's available 699000 1K byte blocks.

afsd args: -dynroot -fakestat -afsdb -nosettime -stat 2800 -dcache 2400 
-daemons 5 -volumes 128

At the time of both panics, the same user appears to have been 
transferring several files of roughly 100 MB each in parallel to a 
remote site, probably with rsync.

Any ideas what causes this and how to fix it?

Thanks,
   Stephan


dcache hc<1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
  printing eip: 
f8a6da50 
*pde = 13ad0001 
*pte = 00000000 
Oops: 0002 
panfs nfs lockd sunrpc openafs netconsole 3c59x mii microcode ohci1394 ieee1394 loop keybdev mousedev hid input usb-uhci usbcore ext3 jbd lvm-mod aic7xxx disk 
CPU:    3 
EIP:    0060:[<f8a6da50>]    Tainted: P 
EFLAGS: 00210282

EIP is at osi_Panic [openafs] 0x20 (2.4.21-47.0.1.ELsmp/i686) 
eax: 00000009   ebx: f8b74000   ecx: 00200046   edx: c0388e98 
esi: f8c43080   edi: 00027b31   ebp: 00000002   esp: f2a39e04 
ds: 0068   es: 0068   ss: 0068 
Process afs_cachetrim (pid: 980, stackpage=f2a39000) 
Stack: f8a9365b 00000001 00000000 f8c43080 f8c43080 00027b31 00000002 f8a2d9ef
        f8a9365b 00000001 00000000 f8c43080 f8c43080 ed689680 00027b31 f8a2d6a8
        f8c43080 00000000 00000000 00000937 f2a39e94 c0123410 00000000 116c94c6 
Call Trace:   [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e04) 
[<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf2a39e20) 
[<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e24) 
[<f8a2d6a8>] afs_GetDownD [openafs] 0x528 (0xf2a39e40) 
[<c0123410>] load_balance [kernel] 0x30 (0xf2a39e58) 
[<f8a2cd2e>] afs_CacheTruncateDaemon [openafs] 0x12e (0xf2a39fa0) 
[<f8a7f9f0>] afsd_thread [openafs] 0x3e0 (0xf2a39fe0) 
[<f8a7f610>] afsd_thread [openafs] 0x0 (0xf2a39fe4) 
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf2a39ff0)

Code: c6 05 00 00 00 00 00 83 c4 1c c3 90 8d 74 26 00 b8 4f 42 a9

Kernel panic: Fatal exception


dcache hc<1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
  printing eip: 
f8a6da50 
*pde = 2301d001 
*pte = 4ecaa067 
Oops: 0002 
panfs nfs lockd sunrpc openafs netconsole 3c59x mii microcode ohci1394 ieee1394 loop keybdev mousedev hid input usb-uhci usbcore ext3 jbd lvm-mod aic7xxx disk 
CPU:    2 
EIP:    0060:[<f8a6da50>]    Tainted: P 
EFLAGS: 00010282

EIP is at osi_Panic [openafs] 0x20 (2.4.21-47.0.1.ELsmp/i686) 
eax: 00000009   ebx: f8b74000   ecx: 00000046   edx: c0388e98 
esi: f8c159a0   edi: 0000a416   ebp: 00000002   esp: f3a2fe04 
ds: 0068   es: 0068   ss: 0068 
Process afs_cachetrim (pid: 987, stackpage=f3a2f000) 
Stack: f8a9365b 00000002 00000000 f8c159a0 f8c159a0 0000a416 00000002 f8a2d9ef
        f8a9365b 00000002 00000000 f8c159a0 f8c159a0 f8c15a14 0000a416 f8a2d6a8
        f8c159a0 00000000 00000000 00000000 00000001 f3a2fe74 00000000 463b3beb 
Call Trace:   [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf3a2fe04) 
[<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf3a2fe20) 
[<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf3a2fe24) 
[<f8a2d6a8>] afs_GetDownD [openafs] 0x528 (0xf3a2fe40) 
[<f8a2cd2e>] afs_CacheTruncateDaemon [openafs] 0x12e (0xf3a2ffa0) 
[<f8a7f9f0>] afsd_thread [openafs] 0x3e0 (0xf3a2ffe0) 
[<f8a7f610>] afsd_thread [openafs] 0x0 (0xf3a2ffe4) 
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf3a2fff0)

Code: c6 05 00 00 00 00 00 83 c4 1c c3 90 8d 74 26 00 b8 4f 42 a9

Kernel panic: Fatal exception
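
The two traces can be compared mechanically. Below is a minimal sketch in 
Python, assuming the 2.4-style frame format shown above; the helper name 
openafs_frames is made up for illustration. It keeps only frames in the 
openafs module and skips the .rodata.str1.1 entries, which are string 
addresses rather than code.

```python
import re

# Matches frames such as:
#   [<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf2a39e20)
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)")


def openafs_frames(trace):
    """Return (symbol, offset) pairs for code frames in the openafs module.

    Hypothetical helper: entries whose symbol starts with '.' (for example
    .rodata.str1.1) and frames from other modules are skipped.
    """
    frames = []
    for _addr, symbol, module, offset in FRAME.findall(trace):
        if module == "openafs" and not symbol.startswith("."):
            frames.append((symbol, offset))
    return frames


# Call traces copied from the two oopses above.
first = """
Call Trace:   [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e04)
[<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf2a39e20)
[<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf2a39e24)
[<f8a2d6a8>] afs_GetDownD [openafs] 0x528 (0xf2a39e40)
[<c0123410>] load_balance [kernel] 0x30 (0xf2a39e58)
[<f8a2cd2e>] afs_CacheTruncateDaemon [openafs] 0x12e (0xf2a39fa0)
[<f8a7f9f0>] afsd_thread [openafs] 0x3e0 (0xf2a39fe0)
[<f8a7f610>] afsd_thread [openafs] 0x0 (0xf2a39fe4)
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf2a39ff0)
"""

second = """
Call Trace:   [<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf3a2fe04)
[<f8a2d9ef>] afs_HashOutDCache [openafs] 0x7f (0xf3a2fe20)
[<f8a9365b>] .rodata.str1.1 [openafs] 0x11f (0xf3a2fe24)
[<f8a2d6a8>] afs_GetDownD [openafs] 0x528 (0xf3a2fe40)
[<f8a2cd2e>] afs_CacheTruncateDaemon [openafs] 0x12e (0xf3a2ffa0)
[<f8a7f9f0>] afsd_thread [openafs] 0x3e0 (0xf3a2ffe0)
[<f8a7f610>] afsd_thread [openafs] 0x0 (0xf3a2ffe4)
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf3a2fff0)
"""

same_path = openafs_frames(first) == openafs_frames(second)
print("same openafs call path:", same_path)
for symbol, offset in openafs_frames(first):
    print(f"  {symbol}+{offset}")
```

Comparing symbol and offset rather than the raw stack addresses keeps the 
check independent of which afs_cachetrim stack page was in use; the 
load_balance [kernel] entry in the first trace is dropped by the module 
filter.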

-- 
Stephan Wiesand
   DESY - DV -
   Platanenallee 6
   15738 Zeuthen, Germany