[OpenAFS] Kernel oops in client machines

Craig Gallek cgallek@gmail.com
Fri, 08 Apr 2005 22:04:40 -0400


I've recently set up an AFS cell on a new server.  The file system works
beautifully on that machine.  However, all of the client machines in my
network are experiencing the same kernel oops while using the file
system (included below).  The problem doesn't happen right away, only
after a long set of quick read/write operations such as an ls or du of a
large directory.

Since this problem is not occurring on the server machine, and all of
the machines in my network have similar 2.6.10 kernel configurations,
I'm guessing this is a network-related issue.  The problem is still
reproducible with all of my firewalls turned off.

I believe the problem may be related to the issue in the following
thread:
https://lists.openafs.org/pipermail/openafs-info/2004-November/015446.html

I am also using version 1.3.74 of the software from Debian.  When I
first started investigating this problem, I found that localhost was, in
fact, showing up in the VLDB with the vos listaddrs command.  I created
the necessary NetRestrict and NetInfo files, and the correct information
is now displayed via vos listaddrs.  However, I had already created all
of the volumes on my system before I noticed this problem.  My theory is
that the localhost address is still, in some way, associated with the
volumes that I had created.  I'm not sure how to confirm or remedy this
problem though.
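
One way to double-check for leftover loopback entries might be a short
script along these lines.  This is only a sketch: it assumes the output
of vos listaddrs is plain text with one hostname or address per line,
and the helper name find_loopback_entries and the sample text are made
up for illustration, not taken from my cell.

```python
import ipaddress


def find_loopback_entries(listaddrs_output):
    """Return the lines of `vos listaddrs`-style text that name a loopback host.

    Assumption: each line holds a hostname or an IP address, possibly
    alongside other whitespace-separated tokens.
    """
    suspicious = []
    for line in listaddrs_output.splitlines():
        for token in line.split():
            # Hostnames that conventionally resolve to the loopback interface.
            if token.lower() in ("localhost", "localhost.localdomain"):
                suspicious.append(line.strip())
                break
            try:
                if ipaddress.ip_address(token).is_loopback:
                    suspicious.append(line.strip())
                    break
            except ValueError:
                continue  # token is not an IP literal; keep scanning
    return suspicious


# Hypothetical sample text standing in for real command output.
sample = "localhost\nafs1.example.org\n127.0.0.1\n192.0.2.10\n"
print(find_loopback_entries(sample))
```

Feeding it the saved output of vos listaddrs should give an empty list
once the NetInfo and NetRestrict changes have taken effect.  It says
nothing about whether the existing volumes still refer to the old
address.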

Any help would be much appreciated.
Craig

Unable to handle kernel paging request at virtual address 0007752c
 printing eip:
c01304c1
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: libafs fglrx ipt_REJECT ipt_state ipt_limit iptable_filter ip_tables
CPU:    0
EIP:    0060:[bit_waitqueue+49/80]    Tainted: P      VLI
EFLAGS: 00010206   (2.6.10)
EIP is at bit_waitqueue+0x31/0x50
eax: 72254383   ebx: 00077224   ecx: 00000020   edx: 00000003
esi: e0b8021c   edi: e098f220   ebp: 00000000   esp: d299bbf4
ds: 007b   es: 007b   ss: 0068
Process du (pid: 2508, threadinfo=d299a000 task=d399c520)
Stack: 00000003 c0130473 e0b800f8 e0a54280 c0173130 e0b800f8 00000001 c0238e81
       e0b800f8 e0b800f8 c0173353 e0b800f8 c0405404 d299a000 e0a1e9a8 e0b800f8
       00000000 e0b800f8 e0c8a538 e0a2627f e0b800f8 d299bd04 0000000c d299bdd8
Call Trace:
 [wake_up_bit+19/48] wake_up_bit+0x13/0x30
 [pg0+542454400/1068487680] afs_delete_inode+0x0/0xb0 [libafs]
 [generic_delete_inode+208/320] generic_delete_inode+0xd0/0x140
 [_atomic_dec_and_lock+49/80] _atomic_dec_and_lock+0x31/0x50
 [iput+99/144] iput+0x63/0x90
 [pg0+542235048/1068487680] afs_PutVCache+0x88/0x130 [libafs]
 [pg0+542265983/1068487680] afs_DoBulkStat+0x44f/0x1a30 [libafs]
 [pg0+542162969/1068487680] FindItem+0x69/0x120 [libafs]
 [pg0+542162195/1068487680] afs_dir_LookupOffset+0x73/0x80 [libafs]
 [pg0+542275292/1068487680] afs_lookup+0xe7c/0x146f [libafs]
 [pg0+542444397/1068487680] crget+0x4d/0xc0 [libafs]
 [pg0+542467178/1068487680] afs_linux_lookup+0x6a/0x1d0 [libafs]
 [real_lookup+209/256] real_lookup+0xd1/0x100
 [do_lookup+150/176] do_lookup+0x96/0xb0
 [link_path_walk+1695/3360] link_path_walk+0x69f/0xd20
 [pg0+542098491/1068487680] DRead+0xdb/0x400 [libafs]
 [pg0+542162768/1068487680] afs_dir_GetBlob+0x20/0x40 [libafs]
 [path_lookup+145/352] path_lookup+0x91/0x160
 [__user_walk+51/96] __user_walk+0x33/0x60
 [vfs_lstat+28/96] vfs_lstat+0x1c/0x60
 [update_atime+217/224] update_atime+0xd9/0xe0
 [sys_lstat64+27/64] sys_lstat64+0x1b/0x40
 [vfs_readdir+143/160] vfs_readdir+0x8f/0xa0
 [filldir64+0/256] filldir64+0x0/0x100
 [sys_getdents64+160/170] sys_getdents64+0xa0/0xaa
 [filldir64+0/256] filldir64+0x0/0x100
 [syscall_call+7/11] syscall_call+0x7/0xb
Code: 40 8b 1d 90 1e 4e c0 c1 e9 0c c1 e1 05 01 d9 8b 09 c1 e0 05 09 d0 69 c0 01 00 37 9e c1 e9 1d 8b 1c 8d ac 8c 4d c0 b9 20 00 00 00 <8b> 93 08 03 00 00 29 d1 8b 93 00 03 00 00 d3 e8 8d 04 40 5b 8d