[OpenAFS-devel] possible recursive locking detected
Chas Williams (CONTRACTOR)
chas@cmf.nrl.navy.mil
Tue, 15 Jul 2008 10:52:07 -0400
In message <20080714234544.GB16373@excalibur.hozed.org>,Troy Benjegerdes writes:
>What does anyone make of this?
>Kernel 2.6.26-rc8, openafs-cvs HEAD
>
>This particular machine used to deadlock userspace without the lock
>debugging enabled. It appears to be okay now.
i would guess that it is probably right: there might be a recursive
lock somewhere. it looks like it is in the rx layer, with the
afs_mutex_enter() address pointing to one of the rx locks. this probably
highlights the need for afs_mutex_enter() to be static inline so that the
lock debugging reports the real caller instead of afs_mutex_enter() itself.
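
To illustrate why inlining the wrapper changes what the debug output reports, here is a minimal userspace sketch. It is not OpenAFS or kernel code: every `demo_*` name is hypothetical, `demo_lock()` merely stands in for a lock primitive that records its caller's return address, and the attributes and `__builtin_return_address()` are GCC/Clang extensions.

```c
#include <assert.h>

/* Return address recorded by the most recent demo_lock() call. */
static void *demo_last_ip;
static volatile int demo_depth;

/* Hypothetical stand-in for a lock primitive that remembers who called it,
 * the way a lock debugger attributes an acquisition to a code address. */
__attribute__((noinline)) static void demo_lock(void)
{
    demo_last_ip = __builtin_return_address(0);
    demo_depth++;
}

/* Out-of-line wrapper: demo_lock() always returns into this function. */
__attribute__((noinline)) static void demo_enter_outofline(void)
{
    demo_lock();
    demo_depth++;   /* work after the call so it cannot become a tail call */
}

/* Inlined wrapper: demo_lock() returns into whoever used the wrapper. */
static inline __attribute__((always_inline)) void demo_enter_inline(void)
{
    demo_lock();
    demo_depth++;
}

/* Two distinct call sites; the bodies differ so they are never merged. */
__attribute__((noinline)) static void *demo_site_a(int use_inline)
{
    if (use_inline)
        demo_enter_inline();
    else
        demo_enter_outofline();
    demo_depth += 1;
    return demo_last_ip;
}

__attribute__((noinline)) static void *demo_site_b(int use_inline)
{
    if (use_inline)
        demo_enter_inline();
    else
        demo_enter_outofline();
    demo_depth += 2;
    return demo_last_ip;
}

/* Out-of-line: both sites report the same address (inside the wrapper).
 * Inline: each site reports its own address. */
static void demo_selfcheck(void)
{
    assert(demo_site_a(0) == demo_site_b(0));
    assert(demo_site_a(1) != demo_site_b(1));
}
```

With the out-of-line wrapper every acquisition is attributed to the same address, which is the situation in the trace, where both locks are reported at .afs_mutex_enter+0x24.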
the linux rx mutex layer has code to try to catch this, so i am a little
puzzled that you don't see an osi_Panic() after this.
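
As a rough picture of the kind of check meant here, the following userspace sketch tracks the owner of a mutex and refuses a second acquisition by the same thread. It is an assumption-laden illustration using pthreads, not the actual rx mutex code: the `demo_*` names are hypothetical, and it returns -1 where a kernel layer would call something like osi_Panic().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a mutex wrapper with owner tracking. */
typedef struct demo_kmutex {
    pthread_mutex_t mutex;
    _Atomic(void *) owner;      /* NULL when free, else a per-thread token */
} demo_kmutex_t;

/* The address of a thread-local object is unique per live thread. */
static _Thread_local char demo_thread_token;

static void *demo_self(void)
{
    return &demo_thread_token;
}

static void demo_mutex_init(demo_kmutex_t *l)
{
    assert(l != NULL);
    pthread_mutex_init(&l->mutex, NULL);
    atomic_store(&l->owner, NULL);
}

/* Returns 0 on success, -1 if the calling thread already holds the mutex.
 * The owner is checked before blocking, so a self-deadlock is reported
 * instead of hanging. */
static int demo_mutex_enter(demo_kmutex_t *l)
{
    assert(l != NULL);
    if (atomic_load(&l->owner) == demo_self())
        return -1;              /* recursive acquisition detected */
    pthread_mutex_lock(&l->mutex);
    atomic_store(&l->owner, demo_self());
    return 0;
}

/* Returns 0 on success, -1 if the caller does not hold the mutex. */
static int demo_mutex_exit(demo_kmutex_t *l)
{
    assert(l != NULL);
    if (atomic_load(&l->owner) != demo_self())
        return -1;
    atomic_store(&l->owner, NULL);
    pthread_mutex_unlock(&l->mutex);
    return 0;
}
```

A check like this only fires when the same mutex is taken twice by one thread. Taking two different mutexes in a nested fashion passes it, which may be one reason a lock debugger that groups locks into classes can warn without any panic following; that explanation is a guess, not something the trace confirms.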
the debug output looks a bit garbled. i tried to correct for this below.
it would be nice to find out which source code line corresponds to
.rxi_ReapConnections+0x1b0. that would tell us which lock is the problem.
>[ 37.850406] =============================================
>[ 37.947534] [ INFO: possible recursive locking detected ]
>[ 38.012120] 2.6.26-rc8 #10
>[ 38.044465] ---------------------------------------------
>[ 38.109049] afsd/3684 is trying to acquire lock:
>[ 38.164275] (&l->mutex){--..}, at: [<d00000000014f0f4>] .afs_mutex_enter+0x24/0x70 [libafs]
>[ 38.282318]
>[ 38.282319] but task is already holding lock:
>[ 38.352206] (&l->mutex){--..}, at: [<d00000000014f0f4>] .afs_mutex_enter+0x24/0x70 [libafs]
>[ 38.470250]
>[ 38.470251] other info that might help us debug this:
>[ 38.548459] 2 locks held by afsd/3684:
>[ 38.593282] #0: (afs_global_lock){--..}, at: [<d0000000001583fc>] .osi_linux_alloc+0x17c/0x4c0 [libafs]
>[ 38.724950] #1: (&l->mutex){--..}, at: [<d00000000014f0f4>] .afs_mutex_enter+0x24/0x70 [libafs]
>[ 38.848298]
>[ 38.848300] stack backtrace:
>[ 38.900507] Call Trace:
>[ 38.929730] [c0000007f95330f0] [c000000000011144] .show_stack+0x64/0x210 (unreliable)
>[ 39.040389] [c0000007f95331b0] [c000000000011310] .dump_stack+0x20/0x40
>[ 39.119741] [c0000007f9533230] [c000000000098c90] .__lock_acquire+0xe20/0x1270
>[ 39.206376] [c0000007f9533330] [c0000000000991b4] .lock_acquire+0xd4/0x120
>[ 39.288851] [c0000007f95333f0] [c00000000038ce3c] .mutex_lock_nested+0xfc/0x420
>[ 39.376522] [c0000007f95334f0] [d00000000014f0f4] .afs_mutex_enter+0x24/0x70 [libafs]
>[ 39.487078] [c0000007f9533570] [d0000000001462b0] .rxi_ReapConnections+0x1b0/0x4e0 [libafs]
>[ 39.603871] [c0000007f9533690] [d00000000014acec] .rx_StartServer+0xac/0xf0 [libafs]
>[ 39.713388] [c0000007f9533740] [d000000000173c58] .afs_ResourceInit+0x1b8/0x1f0 [libafs]
>[ 39.827059] [c0000007f95337d0] [d0000000001b8024] .afs_DaemonOp+0x2f4/0x310 [libafs]
>[ 39.936573] [c0000007f9533970] [d0000000001b8b00] .afs_syscall_call+0x270/0x1ce0 [libafs]
>[ 40.051290] [c0000007f9533aa0] [d00000000013ebec] .afs_syscall+0x14c/0x6a0 [libafs]
>[ 40.159763] [c0000007f9533bc0] [d00000000015b248] .afs_unlocked_ioctl+0xc8/0x110 [libafs]
>[ 40.274476] [c0000007f9533c80] [c000000000160a40] .proc_reg_compat_ioctl+0xb0/0x100
>[ 40.382952] [c0000007f9533d30] [c00000000014a5d0] .compat_sys_ioctl+0xe0/0x500
>[ 40.469585] [c0000007f9533e30] [c0000000000086d4] syscall_exit+0x0/0x40