[OpenAFS-devel] 1.2.10 linux kernel hang during AFS backups

Joseph H. Buehler aspam@cox.net
Sat, 28 Aug 2004 16:35:11 -0400


I am having a serious problem with 3 different AFS fileservers
hanging during AFS backups and would appreciate some help from
someone who knows the Linux/AFS internals.  This happens basically
every night during AFS backups, on one or more of the 3
machines.

When this happens, the only access to the machine is through the
Alt-SysRq keys on the console.  Below are excerpts from
the output of Alt-SysRq-t (slightly mangled because it
was captured with tcpdump using netconsole).  My guess
is that kswapd is deadlocked, so I show only the three
processes that are in Afs_Lock_Obtain (the full
output is too large to post).
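
For anyone who wants to pick the same kind of excerpts out of a full
Alt-SysRq-t dump, here is a minimal sketch in Python.  It is not
necessarily how the excerpts below were produced.  The names tasks_in,
TASK_RE and FRAME_RE are made up for illustration, and the sketch
assumes each task header looks like the ones below (name, state
letter, 8-digit hex address, then three numbers, the second of which
is taken to be the PID).

```python
import re

# Task header, e.g. "<5> kswapd        S DD360780   784     7      1".
# Assumption: the second of the three trailing numbers is the PID.
TASK_RE = re.compile(
    r"^(?:<\d+>\s*)?(\S+)\s+([A-Z])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\s+\d+"
)

# One stack frame, e.g. "[<e0e9015c>] Afs_Lock_Obtain [libafs-..."
FRAME_RE = re.compile(r"\[<[0-9A-Fa-f]+>\]\s+(\S+)")


def tasks_in(symbol, lines):
    """Return (name, pid) for each task whose trace contains `symbol`."""
    hits = []
    current = None  # task whose trace is currently being read
    for line in lines:
        header = TASK_RE.match(line)
        if header:
            current = (header.group(1), int(header.group(3)))
            continue
        if current is None:
            continue  # frame lines before the first task header
        if symbol in FRAME_RE.findall(line) and current not in hits:
            hits.append(current)
    return hits
```

Calling tasks_in("Afs_Lock_Obtain", f) on an open file f holding the
captured dump would list the tasks to look at.  Task names containing
spaces would not match this header pattern.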

The machines are all dual-processor.

Any expert opinions?
-- 
Joe Buehler

<5> kswapd        S DD360780   784     7      1
<5> Call Trace:   [<e0ed4a76>] afs_osi_Sleep [libafs-2
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0f009b2>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0e9015c>] Afs_Lock_Obtain [libafs-2.4.20-19.9-i
<5> [<e0f009b2>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<c024a766>] udp_rcv [kernel] 0x226 (0xdffefcec))
<5> [<e0e8c5af>] afs_GetDCache [libafs-2.4.20-19.9-i68
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<c01b5197>] req_new_io [kernel] 0x67 (0xdffefd34)
<5> [<c01b588b>] __make_request [kernel] 0x48b (0xdffe
<5> [<c01b4fe8>] locate_hd_struct [kernel] 0x38 (0xdff
<5> [<c01b4fe8>] locate_hd_struct [kernel] 0x38 (0xdff
<5> [<c01b5197>] req_new_io [kernel] 0x67 (0xdffefd9c)
<5> [<e0eae3ef>] afs_UFSWrite [libafs-2.4.20-19.9-i686
<5> [<c01b5b7a>] generic_make_request [kernel] 0xda (0
<5> [<e08b39d0>] lvm_push_callback [lvm-mod] 0xa0 (0xd
<5> [<e08b3585>] lvm_map [lvm-mod] 0x125 (0xdffefe50))
<5> [<e0ed88c9>] afs_linux_writepage_sync [libafs-2.4.
<5> [<c014f9f1>] try_to_unmap_one [kernel] 0x171 (0xdf
<5> [<e0eff72c>] afs_global_lock [libafs-2.4.20-19.9-i
<5> [<e0ed86b1>] afs_linux_writepage [libafs-2.4.20-19
<5> [<c01460aa>] launder_page [kernel] 0x58a (0xdffeff
<5> [<e0ed8620>] afs_linux_writepage [libafs-2.4.20-19
<5> [<c01475dc>] rebalance_dirty_zone [kernel] 0x9c (0
<5> [<c0147704>] rebalance_inactive_zone [kernel] 0xa4
<5> [<c0147798>] rebalance_inactive [kernel] 0x48 (0xd
<5> [<c01478e1>] do_try_to_free_pages_kswapd [kernel]
<5> [<c0147b73>] kswapd [kernel] 0x83 (0xdffeffd4))
<5> [<c0147af0>] kswapd [kernel] 0x0 (0xdffeffe4))
<5> [<c010759d>] kernel_thread_helper [kernel] 0x5 (0x

<5> afs_backgroun S C03BF880  3060   872      1
<5> Call Trace:   [<e0ed4a76>] afs_osi_Sleep [libafs-2
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0f009b1>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0e901fc>] Afs_Lock_Obtain [libafs-2.4.20-19.9-i
<5> [<e0f009b1>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0ed8620>] afs_linux_writepage [libafs-2.4.20-19
<5> [<e0eff72c>] afs_global_lock [libafs-2.4.20-19.9-i
<5> [<e0e95c5e>] afs_StoreAllSegments [libafs-2.4.20-1
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<c011dca6>] load_balance [kernel] 0x36 (0xd64abea
<5> [<c011dca6>] load_balance [kernel] 0x36 (0xd64abed
<5> [<c011e3ef>] schedule [kernel] 0x19f (0xd64abf0c))
<5> [<e0ead778>] afs_StoreOnLastReference [libafs-2.4.
<5> [<e0f00760>] afs_brs [libafs-2.4.20-19.9-i686.mp]
<5> [<e0e87e9a>] BStore [libafs-2.4.20-19.9-i686.mp] 0
<5> [<e0eff72c>] afs_global_lock [libafs-2.4.20-19.9-i
<5> [<e0f00760>] afs_brs [libafs-2.4.20-19.9-i686.mp]
<5> [<e0eff72c>] afs_global_lock [libafs-2.4.20-19.9-i
<5> [<e0e884fe>] afs_BackgroundDaemon [libafs-2.4.20-1
<5> [<e0f00760>] afs_brs [libafs-2.4.20-19.9-i686.mp]
<5> [<e0e88739>] shutdown_daemons [libafs-2.4.20-19.9-
<5> [<e0ed955e>] afsd_thread [libafs-2.4.20-19.9-i686.
<5> [<e0eed1e3>] .rodata.str1.1 [libafs-2.4.20-19.9-i6
<5> [<e0ed92e0>] afsd_thread [libafs-2.4.20-19.9-i686.
<5> [<c010759d>] kernel_thread_helper [kernel] 0x5 (0x

<5> netdump-serve S C03C0280     0  1033      1
<5> Call Trace:   [<e0ed4a76>] afs_osi_Sleep [libafs-2
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0f009b2>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0e9015c>] Afs_Lock_Obtain [libafs-2.4.20-19.9-i
<5> [<e0f009b2>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0eaf62f>] afs_GetVolume [libafs-2.4.20-19.9-i68
<5> [<e0e8c5af>] afs_GetDCache [libafs-2.4.20-19.9-i68
<5> [<e0f009b0>] afs_xdcache [libafs-2.4.20-19.9-i686.
<5> [<e0e9ffc4>] afs_CopyOutAttrs [libafs-2.4.20-19.9-
<5> [<e0eaf62f>] afs_GetVolume [libafs-2.4.20-19.9-i68
<5> [<e0ed55ef>] vcache2inode [libafs-2.4.20-19.9-i686
<5> [<e0eaf62f>] afs_GetVolume [libafs-2.4.20-19.9-i68
<5> [<e0e9ffc4>] afs_CopyOutAttrs [libafs-2.4.20-19.9-
<5> [<e0e9f9ef>] afs_AccessOK [libafs-2.4.20-19.9-i686
<5> [<e0ea7639>] afs_lookup [libafs-2.4.20-19.9-i686.m
<5> [<e0e9fc19>] afs_access [libafs-2.4.20-19.9-i686.m
<5> [<e0ed2834>] crget [libafs-2.4.20-19.9-i686.mp] 0x
<5> [<e0ed2800>] crget [libafs-2.4.20-19.9-i686.mp] 0x
<5> [<e0ed2834>] crget [libafs-2.4.20-19.9-i686.mp] 0x
<5> [<e0ed7a7e>] afs_linux_lookup [libafs-2.4.20-19.9-
<5> [<c01606e2>] lookup_hash [kernel] 0xc2 (0xd9453f4c
<5> [<c01610f4>] lookup_create [kernel] 0x44 (0xd9453f
<5> [<c016172d>] sys_mkdir [kernel] 0x7d (0xd9453f7c))
<5> [<c01098cf>] system_call [kernel] 0x33 (0xd9453fc0