[OpenAFS] OpenAFS client softlockup on highly concurrent file-system patterns (100% CPU in kernel mode)

Ciprian Dorin Craciun ciprian.craciun@gmail.com
Tue, 19 Nov 2019 22:38:04 +0200


On Tue, Nov 19, 2019 at 5:10 PM Ciprian Dorin Craciun
<ciprian.craciun@gmail.com> wrote:
> > # echo t > /proc/sysrq-trigger


At the following link you can find an extract of the `dmesg` output
after the sysrq trigger:

  https://scratchpad.volution.ro/ciprian/f89fc32a0bbd0ae6d6f3edbbc3ee111c/b9c3bc4f795bbe9e7eaca93b0a57bea0.txt

(I have filtered out the processes that don't have `afs` in their
name, mainly because the full dump exposes all of my workstation's
processes.  However, I can provide the complete file privately.)
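
For reference, the following is a rough sketch of how such a
filtering can be done (an approximation, not the exact commands I
ran;  the task-header regex simply mirrors the sysrq-t block format
visible in the traces below):

~~~~
# run as root:  dump all task states to the kernel log;
echo t > /proc/sysrq-trigger

# `dmesg --notime` drops the timestamps so that the task headers
# start at column zero;  keep only the blocks whose task name
# (the first field of the header) contains `afs`;
dmesg --notime | awk '
    # a new block starts at a task header such as:
    #   gm              S    0 27572  27562 0x80000000
    /^[^ ]+ +[A-Z] +[0-9]+ +[0-9]+ +[0-9]+ +0x[0-9a-f]+/ {
        keep = ($1 ~ /afs/);
    }
    keep { print; }
'
~~~~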




The following is the stack trace of the process which gets stuck (it
took almost 25 minutes to complete, and the hang is not tied to any
particular input file):

~~~~
gm              S    0 27572  27562 0x80000000
Call Trace:
 ? __schedule+0x2be/0x6d0
 schedule+0x39/0xa0
 afs_cv_wait+0x10a/0x300 [libafs]
 ? wake_up_q+0x60/0x60
 rxi_WriteProc+0x21d/0x410 [libafs]
 ? rxfs_storeUfsWrite+0x55/0xb0 [libafs]
 ? afs_GenericStoreProc+0x11a/0x1f0 [libafs]
 ? afs_CacheStoreDCaches+0x1a9/0x5b0 [libafs]
 ? afs_CacheStoreVCache+0x32c/0x680 [libafs]
 ? __filemap_fdatawrite_range+0xca/0x100
 ? afs_osi_Wakeup+0xb/0x60 [libafs]
 ? afs_UFSGetDSlot+0xf6/0x4f0 [libafs]
 ? afs_StoreAllSegments+0x725/0xc20 [libafs]
 ? afs_linux_flush+0x486/0x4e0 [libafs]
 ? filp_close+0x32/0x70
 ? __x64_sys_close+0x1e/0x50
 ? do_syscall_64+0x6e/0x200
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
~~~~


On a second try (which also locked up), the following is the stack
trace (only for the blocked process);  the two traces look almost
identical:

~~~~
gm              S    0 30548  30545 0x80004000
Call Trace:
 ? __schedule+0x2be/0x6d0
 schedule+0x39/0xa0
 afs_cv_wait+0x10a/0x300 [libafs]
 ? wake_up_q+0x60/0x60
 rxi_WriteProc+0x21d/0x410 [libafs]
 ? rxfs_storeUfsWrite+0x55/0xb0 [libafs]
 ? afs_GenericStoreProc+0x11a/0x1f0 [libafs]
 ? afs_CacheStoreDCaches+0x1a9/0x5b0 [libafs]
 ? afs_CacheStoreVCache+0x32c/0x680 [libafs]
 ? __filemap_fdatawrite_range+0xca/0x100
 ? afs_osi_Wakeup+0xb/0x60 [libafs]
 ? afs_UFSGetDSlot+0xf6/0x4f0 [libafs]
 ? afs_StoreAllSegments+0x725/0xc20 [libafs]
 ? afs_linux_flush+0x486/0x4e0 [libafs]
 ? filp_close+0x32/0x70
 ? __x64_sys_close+0x1e/0x50
 ? do_syscall_64+0x6e/0x200
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
~~~~


I can reliably trigger the issue (almost 50% of the time) by just
doing the following (a rough sketch follows after this list):
* remove a few output files (in my case ~15), which should trigger
the rebuild of roughly twice as many targets;
* start the build with a maximum concurrency of 8 processes;
* all the processes execute similar jobs, with similarly sized
inputs and outputs, and similar CPU time.
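
A rough sketch of the above (the paths and file names are
placeholders, not my actual ones;  the outputs live on an AFS
volume):

~~~~
cd /afs/example.com/some-volume/build   # hypothetical AFS path
rm ./outputs/some-{01..15}.png          # remove ~15 generated files
ninja -j8                               # rebuild with at most 8 concurrent jobs
~~~~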


Based on `htop`, I would say that neither `ninja` (which does the
heavy `stat`-ing) nor `gm` (GraphicsMagick, an ImageMagick
alternative) is multi-threaded.

The build procedure involves the following AFS-related operations
(sketched after the list):
* check if the output exists, and if so `rm` it;
* create an `output.tmp` file;
* rename (`mv`) the `output.tmp` to `output`;
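
Roughly, each job does the following (the `gm` invocation is a
hypothetical example;  only the check / remove / create / rename
pattern matters):

~~~~
if test -e ./output ; then
    rm ./output
fi
gm convert ./input -resize 50% ./output.tmp
mv ./output.tmp ./output
~~~~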

No other processes are actively using AFS (except `mc` and a couple
of `bash` instances which have their `cwd` inside an AFS volume).
(The `[nodaemon]` process is a simple tool that uses `prctl
(PR_SET_CHILD_SUBREAPER)` to catch double-forking processes, and it
also has its `cwd` inside AFS.)

Hope it helps,
Ciprian.