[OpenAFS] Re: stackable file system above OpenAFS

Erik Braun erik.braun@uni-jena.de
Thu, 2 Oct 2014 15:41:32 +0200 (CEST)


On Thu, 25 Sep 2014, Andrew Deason wrote:

> It's been a while since I looked at this, so I may have things wrong;
> if things still hang, just give us a backtrace again.

Thank you for the patch – unfortunately it was not sufficient
(OpenAFS 1.6.9, Linux 3.16.3 as shipped with Debian Testing).
The corresponding backtrace is this:

[ 3601.900511] INFO: task kworker/1:1:58 blocked for more than 120 seconds.
[ 3601.900530]       Tainted: P         C O  3.16.3.linuxpool-jessie64 #0
[ 3601.900534] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3601.900538] kworker/1:1     D 0000000000000000  0    58      2 0x00000000
[ 3601.900557] Workqueue: aufsd wkq_func [aufs]
[ 3601.900563]  ffff88023507a190 0000000000000046 0000000000014680 ffff880235227fd8
[ 3601.900576]  0000000000014680 ffff88023507a190 ffff8801316fc4a8 ffff880235227980
[ 3601.900587]  ffff8801316fc4ac ffff88023507a190 00000000ffffffff ffff8801316fc4b0
[ 3601.900598] Call Trace:
[ 3601.900607]  [<ffffffff815a84e5>] ? schedule_preempt_disabled+0x25/0x70
[ 3601.900614]  [<ffffffff815aa343>] ? __mutex_lock_slowpath+0xd3/0x1f0
[ 3601.900620]  [<ffffffff815aa47b>] ? mutex_lock+0x1b/0x30
[ 3601.900644]  [<ffffffffa0073149>] ? osi_VM_FlushPages+0x19/0x40 [openafs]
[ 3601.900662]  [<ffffffffa003ef57>] ? osi_FlushPages+0x1e7/0x3b0 [openafs]
[ 3601.900674]  [<ffffffffa0076afd>] ? afs_linux_read_iter+0x9d/0x2b0 [openafs]
[ 3601.900681]  [<ffffffff81149656>] ? __alloc_pages_nodemask+0x176/0xc70
[ 3601.900689]  [<ffffffff811ad311>] ? new_sync_read+0x71/0xa0
[ 3601.900694]  [<ffffffff811ad9af>] ? vfs_read+0x8f/0x170
[ 3601.900703]  [<ffffffffa057d47f>] ? vfsub_read_u+0xf/0x30 [aufs]
[ 3601.900711]  [<ffffffffa057d4cf>] ? vfsub_read_k+0x2f/0x50 [aufs]
[ 3601.900719]  [<ffffffffa057e7e1>] ? au_copy_file+0x141/0x350 [aufs]
[ 3601.900785]  [<ffffffffa057eac0>] ? au_cp_regular+0xd0/0x160 [aufs]
[ 3601.900792]  [<ffffffffa057eb12>] ? au_cp_regular+0x122/0x160 [aufs]
[ 3601.900799]  [<ffffffffa057eae9>] ? au_cp_regular+0xf9/0x160 [aufs]
[ 3601.900807]  [<ffffffffa057f02a>] ? cpup_entry+0x4da/0x560 [aufs]
[ 3601.900814]  [<ffffffffa057f1fc>] ? au_cpup_single.constprop.11+0x14c/0x6d0 [aufs]
[ 3601.900822]  [<ffffffffa0584200>] ? au_lkup_dentry+0x4f0/0x520 [aufs]
[ 3601.900829]  [<ffffffff8101be85>] ? sched_clock+0x5/0x10
[ 3601.900836]  [<ffffffffa057fb6d>] ? au_cpup_simple+0x4d/0x80 [aufs]
[ 3601.900843]  [<ffffffffa057fc83>] ? au_call_cpup_simple+0x23/0x40 [aufs]
[ 3601.900850]  [<ffffffffa057c8b0>] ? wkq_func+0x10/0x50 [aufs]
[ 3601.900855]  [<ffffffff81081f74>] ? process_one_work+0x174/0x460
[ 3601.900859]  [<ffffffff810828cb>] ? worker_thread+0x6b/0x540
[ 3601.900864]  [<ffffffff81082860>] ? rescuer_thread+0x2d0/0x2d0
[ 3601.900870]  [<ffffffff81088b4f>] ? kthread+0xcf/0xf0
[ 3601.900876]  [<ffffffff81088a80>] ? kthread_create_on_node+0x180/0x180
[ 3601.900881]  [<ffffffff815abf3c>] ? ret_from_fork+0x7c/0xb0
[ 3601.900886]  [<ffffffff81088a80>] ? kthread_create_on_node+0x180/0x180
[ 3601.900910] INFO: task mv:1392 blocked for more than 120 seconds.
[ 3601.900914]       Tainted: P         C O  3.16.3.linuxpool-jessie64 #0
[ 3601.900917] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3601.900921] mv              D 0000000000000000  0  1392   1379 0x00000000
[ 3601.900931]  ffff8800b7185430 0000000000000086 0000000000014680 ffff88014da6bfd8
[ 3601.900944]  0000000000014680 ffff8800b7185430 ffff88014dbf8888 ffff88014da6bb68
[ 3601.900955]  ffff88014dbf8880 ffff8800b7185430 ffff88022a930918 ffff880235709600
[ 3601.900966] Call Trace:
[ 3601.900972]  [<ffffffff815a7499>] ? schedule_timeout+0x229/0x2a0
[ 3601.900978]  [<ffffffff815a7b9a>] ? __schedule+0x2ba/0x780
[ 3601.900984]  [<ffffffff815a8aa0>] ? wait_for_completion+0xb0/0x120
[ 3601.900989]  [<ffffffff81097970>] ? wake_up_state+0x10/0x10
[ 3601.900996]  [<ffffffffa057c9cc>] ? au_wkq_do_wait+0x7c/0xa0 [aufs]
[ 3601.901003]  [<ffffffffa057c8a0>] ? au_opt_udba+0x20/0x20 [aufs]
[ 3601.901010]  [<ffffffffa057fc60>] ? au_do_sio_cpup_simple+0xc0/0xc0 [aufs]
[ 3601.901018]  [<ffffffffa057fc24>] ? au_do_sio_cpup_simple+0x84/0xc0 [aufs]
[ 3601.901025]  [<ffffffffa05901ad>] ? aufs_rename+0x52d/0x1c40 [aufs]
[ 3601.901031]  [<ffffffff81070247>] ? ns_capable+0x27/0x60
[ 3601.901037]  [<ffffffff810702f3>] ? capable_wrt_inode_uidgid+0x53/0x60
[ 3601.901043]  [<ffffffff811b78e7>] ? generic_permission+0x107/0x1a0
[ 3601.901050]  [<ffffffff811bc062>] ? vfs_rename+0x4d2/0x7a0
[ 3601.901056]  [<ffffffff811bc829>] ? SYSC_renameat2+0x4f9/0x570
[ 3601.901061]  [<ffffffff811b27a3>] ? cp_new_stat+0x143/0x160
[ 3601.901066]  [<ffffffff811b281d>] ? SYSC_newlstat+0x1d/0x40
[ 3601.901071]  [<ffffffff815abfed>] ? system_call_fast_compare_end+0x10/0x15


When changing osi_VM_FlushPages() as follows, Aufs and OpenAFS seem to
run. I guess, that I don't need the locking, because the AFS volume is
accessed readonly and will certainly not change in my case.

# diff -u /usr/src/openafs-1.6.9/src/afs/LINUX/osi_vm.c~ /usr/src/openafs-1.6.9/src/afs/LINUX/osi_vm.c 
--- /usr/src/openafs-1.6.9/src/afs/LINUX/osi_vm.c~      2014-09-13 01:59:12.000000001 +0200
+++ /usr/src/openafs-1.6.9/src/afs/LINUX/osi_vm.c       2014-10-02 14:37:28.000000001 +0200
@@ -117,9 +117,9 @@
  {
       struct inode *ip = AFSTOV(avc);

-    afs_linux_lock_inode(ip);
+//    afs_linux_lock_inode(ip);
      truncate_inode_pages(&ip->i_data, 0);
-    afs_linux_unlock_inode(ip);
+//    afs_linux_unlock_inode(ip);
  }


Of course, I would prefer it, if everything runs out of the box.

with best regards,
Erik