[OpenAFS] Re: (stackable file system) aufs stalls on some operations over
Fri, 20 Jun 2014 10:41:31 -0500
On Fri, 20 Jun 2014 15:55:05 +0200 (CEST)
Erik Braun <email@example.com> wrote:
> In short, the computer stalls on some file operations when stacking an
> aufs directory above an OpenAFS directory, no matter, if the OpenAFS
> is writable or not.
> The author of aufs sees the problem (specifically from his point of
> view as not an expert for OpenAFS) in lock operations in OpenAFS and
> suggested to ask the OpenAFS maintainers.
> For more information, please see the entire thread here:
[Moving to -devel, from -info]
To save anyone from having to read through all of that, here is a summary:
- aufs tries to copy a file from AFS into a local fs
- In order to prevent the source file from changing while being copied,
  aufs locks i_mutex on the file, and then calls dentry_open
- In OpenAFS, dentry_open goes afs_linux_open -> afs_open ->
  osi_FlushPages -> osi_VM_FlushPages -> afs_linux_lock_inode, which of
  course also locks i_mutex. So we deadlock.
And Erik posted an example strace here:
<http://users.minet.uni-jena.de/~erik/aufs/mv.log> with some kernel logs
and backtraces here:
That lock in osi_VM_FlushPages was added in
b0ed5a7facb1951f2f4ef8ed3da29a6a80cb7d49, but we're supposed to lock the
inode there; we can't just get rid of that. The aufs developer thinks
that us locking i_mutex in .open is unusual, and that we shouldn't do
that. I assume the other option is to flush pages in all of the other
relevant entry points (mmap, read, etc) instead, which it seems like we
already do. So maybe we do not need to FlushPages in .open on Linux? But
I'm not sure whether dropping it would change behavior or expose some
other problem.
I am also not sure why aufs doesn't just lock i_mutex after
dentry_open'ing; I don't immediately see what it would be protecting
before the dentry_open.
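
For illustration, here is a rough userspace sketch of the two lock
orderings discussed above. It is only an analogy, not kernel or aufs
code: a pthread error-checking mutex stands in for i_mutex, so that
relocking from the same thread returns EDEADLK instead of hanging the
way the real mutex does. All of the names (fake_inode, fake_fs_open,
copyup_lock_then_open, copyup_open_then_lock) are made up for this
example.

```c
/* Userspace analogy only; every name below is hypothetical. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct fake_inode {
    pthread_mutex_t i_mutex;    /* stand-in for inode->i_mutex */
};

static void fake_inode_init(struct fake_inode *ip)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* ERRORCHECK: a relock by the owning thread fails with EDEADLK
     * rather than blocking forever. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&ip->i_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Stand-in for a .open handler that flushes pages under the inode lock. */
static int fake_fs_open(struct fake_inode *ip)
{
    int rc = pthread_mutex_lock(&ip->i_mutex);

    if (rc != 0)
        return rc;              /* EDEADLK if the caller already holds it */
    /* ... page flush would happen here ... */
    pthread_mutex_unlock(&ip->i_mutex);
    return 0;
}

/* Ordering from the summary: take the lock, then open. */
static int copyup_lock_then_open(struct fake_inode *ip)
{
    int rc;

    pthread_mutex_lock(&ip->i_mutex);
    rc = fake_fs_open(ip);      /* tries to take i_mutex a second time */
    pthread_mutex_unlock(&ip->i_mutex);
    return rc;
}

/* Alternative ordering: open first, then lock for the copy itself. */
static int copyup_open_then_lock(struct fake_inode *ip)
{
    int rc = fake_fs_open(ip);

    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&ip->i_mutex);
    if (rc != 0)
        return rc;
    /* ... copy the file contents here ... */
    pthread_mutex_unlock(&ip->i_mutex);
    return 0;
}
```

After fake_inode_init(), copyup_lock_then_open() reports the self-deadlock
as EDEADLK, while copyup_open_then_lock() completes normally. Whether the
second ordering is actually safe for aufs depends on what it intends the
lock to protect before the open, which this sketch does not model.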