[OpenAFS-devel] Re: OpenAFS client crashes on RHEL 5.10 and RHEL 6.5

Wed, 19 Mar 2014 12:13:24 -0400

On Wed, Mar 19, 2014 at 10:46:15AM -0500, Andrew Deason wrote:
> On Tue, 18 Mar 2014 16:01:26 -0400
> "J. Bruce Fields" <bfields@redhat.com> wrote:
> 
> > > CC'ing J. Bruce Fields, if he has anything to say about this.
> > > d_materialise_unique still seems like it would be problematic to me,
> > > as I mentioned earlier, since using it can result in vfs_rmdir'ing a
> > > dentry where our parent inode does not match the dentry's
> > > d_parent->d_inode, which can panic may_delete. But it's very
> > > possible I am missing something here.
> > 
> > Both the d_move case of d_materialise_unique and the vfs_rmdir take the
> > i_mutex on the parent directories involved, so they'll be serialised
> > with respect to each other and I don't think you should see that panic.
> 
> Yes, I see.
> 
> > The one nit is that d_materialise_unique depends on a trylock for that
> > so whoever does the lookup can get a spurious -EBUSY if they're unlucky.
> 
> Via __d_unalias, yeah, that looks like it would make it unsuitable. But
> the restrictions on directory loops maybe makes this unsuitable anyway.
> 
> Is the reasoning for the trylocks just lock ordering?

Yes, if we waited for those locks at this point in the code, we could
deadlock, since the lock-holder we're waiting on might be waiting on the
i_mutex of the parent directory, which we already hold.  I haven't
thought very hard about how to fix that yet.

> It seems odd to me that lookups for anything using
> d_materialise_unique can seem to just randomly fail.

Agreed, I think this hasn't been fixed just because it's very rarely hit
in practice (in the absence of multiply-linked directories like you
have).  We can reproduce the failure on NFS with artificial testing.  It
requires an uncached lookup to happen at the same time as a
cross-directory rename from another client.

> > But it sounds like you can avoid these multiply-linked directories by
> > creating mountpoints at the appropriate places, as the NFS client
> > does.
> 
> We are also pursuing this approach (again) ever since the recent RHEL
> "regression" was reported. This is very difficult to implement while
> avoiding GPLONLY symbols, though, as we must perform all of the mounts
> and umounts from userspace. It seems possible, but it just takes a lot
> of work to avoid numerous issues, and will take some time.
> 
> I don't suppose there's any way the GPLONLY restriction on those
> interfaces could change? Or do you have any other suggestions or
> anything as to an approach we should be taking?

I don't know really know that code, apologies.  My only suggestion would
be to look at the in-kernel NFS and AFS code and see how it does this.

As for GPL_ONLY symbols I think it's certainly worth compiling a list
and asking.  Obviously the kernel community (myself included!) would be
much happier to see effort invested on upstream code.  But obvious
oversights (d_materialise_unique looks like one) do get fixed.

--b.