[OpenAFS-devel] Re: [OpenAFS] Linux kernel panic, OpenAFS client, gconf

chas williams (contractor) chas@cmf.nrl.navy.mil
Mon, 21 Jun 2004 12:30:19 -0400


[i am moving this discussion to -devel]

looking at this gconf crashing problem i see that
afs_linux_dentry_revlidate() needs to hold the big kernel lock (BKL).
however if the dentry is bad, during the shrink_dcache_parent()/d_drop()
operation we might clear the inode associated with the dentry.
this calls afs_dentry_iput() which calls osi_iput() which will try to
grab AFS_GLOCK.

well that's bad.  you could drop the AFS_GLOCK before dropping the BKL so
that everything is fine but that is probably a poor idea since you could
be change lock ordering.  after looking at this some more, i am fairly
convinced afs is getting the locks in the wrong order in the first place.

anyway, in fs/namei.c:real_lookup() we have:

                        lock_kernel();
                        result = dir->i_op->lookup(dir, dentry);
                        unlock_kernel();

and lookup() is afs_linux_lookup() which grabs AFS_GLOCK.  so its
seems clear the right order might be lock_kernel(); AFS_GLOCK(); and not
what we currently have.  if we make this change then dentry_revalidate()
is cleanly fixed by dropping AFS_GLOCK before calling shrink/d_drop
but before dropping the BKL.

comments?

[on a historical note i probably added most of the lock_kernel()'s anyway
so its my fault really.]