[OpenAFS-devel] Re: OpenAFS kernel panic (again)

chas williams (contractor) chas@cmf.nrl.navy.mil
Thu, 29 Jul 2004 16:38:07 -0400


In message <200407290902.49762.j.pilawa@tu-bs.de>,Jan-Marc Pilawa writes:
>Chas Williams worked on that Problem and we think, that we have found the bug.

for the curious, i believe the bug is a race in afs_remunlink().
while it runs, afs_remunlink() raises the refcount of the inode that is
about to go away.  at some point afs_remunlink() is done with its
work and calls afs_PutVCache(), which does an AFS_GUNLOCK() and
decrements the refcount.  i believe this races against afs_lookup():
it seems to me that afs_lookup() could find a vnode with a positive
refcount that is just about to drop to 0.
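
to make the suspected interleaving concrete, here is a small userspace
sketch.  it is not openafs code: struct toy_vnode and every toy_*
function are hypothetical stand-ins, and it assumes (unconfirmed) that
the lookup side checks the refcount and takes its own reference in two
separate steps.  it simply replays the bad ordering in a single thread.

```c
#include <assert.h>

/* Hypothetical stand-in for a vcache/inode; not the real OpenAFS struct. */
struct toy_vnode {
    int refcount;   /* number of current holders */
    int freed;      /* set once the last reference is dropped */
};

/* Models the final put in the unlink path: drop a ref, recycle at zero. */
static void toy_put(struct toy_vnode *v)
{
    assert(v->refcount > 0);
    if (--v->refcount == 0)
        v->freed = 1;
}

/* First half of the toy lookup: look at the vnode without holding it. */
static int toy_lookup_check(const struct toy_vnode *v)
{
    return !v->freed && v->refcount > 0;
}

/* Second half of the toy lookup: take a reference. */
static void toy_lookup_take(struct toy_vnode *v)
{
    v->refcount++;
}

/*
 * Replays the suspected ordering.  Returns 1 if the lookup ends up
 * holding a reference to a vnode that has already been freed.
 */
static int toy_race_demo(void)
{
    struct toy_vnode v = { 1, 0 };          /* unlink path holds the only ref */
    int looked_ok = toy_lookup_check(&v);   /* lookup sees refcount == 1 */

    toy_put(&v);                            /* unlink path finishes: 1 -> 0, freed */
    if (looked_ok)
        toy_lookup_take(&v);                /* lookup acts on its stale check */

    return v.freed && v.refcount > 0;
}
```

calling toy_race_demo() from a main() returns 1, i.e. the lookup side is
left holding a vnode that was already torn down.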

i believe the right fix is to block linux lookups (cached or otherwise)
during the afs_remunlink() operation to prevent this.  i have tried
adding locking to osi_iput() but with little success, which tells me
that either afs_remunlink() is being called from somewhere else or i
don't know what should block linux lookup operations.
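
as a rough illustration of what "block lookups during the unlink" could
mean, here is another userspace sketch.  again, nothing here is openafs
or linux api: struct guarded_vnode, the unlinking flag and the
guarded_* functions are all hypothetical, and a real fix would need a
proper kernel lock rather than a plain flag.  the toy lookup just
refuses to proceed while the unlink is in flight.

```c
#include <assert.h>

/* Hypothetical vnode with a marker set while the toy unlink is running. */
struct guarded_vnode {
    int refcount;
    int freed;
    int unlinking;  /* nonzero between guarded_unlink_begin() and _end() */
};

/* Start of the toy unlink: mark the vnode busy and take a reference. */
static void guarded_unlink_begin(struct guarded_vnode *v)
{
    v->unlinking = 1;
    v->refcount++;
}

/* End of the toy unlink: drop the reference, free at zero, clear marker. */
static void guarded_unlink_end(struct guarded_vnode *v)
{
    assert(v->refcount > 0);
    if (--v->refcount == 0)
        v->freed = 1;
    v->unlinking = 0;
}

/*
 * Toy lookup.  Returns 1 and takes a reference on success; returns 0 if
 * the caller would have to wait (unlink in flight) or the vnode is gone.
 */
static int guarded_lookup(struct guarded_vnode *v)
{
    if (v->unlinking || v->freed)
        return 0;
    v->refcount++;
    return 1;
}

/* Returns 1 if no lookup ever obtained the vnode being unlinked. */
static int guarded_demo(void)
{
    struct guarded_vnode v = { 0, 0, 0 };
    int during, after;

    guarded_unlink_begin(&v);       /* refcount 0 -> 1, marked busy */
    during = guarded_lookup(&v);    /* refused: unlink in flight */
    guarded_unlink_end(&v);         /* refcount 1 -> 0, freed */
    after = guarded_lookup(&v);     /* refused: vnode is gone */

    return during == 0 && after == 0 && v.refcount == 0;
}
```

guarded_demo() returns 1 here: with the marker in place the lookup never
gets a reference to the dying vnode, which is the behaviour the proposed
fix is after.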

naturally comments from others are always interesting.