[OpenAFS-devel] patch against deadlock
Hartmut Reuter
reuter@rzg.mpg.de
Wed, 22 Feb 2006 12:29:58 +0100
I saw a deadlock between an "ls" and an "rm" command on our Regatta AIX
5.2 system, which I was able to analyze using kdb.
It turned out that the lock order was violated in afs_vnop_remove.c:
"rm" held a dcache lock when trying to obtain a vcache lock.
"ls" held a read lock on the vcache and tried to get the dcache lock.
--- afs_vnop_remove.c.orig 2005-05-30 06:05:44.000000000 +0200
+++ afs_vnop_remove.c 2006-02-21 16:05:06.000000000 +0100
@@ -349,6 +349,8 @@
     if (tvc && osi_Active(tvc)) {
         /* about to delete whole file, prefetch it first */
         ReleaseWriteLock(&adp->lock);
+        if (tdc)
+            ReleaseSharedLock(&tdc->lock);
         ObtainWriteLock(&tvc->lock, 143);
 #if defined(AFS_OSF_ENV)
         afs_Wire(tvc, &treq);
@@ -357,6 +359,8 @@
 #endif
         ReleaseWriteLock(&tvc->lock);
         ObtainWriteLock(&adp->lock, 144);
+        if (tdc)
+            ObtainSharedLock(&tdc->lock, 1638);
     }
     osi_dnlc_remove(adp, aname, tvc);
This diff applies to OpenAFS 1.4.0. The lock number 1638 is meant as a
reminder of the number 638, under which the lock was originally obtained.
Hartmut
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
RZG (Rechenzentrum Garching) fax +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------