[OpenAFS-devel] OOPS of OpenAFS 1.4.4 on Linux 2.6.18
Chaskiel M Grundman
cg2v@andrew.cmu.edu
Mon, 14 May 2007 15:58:55 -0400
--On Monday, May 14, 2007 05:15:53 PM +0200 Erland Lewin <erland@lewin.nu>
wrote:
> sol kernel: kernel BUG at
> /usr/src/afs/openafs-1.4.4/src/libafs/MODLOAD-2.6.18-SP/afs_dcache.c:2395!
There was another report of a crash in that part of the code a few weeks
ago. Was your cache partition full?
Analysis and possible fixes:
The comment above this block says:
/* now, if code != 0, we have an error and should punt.
* note that we have the vcache write lock, either because
* !setLocks or slowPass.
*/
if (code) {
It turns out that this is not the case for a dynroot vcache, since the
dynroot codepath does not retry with slowPass=1 on error conditions, so
the osi_Assert(!setLocks || slowPass) in the error block can fire. I have
two possible fixes to propose. One runs dynroot fetches with
slowPass=1, since they don't have to wait for network I/O (and so won't be
holding the write lock across a "slow" network operation). The other
assumes that dynroot vcaches don't need all the same callback processing as
normal vcaches and can get away with not getting a write lock in the error
case.
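To make the failure mode concrete, here is a rough toy model of the lock
choice described above. It is a sketch only, not OpenAFS code: the names
model_error_path, check_error_path, error_path_state and apply_patch1 are
made up for illustration, and it assumes that the non-slowPass branch takes
a read lock and that a caller passing !setLocks already holds the write
lock (as the quoted comment implies).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in OpenAFS. */
enum lock_kind { LOCK_READ, LOCK_WRITE };

struct error_path_state {
    enum lock_kind held;  /* lock on the vcache when the error block runs */
    bool invariant_ok;    /* value of (!setLocks || slowPass) at that point */
};

struct error_path_state
model_error_path(bool set_locks, bool is_dynroot, bool apply_patch1)
{
    struct error_path_state st;
    bool slow_pass = false;

    /* Patch #1: dynroot fetches take the write lock up front. */
    if (set_locks && is_dynroot && apply_patch1)
        slow_pass = true;

    /* Normal vcaches retry with slowPass=1 on error; dynroot does not. */
    if (set_locks && !is_dynroot && !slow_pass)
        slow_pass = true;

    /* Assumption: with !setLocks the caller already holds the write lock,
     * and the non-slowPass branch takes only a read lock. */
    st.held = (!set_locks || slow_pass) ? LOCK_WRITE : LOCK_READ;
    st.invariant_ok = (!set_locks || slow_pass);
    return st;
}

/* Plays the role of osi_Assert(!setLocks || slowPass) in the error block. */
void
check_error_path(bool set_locks, bool is_dynroot, bool apply_patch1)
{
    assert(model_error_path(set_locks, is_dynroot, apply_patch1).invariant_ok);
}
```

In this model, check_error_path(true, true, false) trips the assert, which
corresponds to the reported BUG, while check_error_path(true, true, true)
passes because the first fix forces slowPass for dynroot.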
Patch #1:
--- src/afs/afs_dcache.c 2007-05-14 12:57:29.000000000 -0400
+++ src/afs/afs_dcache.c 2007-05-14 12:10:28.000000000 -0400
@@ -1545,6 +1545,8 @@
setNewCallback = setVcacheStatus = 0;
if (setLocks) {
+ if (afs_IsDynroot(avc))
+ slowPass = 1;
if (slowPass)
ObtainWriteLock(&avc->lock, 616);
else
Patch #2:
--- src/afs/afs_dcache.c 2007-05-14 12:57:29.000000000 -0400
+++ src/afs/afs_dcache.c 2007-05-14 15:50:02.000000000 -0400
@@ -2382,17 +2382,19 @@
}
ReleaseWriteLock(&tdc->lock);
afs_PutDCache(tdc);
- ObtainWriteLock(&afs_xcbhash, 454);
- afs_DequeueCallback(avc);
- avc->states &= ~(CStatd | CUnique);
- ReleaseWriteLock(&afs_xcbhash);
- if (avc->fid.Fid.Vnode & 1 || (vType(avc) == VDIR))
- osi_dnlc_purgedp(avc);
- /*
- * Locks held:
- * avc->lock(W); assert(!setLocks || slowPass)
- */
- osi_Assert(!setLocks || slowPass);
+ if (!afs_IsDynroot(avc)) {
+ ObtainWriteLock(&afs_xcbhash, 454);
+ afs_DequeueCallback(avc);
+ avc->states &= ~(CStatd | CUnique);
+ ReleaseWriteLock(&afs_xcbhash);
+ /*
+ * Locks held:
+ * avc->lock(W); assert(!setLocks || slowPass)
+ */
+ osi_Assert(!setLocks || slowPass);
+ if (avc->fid.Fid.Vnode & 1 || (vType(avc) == VDIR))
+ osi_dnlc_purgedp(avc);
+ }
tdc = NULL;
goto done;
}