[OpenAFS-devel] [PATCH] fix openafs crashes on linux 2.6.10-2.6.12, and all RHEL4 kernels
Chaskiel M Grundman
cg2v@andrew.cmu.edu
Wed, 18 Apr 2007 13:32:01 -0400
--On Wednesday, April 18, 2007 11:07:45 AM -0400 Christopher Allen Wing
<wingc@engin.umich.edu> wrote:
> GFP_NOFS tells the allocator not to recurse back into the filesystem if
> it's necessary to free up memory. However, vmalloc() does not have such
> an option. Therefore, calling osi_Alloc() to request more than a page of
> memory may end up recursing back into AFS to try to free unused inodes or
> dentries.
>
> In this case, what happened was that osi_Alloc() is called within an
> AFS_GLOCK(); osi_Alloc() calls vmalloc() which tries to free dentry
> objects, which then calls back into the AFS module. Unfortunately,
> AFS_GLOCK() is already held and we deadlock.
While your change (making osi_Alloc not run under the GLOCK) is completely
legitimate, your findings point to a problem in the linux_alloc
implementation itself. I would suggest that the following also be done
(separately from the link-fix patch):
In the vmalloc branch of LINUX/osi_alloc.c:linux_alloc, the code should
assert that it is never entered with (!drop_glock && haveGlock), and it
should drop the GLOCK around the vmalloc call if (drop_glock && haveGlock):
     } else {
+        osi_Assert(drop_glock || !haveGlock);
+        if (drop_glock && haveGlock)
+            AFS_GUNLOCK();
         new = (void *)vmalloc(asize);
+        if (drop_glock && haveGlock)
+            AFS_GLOCK();
         if (new)        /* piggy back alloc type */
             new = (void *)(VM_TYPE | (unsigned long)new);
     }
This change will not affect the current caller that sets drop_glock to 0,
since sizeof(afs_event_t) is nowhere near the PAGE_SIZE limit, so that
caller never reaches the vmalloc branch.
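For anyone who wants to poke at the locking logic outside the kernel, here
is a rough userspace sketch of the same pattern. It is not the OpenAFS code:
every sketch_* name is made up for illustration, a plain flag stands in for
the GLOCK, malloc() stands in for both allocators, and the 4096-byte
threshold is only an assumed stand-in for the PAGE_SIZE limit.

```c
/* Userspace sketch only. All sketch_* names are hypothetical and are not
 * part of OpenAFS. A flag models the global lock; malloc() models both the
 * small-allocation path and the vmalloc() path. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed stand-in for the PAGE_SIZE limit */

static int sketch_glock_held = 0;   /* 1 while the pretend global lock is held */
static int sketch_drops = 0;        /* how many times the lock was dropped */

static void sketch_lock(void)
{
    assert(!sketch_glock_held);     /* non-recursive, like a deadlocking GLOCK */
    sketch_glock_held = 1;
}

static void sketch_unlock(void)
{
    assert(sketch_glock_held);
    sketch_glock_held = 0;
}

static void *sketch_alloc(size_t asize, int drop_glock)
{
    int have_glock = sketch_glock_held;
    void *p;

    if (asize <= SKETCH_PAGE_SIZE)
        return malloc(asize);       /* small path: lock state is left alone */

    /* Large path: the allocator may call back into the filesystem, so
     * holding the lock without permission to drop it is treated as a bug. */
    assert(drop_glock || !have_glock);
    if (drop_glock && have_glock) {
        sketch_unlock();
        sketch_drops++;
    }
    p = malloc(asize);
    if (drop_glock && have_glock)
        sketch_lock();              /* caller gets the lock back as it left it */
    return p;
}
```

A caller holding the lock that asks for two pages with drop_glock set to 1
gets the lock released around the allocation and retaken afterwards; a
small request with drop_glock set to 0 never touches the lock, which mirrors
the afs_event_t case above.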