[OpenAFS-devel] OOPS of OpenAFS 1.4.4 on Linux 2.6.18

Erland Lewin erland@lewin.nu
Mon, 14 May 2007 22:37:02 +0200


> --On Monday, May 14, 2007 05:15:53 PM +0200 Erland Lewin 
> <erland@lewin.nu> wrote:
>> sol kernel: kernel BUG at
>> /usr/src/afs/openafs-1.4.4/src/libafs/MODLOAD-2.6.18-SP/afs_dcache.c:2395! 
>>
Chaskiel M Grundman replied:
> There was another report of a crash in that part of the code a few 
> weeks ago. Was your cache partition full?
Yes, very possible.
> analysis and possible fixes:
>
> The comment above this block says:
>
>        /* now, if code != 0, we have an error and should punt.
>         * note that we have the vcache write lock, either because
>         * !setLocks or slowPass.
>         */
>        if (code) {
>
> It turns out that this is not the case for a dynroot vcache, since the 
> dynroot codepath does not retry with slowPass=1 on error conditions. I 
> have two possible fixes for this. One runs dynroot fetches 
> with slowPass=1, since they don't have to wait for network I/O (and so 
> won't be holding the write lock across a "slow" network operation). 
> The other assumes that dynroot vcaches don't need all the same 
> callback processing as normal vcaches and can get away with not 
> getting a write lock in the error case.
Ok. I applied fix #1 as a temporary fix. Thanks for the quick help.
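
For anyone following the analysis quoted above, here is a minimal user-space sketch in C of why the error path can be reached without the write lock for a dynroot vcache, and how fix #1 changes that. It is only a toy model, not OpenAFS code: the names toy_vcache, toy_fetch and toy_get_dcache are hypothetical, and the model assumes only that a fast pass holds a read lock while a slow pass (slowPass=1) holds the write lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not OpenAFS types. */
enum toy_lock { TOY_LOCK_NONE, TOY_LOCK_READ, TOY_LOCK_WRITE };

struct toy_vcache {
    bool is_dynroot;     /* true for the dynroot vcache */
    enum toy_lock lock;  /* lock held when the fetch returns */
};

/* Model one fetch attempt: a slow pass takes the write lock, a fast
 * pass only the read lock. Returns 0 on success, nonzero on error. */
static int toy_fetch(struct toy_vcache *vc, bool slow_pass, bool fail)
{
    assert(vc != NULL);
    vc->lock = slow_pass ? TOY_LOCK_WRITE : TOY_LOCK_READ;
    return fail ? -1 : 0;
}

/* Returns true if the call either succeeds or reaches the error path
 * while holding the write lock, which is what the quoted comment
 * assumes. With fix1 set, dynroot fetches start with slow_pass = true. */
static bool toy_get_dcache(struct toy_vcache *vc, bool fix1, bool fail)
{
    bool slow_pass = fix1 && vc->is_dynroot;
    int code = toy_fetch(vc, slow_pass, fail);

    /* Normal vcaches retry the failed fast pass as a slow pass;
     * in this model the dynroot path does not. */
    if (code && !slow_pass && !vc->is_dynroot) {
        slow_pass = true;
        code = toy_fetch(vc, slow_pass, fail);
    }

    if (code) {
        /* Error path: the assumption holds only with the write lock. */
        return vc->lock == TOY_LOCK_WRITE;
    }
    return true;
}
```

In this model a failing fetch on a normal vcache reaches the error path with the write lock, a failing dynroot fetch without fix #1 reaches it with only the read lock, and with fix #1 the dynroot fetch holds the write lock from the start.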

/Erland