[OpenAFS-devel] OOPS of OpenAFS 1.4.4 on Linux 2.6.18
Erland Lewin
erland@lewin.nu
Mon, 14 May 2007 22:37:02 +0200
> --On Monday, May 14, 2007 05:15:53 PM +0200 Erland Lewin
> <erland@lewin.nu> wrote:
>> sol kernel: kernel BUG at
>> /usr/src/afs/openafs-1.4.4/src/libafs/MODLOAD-2.6.18-SP/afs_dcache.c:2395!
>>
Chaskiel M Grundman replied:
> There was another report of a crash in that part of the code a few
> weeks ago. Was your cache partition full?
Yes, very possible.
> Analysis and possible fixes:
>
> The comment above this block says:
>
>     /* now, if code != 0, we have an error and should punt.
>      * note that we have the vcache write lock, either because
>      * !setLocks or slowPass.
>      */
>     if (code) {
>
> It turns out that this is not the case for a dynroot vcache, since the
> dynroot codepath does not retry with slowPass=1 on error conditions. I
> have two possible fixes for this. The first runs dynroot fetches with
> slowPass=1, since they don't have to wait for network I/O (and so
> won't be holding the write lock across a "slow" network operation).
> The second assumes that dynroot vcaches don't need all the same
> callback processing as normal vcaches and can get away with not
> getting a write lock in the error case.
Ok. I applied fix #1 as a temporary fix. Thanks for the quick help.
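
For the archive, here is how I understand the problem and fix #1, as a
toy model only. This is not the code from afs_dcache.c and not the
actual patch: every name below (toy_vcache, toy_get_dcache, the
dynroot_slow_pass switch, the result codes) is made up for
illustration, and the model assumes the fetch fails the same way on
every pass.

```c
#include <assert.h>

/* Toy model only: none of these names exist in OpenAFS. */
enum lock_state { LOCK_NONE, LOCK_READ, LOCK_WRITE };
enum toy_result { TOY_OK, TOY_ERROR_HANDLED, TOY_BUG };

struct toy_vcache {
    enum lock_state lock;   /* lock currently held on the vcache */
    int is_dynroot;         /* 1 for a dynroot vcache */
};

/*
 * fetch_fails:       1 if the (simulated) fetch returns an error.
 * dynroot_slow_pass: 1 to model fix #1 (dynroot starts with slowPass=1).
 */
enum toy_result
toy_get_dcache(struct toy_vcache *avc, int fetch_fails, int dynroot_slow_pass)
{
    int slowPass = (avc->is_dynroot && dynroot_slow_pass) ? 1 : 0;
    int code;
    enum toy_result result;

    for (;;) {
        /* Fast pass holds a read lock; slow pass holds the write lock. */
        avc->lock = slowPass ? LOCK_WRITE : LOCK_READ;
        code = fetch_fails ? -1 : 0;

        /* The dynroot path never retries, which is the reported problem. */
        if (code == 0 || slowPass || avc->is_dynroot)
            break;

        avc->lock = LOCK_NONE;  /* drop the read lock before retrying */
        slowPass = 1;           /* normal vcache: retry with write lock */
    }

    if (code == 0)
        result = TOY_OK;
    else if (avc->lock == LOCK_WRITE)
        result = TOY_ERROR_HANDLED; /* error path's assumption holds */
    else
        result = TOY_BUG;           /* error path without the write lock */

    avc->lock = LOCK_NONE;
    return result;
}

/* Small usage example for the three interesting cases. */
void
toy_demo(void)
{
    struct toy_vcache normal = { LOCK_NONE, 0 };
    struct toy_vcache dynroot = { LOCK_NONE, 1 };

    assert(toy_get_dcache(&normal, 1, 0) == TOY_ERROR_HANDLED);
    assert(toy_get_dcache(&dynroot, 1, 0) == TOY_BUG);
    assert(toy_get_dcache(&dynroot, 1, 1) == TOY_ERROR_HANDLED);
}
```

In this model a failing fetch on a normal vcache reaches the error path
holding the write lock, a failing dynroot fetch reaches it holding only
the read lock, and starting dynroot fetches in the slow pass restores
the assumption stated in the comment.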
/Erland