[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Derrick Brashear shadow@gmail.com
Wed, 15 Apr 2009 09:07:42 -0400


On Wed, Apr 15, 2009 at 5:44 AM, Felix Frank <Felix.Frank@desy.de> wrote:
> On a hunch, I applied this to 1.4.8:
>
> --- src/afs/LINUX/osi_vm.c.orig 2009-04-15 11:37:49.000000000 +0200
> +++ src/afs/LINUX/osi_vm.c      2009-04-15 11:38:56.000000000 +0200
> @@ -102,11 +102,6 @@ osi_VM_StoreAllSegments(struct vcache *a
>  {
>     struct inode *ip = AFSTOV(avc);
>
> -    if (!avc->states & CPageWrite)
> -       avc->states |= CPageWrite;
> -    else
> -       return; /* someone already writing */
> -
>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,4,5)
>     /* filemap_fdatasync() only exported in 2.4.5 and above */
>     ReleaseWriteLock(&avc->lock);
> @@ -120,7 +115,6 @@ osi_VM_StoreAllSegments(struct vcache *a
>     AFS_GLOCK();
>     ObtainWriteLock(&avc->lock, 121);
>  #endif
> -    avc->states &= ~CPageWrite;
>  }
>
>  /* Purge VM for a file when its callback is revoked.
>
>
> This apparently solved the problem for 1.4.8 w/ disk cache. Will try 1.4.10
> as well. BCC'ing openafs-bugs now.
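
A side note on the guard that the patch removes: in C, unary `!` binds
tighter than binary `&`, so `!avc->states & CPageWrite` parses as
`(!avc->states) & CPageWrite`, not `!(avc->states & CPageWrite)`. The
sketch below only illustrates that parse. The flag value and the function
names are made up for the example (the real CPageWrite definition is not
shown in this thread), so treat it as an assumption, not as a statement
about the actual header.

```c
#include <assert.h>

/* Hypothetical flag value for illustration only; the real CPageWrite
 * definition is not shown in this thread. */
#define DEMO_CPAGEWRITE 0x200u

/* The guard as written in the removed lines: parses as (!states) & flag. */
static int guard_as_written(unsigned int states)
{
    return (!states & DEMO_CPAGEWRITE) != 0;
}

/* The parenthesised form: true when the flag is not yet set. */
static int guard_as_intended(unsigned int states)
{
    return !(states & DEMO_CPAGEWRITE);
}

/* Self-check documenting the difference between the two forms. */
static void guard_demo(void)
{
    /* !states is 0 or 1, so masking it with 0x200 always yields 0. */
    assert(guard_as_written(0u) == 0);
    assert(guard_as_written(0x1u) == 0);
    /* The parenthesised form is true exactly when the flag is clear. */
    assert(guard_as_intended(0u) == 1);
    assert(guard_as_intended(DEMO_CPAGEWRITE) == 0);
}
```

With a flag value like the assumed one (any bit other than bit 0), the
guard as written never evaluates true, so the function would take the
`else` branch and return every time. Whether that matches the real flag
value would need checking against the header.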

Without that guard, the problem is the deadlock described in RT 120491,
which means one of the two (that deadlock or this cache inconsistency)
needs to be solved in another way.

Looking through my local pile of things to deal with, I see Chaskiel
commented thus:
"What's there seems like it will prevent recursion, but in a silly
way. The whole point of calling filemap_fdatawrite is for the kernel
to call writepage() on all the dirty pages. But since
osi_VM_StoreAllSegments always sets CPageWrite and CPageWrite means
writepage always returns WRITEPAGE_ACTIVATE, there's no point.
Wouldn't it be better for a DoPartialWrite-driven StoreAllSegments to
not call osi_VM_StoreAllSegments (and restore the latter to
usefulness)?"
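
To make that argument concrete, here is a toy model of it. Nothing in it
is OpenAFS or kernel code: the struct, the flag value, the return codes
and every `demo_` name are hypothetical stand-ins. It only shows why a
flush that sets the "already writing" flag before asking for the dirty
pages ends up writing none of them.

```c
/* Hypothetical stand-ins; none of these are the real definitions. */
#define DEMO_CPAGEWRITE 0x200u
enum { DEMO_WROTE = 0, DEMO_ACTIVATE = 1 }; /* models WRITEPAGE_ACTIVATE */

struct demo_vcache {
    unsigned int states;  /* flag word, models avc->states */
    int dirty_pages;      /* number of pages still dirty */
};

/* Models writepage(): declines to write while the flag is set. */
static int demo_writepage(struct demo_vcache *vc)
{
    if (vc->states & DEMO_CPAGEWRITE)
        return DEMO_ACTIVATE;
    vc->dirty_pages--;
    return DEMO_WROTE;
}

/* Models the flush: offers every dirty page to writepage once. */
static int demo_flush(struct demo_vcache *vc)
{
    int written = 0;
    int n = vc->dirty_pages;
    int i;

    for (i = 0; i < n; i++) {
        if (demo_writepage(vc) == DEMO_WROTE)
            written++;
    }
    return written;
}

/* Models the store path; set_flag selects whether the flag is raised
 * around the flush. Returns the number of pages actually written. */
static int demo_store_all_segments(struct demo_vcache *vc, int set_flag)
{
    int written;

    if (set_flag)
        vc->states |= DEMO_CPAGEWRITE;
    written = demo_flush(vc);
    vc->states &= ~DEMO_CPAGEWRITE;
    return written;
}
```

In this model, calling `demo_store_all_segments(&vc, 1)` on a vcache with
three dirty pages writes nothing and leaves all three dirty, while
calling it with `set_flag` of 0 writes all three. That is the "no point"
case in the quote; it says nothing about how the real deadlock should be
avoided.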

If you want to and can look, please do; otherwise I will as soon as I can.

Derrick