[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above
Felix Frank
Felix.Frank@Desy.de
Wed, 27 May 2009 09:16:04 +0200
Felix Frank wrote (Thu May 07 2009 08:52:41 GMT+0200 (CEST))
>>> Secondly, we have another issue that occurs with mmap when the size
>>> of the file being mmap'd is larger than the cache size. This has also
>>> only been observed where an application does mmap, close, write. This
>>> problem is currently unfixed, but has only been observed with Linux
>>> kernels that don't have the BDI starvation fixes. Is that a valid
>>> summary?
>>
>> Exactly (almost), but for this to happen, the file need not even be closed
>> prior to writing.
>
> Attached is yet another version of the test program. It reproduces errors
> on Linux 2.6.18-128.1.6.el5xen with a 50 MB disk cache (the test file is
> 120 MB).
>
> It writes using mmap, then unmaps, then mmaps again to read the data back.
> When invoked with -a, the call
> posix_fadvise(fd, 0, SIZE, POSIX_FADV_DONTNEED) prior to remapping is
> suppressed. I guess that call tells the kernel to discard any VM pages that
> hold data from the file behind fd. So with -a, the program runs fine even
> in AFS, but that's cheating: running with -r afterwards reveals that some
> data was never written to the cache after all.
>
> So yes, with the mentioned kernel, data loss is possible with files larger
> than the cache (although a 100 MB file with a 50 MB cache seems to work).
> I'd be interested to know whether that's reproducible with newer kernels.
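
For readers without the attachment, the sequence described in the quoted
message (write through mmap, unmap, optionally posix_fadvise with
POSIX_FADV_DONTNEED, then mmap again and compare) can be sketched roughly
as below. This is not the attached test program: it is a minimal Python
sketch, and the function names, the per-page pattern, and the drop_cache
parameter (standing in for the absence of -a) are assumptions made for
illustration. It also assumes that no explicit msync is issued before
unmapping, which the message does not state either way.

```python
import mmap
import os

PAGE = mmap.PAGESIZE


def page_pattern(index):
    # Hypothetical pattern: each page is filled with its own index, so a
    # stale or zeroed page is detectable on read-back.
    return index.to_bytes(8, "little") * (PAGE // 8)


def write_then_verify(path, size, drop_cache=True):
    """Write `size` bytes via mmap, remap, return the count of bad pages."""
    if size % PAGE != 0:
        raise ValueError("size must be a multiple of the page size")
    pages = size // PAGE
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, size)

        # Step 1: fill the file through a shared writable mapping. Leaving
        # the with-block unmaps it (no explicit flush is requested here).
        with mmap.mmap(fd, size, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as m:
            for i in range(pages):
                m[i * PAGE:(i + 1) * PAGE] = page_pattern(i)

        # Step 2: ask the kernel to drop cached pages for this file, so the
        # second mapping has to be filled from the backing store.
        if drop_cache:
            os.posix_fadvise(fd, 0, size, os.POSIX_FADV_DONTNEED)

        # Step 3: map again read-only and compare page by page.
        with mmap.mmap(fd, size, mmap.MAP_SHARED, mmap.PROT_READ) as m:
            return sum(
                1 for i in range(pages)
                if m[i * PAGE:(i + 1) * PAGE] != page_pattern(i)
            )
    finally:
        os.close(fd)
```

To approximate the setup from the message, one would call
write_then_verify() on a path inside AFS with a size larger than the disk
cache (the message used a 120 MB file against a 50 MB cache). On a local
filesystem it should return 0.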
I finally got hold of a box that I could set up with a different
distro. I'm sad to report that I managed to reproduce the faulty
behaviour on Linux 2.6.29-4.slh.1-sidux-amd64.
Regards
- Felix