[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Simon Wilkinson sxw@inf.ed.ac.uk
Wed, 6 May 2009 20:40:04 +0100


On 5 May 2009, at 13:43, Felix Frank wrote:
>
> The patches in RT are just variations on the theme of
> linux-mmap-antirecursion-20081020. They prevent deadlock at the risk of
> data loss. The fixes in RT solve a cache inconsistency, but data
> corruption is still possible.

Just trying to clarify where we're at with this problem, as I know  
that there are people who get worried whenever they hear the words  
"data loss" (and I'm one of them!)

My understanding is that one class of problems is solved by fixing  
linux-mmap-antirecursion-20081020 with the latest patch in RT. This  
solves the deadlock, and removes one set of write corruption issues.  
So far this corruption has only been observed with applications that  
mmap a file, close it, and then write to the mmap'd chunk. Does this  
match with your testing?
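
For anyone unsure which access pattern is meant, "mmap a file, close it,
and then write to the mmap'd chunk" can be sketched roughly as below. This
is only an illustration of the pattern, not a reproducer taken from RT: the
function name mmap_close_write and the choice to write at offset 0 are
assumptions made for the example, and it uses nothing beyond the standard
POSIX calls.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `path` shared and writable, close the file
 * descriptor, and only then store `len` bytes of `data` at the start of
 * the mapping. Returns 0 on success, -1 on failure.
 */
static int mmap_close_write(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < len) {
        close(fd);
        return -1;
    }

    size_t map_len = (size_t)st.st_size;
    char *map = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The descriptor is closed before any store; the mapping stays valid. */
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    /* Dirty the pages after close(); writeback now happens without an
     * open descriptor held by the application. */
    memcpy(map, data, len);

    /* Ask for the dirty pages to be written back, then drop the mapping. */
    int rc = msync(map, map_len, MS_SYNC);
    if (munmap(map, map_len) < 0)
        rc = -1;
    return rc;
}
```

On a local filesystem the file should simply contain the new bytes
afterwards; the question in this thread is what the AFS client does with
pages dirtied after the close.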

Secondly, we have another issue that occurs with mmap when the size of
the file being mmap'd is larger than the cache size. This has also
only been observed where an application does mmap, close, write. This  
problem is currently unfixed, but has only been observed with Linux  
kernels that don't have the BDI starvation fixes. Is that a valid  
summary?

Thanks,

Simon.