[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Felix Frank Felix.Frank@Desy.de
Wed, 15 Apr 2009 11:10:56 +0200 (CEST)


Dear all,

I managed to reproduce a cache inconsistency between two amd_rhel50 nodes 
running kernel 2.6.18-128.1.6.el5, using the short program 
/afs/ifh.de/user/f/ffrank/public/afs/misbehave.c.
The problem arises when a mmap'ed file is modified after it has been closed.
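
For readers without access to that path, the access pattern can be sketched 
roughly as follows. This is not misbehave.c itself, only a minimal 
illustration under my own assumptions: the helper name modify_after_close, 
its parameters and the use of msync() are hypothetical and chosen for the 
example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from misbehave.c): map a file shared,
 * close the descriptor, and only then modify the file via the mapping.
 * Returns 0 on success, -1 on failure. */
int modify_after_close(const char *path, const char *text, size_t len)
{
    assert(path != NULL && text != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by file data. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Close first; the mapping remains valid after close(). */
    if (close(fd) != 0) {
        munmap(map, len);
        return -1;
    }

    /* The "offending access": change the file after it was closed. */
    memcpy(map, text, len);

    /* Ask the kernel to write the dirty pages back, then unmap. */
    int rc = msync(map, len, MS_SYNC);
    if (munmap(map, len) != 0)
        rc = -1;
    return rc;
}
```

Calling modify_after_close() on a file in AFS from one node and then reading 
the file from a second node should show whether the two clients agree on 
the contents.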

Cache problems are evident with client version 1.4.10 using disk cache.
Related tests suggest that the memory cache is affected as well, and that 
the same holds true for 1.4.8.
With 1.4.7, only the memory cache appears to suffer from this problem.

What's more, this is not merely an inconsistency between different 
machines: restarting the cache manager on the local host also makes the 
file appear in an older state, even though the data version
(as reported by cmdebug -long) is correct.

Some time after the offending access, the kernel module logs this warning:
WARNING: afs_ufswr vcp=... exOrW=0

I suspect that the problem has existed on Linux for longer (some code 
comments hint at tricky Linux behaviour), but did not affect the disk 
cache before 1.4.8. I will start digging through the code in that 
direction and hope to post something more definite to openafs-bugs soon.

Are there any further suggestions in the meantime?

Sincerely
Felix