[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above
Felix Frank
Felix.Frank@Desy.de
Wed, 15 Apr 2009 11:10:56 +0200 (CEST)
Dear all,
I managed to reproduce a cache inconsistency between two amd_rhel50 nodes
running kernel 2.6.18-128.1.6.el5, using the short program
/afs/ifh.de/user/f/ffrank/public/afs/misbehave.c.
The problem arises when an mmap'ed file is modified after it has been closed.
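
For readers without access to the path above, the access pattern in question
can be sketched roughly as follows. This is not misbehave.c itself but a
minimal illustration of "modify an mmap'ed file after close": the helper name
modify_after_close, the choice of MAP_SHARED, the single-byte write and the
msync() call are all assumptions made for the sketch.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from misbehave.c: map an existing,
 * non-empty file MAP_SHARED, close the descriptor, and only then
 * store `marker` into the first byte through the mapping.
 * Returns 0 on success, -1 on any failure. */
int modify_after_close(const char *path, char marker)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* The mapping stays valid after close(); from here on the file is
     * modified with no open descriptor referring to it. */
    if (close(fd) < 0) {
        munmap(map, (size_t)st.st_size);
        return -1;
    }

    map[0] = marker;

    /* Assumption: ask the kernel to write the dirty page back before
     * unmapping, so the change is not left to lazy writeback. */
    int rc = msync(map, (size_t)st.st_size, MS_SYNC);
    if (munmap(map, (size_t)st.st_size) < 0)
        rc = -1;
    return rc;
}
```

On a local file system the new byte is visible to any later reader. The
interesting case is calling such a helper on a file in AFS from one client and
then reading the file from a second client, or from the same client after a
cache manager restart.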
Cache problems are evident with client version 1.4.10 using disk cache.
Related tests suggest that the memory cache is affected as well, and that the
same holds true for 1.4.8.
For 1.4.7, only the memory cache appears to suffer from this problem.
What's more, this is not merely an inconsistency between different
machines: restarting the cache manager on the local host also makes the
file appear to be in an older state, even though the data version
(as reported by cmdebug -long) is correct.
Some time after the offending access, the kernel module issues a
WARNING: afs_ufswr vcp=... exOrW=0
I suspect that the problem has existed on Linux for longer (some code
comments hint at tricky Linux behaviour), but did not affect the disk
cache before 1.4.8. I will start digging through the code in that
direction and hopefully post something more definite to openafs-bugs soon.
Are there any further suggestions in the meantime?
Sincerely
Felix