[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Felix Frank Felix.Frank@Desy.de
Tue, 5 May 2009 08:04:27 +0200 (CEST)


On Mon, 4 May 2009, Marc Dionne wrote:

>> Traces of the usual deadlocked suspects are attached. At that point, just
>> about any process can deadlock, I suppose. Apparently, the system ceases to
>> balance dirty pages (which appears plausible to me, but I have no experience
>> with virtual memory implementations whatsoever).
>
> Ok this brought back some memories... I think you're seeing a problem
> with older kernels that was addressed by Peter Zijlstra's "per BDI
> dirty threshold" patch set in kernel 2.6.24:
>    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=04fbfdc14e5f4
>
> Note the mention of "deadlocks with stacked BDIs", which is exactly
> the case for AFS when using a disk cache.  The congestion on the AFS
> backing device keeps processes from writing to other devices,
> including the ext2/3 device holding the disk cache.  So the cache
> manager can't make progress in writing back its dirty data.
>
> See for instance: https://bugzilla.redhat.com/show_bug.cgi?id=453811 -
> a request to backport the patch set to 2.6.18 for RHEL 5.

This doesn't look too promising - it's been in their pipeline for almost a
year.
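
For anyone wanting to see which clients are exposed, here is a minimal sketch
that compares a kernel release string against 2.6.24, the version named above
as carrying the per-BDI dirty threshold patch set. The helper names are
hypothetical, and it assumes upstream version numbering: a vendor kernel such
as RHEL 5's 2.6.18 could carry a backport that this check cannot detect.

```python
import platform
import re


def parse_kernel_release(release):
    """Return (major, minor, patch) from a string like '2.6.18-128.el5'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    # A missing third component (e.g. '3.10') is treated as 0.
    return tuple(int(g) if g else 0 for g in m.groups())


def has_per_bdi_dirty_limits(release):
    """True if the upstream version number is at least 2.6.24.

    Assumption: no vendor backports; only the version number is compared.
    """
    return parse_kernel_release(release) >= (2, 6, 24)


if __name__ == "__main__":
    rel = platform.release()
    if has_per_bdi_dirty_limits(rel):
        print("%s: version is 2.6.24 or later" % rel)
    else:
        print("%s: predates 2.6.24 (check for vendor backports)" % rel)
```

Running the script on a client prints one line for the local kernel; the two
functions can also be fed release strings collected from other hosts.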

> It may well be that there's no way to work around this kernel problem
> in the AFS code.

That seems likely. So this is actually good news. Nothing any of us can do,
right ;-)

Seriously though, the fix will eventually be available in RHEL (even if not
before EL6, we'll see).
Furthermore, this particular deadlock is a lot harder to reproduce than
the one fixed by the linux-mmap-antirecursion patch, and personally we never
even had problems with that one.
As such, we'd rather risk a deadlock we've never seen happen than have data
corruption catch us unawares.

For 1.4.10 testing, I'm in the process of deploying clients with the
linux-mmap-antirecursion-20081020 patch reverted.
Needless to say, I would rather have solved this properly, but I lack both
the time and the expertise to have much hope of success.

Cheers
  - Felix