[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Felix Frank Felix.Frank@Desy.de
Fri, 17 Apr 2009 09:08:39 +0200 (CEST)


--579669762-1984118771-1239952120=:7689
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII

On Fri, 17 Apr 2009, Felix Frank wrote:

> On Thu, 16 Apr 2009, Marc Dionne wrote:
>
>> On 04/16/2009 08:25 AM, Felix Frank wrote:
>>>>> -    if (!avc->states & CPageWrite)
>> 
>> I see a bug there - this line probably wants to be:
>>    if (!(avc->states & CPageWrite))
>> 
>> So the recursion was avoided by never actually doing anything in 
>> StoreAllSegments, since CPageWrite never got set and the condition was 
>> always false.
>
> I guess this explains why mmap was severely broken since 1.4.8

That's not all: I reran mmap_test against vanilla 1.4.10. With a 64MB disk
cache, a 600MB file comes out corrupted, starting somewhere above 80% for me.

My suspicion is that a lot of the data never gets written to the cache (I
observed something like that while testing my alternative hack). Some data
can still be read correctly by the local client (maybe because the local
file system cache still has it available?).

Things look different with multiple clients and (apparently) when the cache
manager is restarted before reading, but mmap_test cannot verify either case.

>> With the fix above, my larger mmap test quickly runs into a deadlock again. 
>> Looks like cache_write_pages is trying to lock the page that is currently 
>> being written:
>
> I think I just reproduced :/

Even misbehave.c deadlocks the client with your fix in place (and that only
changes a single byte through the mapping). Are we sure the code that handles
CPageWrite in afs/LINUX/osi_vnodeops.c does what it's supposed to do?
--579669762-1984118771-1239952120=:7689
Content-Type: TEXT/PLAIN; charset=US-ASCII; name=misbehave.c
Content-Transfer-Encoding: 7BIT
Content-Description: 
Content-Disposition: attachment; filename=misbehave.c

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    char *file = "mapped-file.bin";
    char *map = NULL;
    int fd;

    if ( argc > 1 )
	file = argv[1];

    printf("Using file %s...\n", file);

    fd = open(file, O_RDWR | O_CREAT);
    if ( fd == -1 ) {
	perror(file);
	return 1;
    }

    write(fd, "1\n", 2);

    if ( (map = (char*)mmap(NULL, 1, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0))
			== (char*) -1 ) {
	perror("mmap");
	return 1;
    }

    close(fd);

    printf("Mapped and closed %s, first byte is %u...\n", file, map[0]);

    map[0]++;

    printf("Changed first byte to %u, unmapping...\n", map[0]);

    munmap((void*)map, 1);

    return 0;
}

--579669762-1984118771-1239952120=:7689--