[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Felix Frank Felix.Frank@Desy.de
Mon, 4 May 2009 17:54:13 +0200 (CEST)


  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--579669762-1810716674-1241452355=:11630
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; FORMAT=flowed

On Mon, 27 Apr 2009, Felix Frank wrote:

>> re-entry into either writepage or entry in osi_VM_StoreAllSegments for
>> the same file if it is set.  This looks sound.  The net effect differs
>> from Chaskiel's suggestion in that it 1) disables
>> osi_Vm_StoreAllSegments on the same file for callers other than
>> doPartialWrite (probably a good idea), and 2) prevents concurrent
>> writepage calls within the same file (which might already be the
>> case).
>> 
>> Issues that remain:
>> - I think Felix still sees some deadlocks and data inconsistencies
>> with 2.6.18, but I can't reproduce with 2.6.29 or 2.6.30
>
> There is reproduceable data loss during the mmap test, its amount being
> dependent (linearly, it appears) on the size of physical memory. Corruption
> seems to start right above 1/3 memory size.
>
> Deadlocks still appear to occur above 1/2 memory size.

I've done a bunch of tests. The results are too confusing for me to even
bother you with the plots. Bottom line:
- working with mmap on a file that is smaller than the disk cache always works
- as soon as the file size exceeds the cache size, the cache gets junked
   (from earlier observations, I guess the mmap_test prog reads 0s where
    data should be)
- larger physical memory seems to improve the chances of reading sound data,
   but only in a statistical sense. The numbers are quite inconsistent
   in that respect.
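
For readers without the mmap_test source, the kind of check described above
can be sketched as follows. This is a minimal sketch in Python, not the actual
test program: the function names and the per-page byte pattern are assumptions
made for illustration. It fills a file through a shared mapping, then reads it
back with ordinary reads and counts pages that come back as zeros.

```python
import mmap

PAGE = mmap.PAGESIZE


def expected_page(index, length):
    # Hypothetical pattern: every page gets a non-zero byte derived from its
    # index, so a page that reads back as all zeros is easy to spot.
    return bytes([1 + index % 255]) * length


def write_pattern(path, size):
    """Create a file of `size` bytes and fill it through a shared mapping."""
    with open(path, "wb") as f:
        f.truncate(size)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as m:
        for off in range(0, size, PAGE):
            n = min(PAGE, size - off)
            m[off:off + n] = expected_page(off // PAGE, n)
        m.flush()  # ask the kernel to write the dirty pages back


def count_bad_pages(path, size):
    """Return (zeroed, otherwise_wrong) page counts, using plain read()."""
    zeroed = wrong = 0
    with open(path, "rb") as f:
        for off in range(0, size, PAGE):
            n = min(PAGE, size - off)
            data = f.read(n)
            if data == expected_page(off // PAGE, n):
                continue
            if len(data) == n and data == bytes(n):
                zeroed += 1
            else:
                wrong += 1
    return zeroed, wrong
```

To exercise the case described above, `path` would point into AFS and `size`
would be chosen larger than the client's disk cache. A non-zero first count
corresponds to the "reads 0s where data should be" observation.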

All these tests were done with a 1.4.10 client patched with
http://rt.central.org/rt/Ticket/Attachment/414217/450599/antirec-fix.patch
I doubt that vanilla 1.4.10 would fare any differently (if anything,
statistically worse, but what does it matter?). Early attempts with it
yielded similar levels of corruption.

The test program will still deadlock, by the way. There appears to be no
fixed minimum file size to make it happen, but during tests, the range of
"safe" file sizes appeared to grow in relation to the physical memory of
the client.
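
Since the thresholds quoted above are given as fractions of physical memory
(about 1/3 for corruption, about 1/2 for deadlocks), test file sizes can be
derived from the client's RAM. The following is only a sketch under stated
assumptions: the helper names and the default list of fractions are
hypothetical, and `os.sysconf("SC_PHYS_PAGES")` is assumed to be available,
as it is on Linux.

```python
import os


def physical_memory_bytes():
    # Assumes a platform that exposes SC_PHYS_PAGES (Linux does).
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def sweep_sizes(mem_bytes, page=4096, fractions=((1, 4), (1, 3), (1, 2), (3, 4))):
    """Return page-aligned file sizes at the given fractions of memory.

    Fractions are (numerator, denominator) pairs so the arithmetic stays
    in integers and the results are exact.
    """
    sizes = []
    for num, den in fractions:
        size = mem_bytes * num // den
        sizes.append(max(page, size // page * page))  # round down to a page
    return sizes
```

Calling `sweep_sizes(physical_memory_bytes())` yields a list of sizes that
bracket the 1/3 and 1/2 marks, which makes runs on clients with different
amounts of memory easier to compare.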

Traces of the usual deadlocked suspects are attached. At that point, just
about any process can deadlock, I suppose. Apparently, the system ceases 
to balance dirty pages (which appears plausible to me, but I have no 
experience with virtual memory implementations whatsoever).

Leaving writepage prematurely seems to be unsound after all. Derrick
suggested earlier (RT 120491) that VM handling for Linux is broken, hence
this whole issue.
Is that still true? Will this not be addressed in 1.4.x?

Cheers
  - Felix
--579669762-1810716674-1241452355=:11630
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME=big-file-lockup
Content-Transfer-Encoding: BASE64
Content-Description: 
Content-Disposition: ATTACHMENT; FILENAME=big-file-lockup

cGRmbHVzaCAgICAgICBEIGZmZmY4ODAwMDIwYzU0NjAgICAgIDAgICAgODAg
ICAgICA3ICAgICAgICAgICAgODEgICAgMjQgKEwtVExCKQ0KIGZmZmY4ODAw
M2ZhNTk1MTAgIDAwMDAwMDAwMDAwMDAyNDYgIGZmZmY4ODAwMDFlMDgzNDAg
IGZmZmY4ODAwMDE2MTIzODAgDQogMDAwMDAwMDAwMDAwMDAwYSAgZmZmZjg4
MDAzZmEzMjdlMCAgZmZmZjg4MDAzZmYyMDg2MCAgMDAwMDAwMDAwMDAwMDk4
NCANCiBmZmZmODgwMDNmYTMyOWM4ICBmZmZmODgwMDNmYTIwMDQwIA0KQ2Fs
bCBUcmFjZToNCiBbPGZmZmZmZmZmODAyNjM5Zjk+XSBfc3Bpbl9sb2NrX2ly
cXNhdmUrMHg5LzB4MTQNCiBbPGZmZmZmZmZmODAyNmU3MzM+XSBkb19nZXR0
aW1lb2ZkYXkrMHgxZjgvMHgyMTMNCiBbPGZmZmZmZmZmODAyNjM5Zjk+XSBf
c3Bpbl9sb2NrX2lycXNhdmUrMHg5LzB4MTQNCiBbPGZmZmZmZmZmODAyM2Yw
MGM+XSBsb2NrX3RpbWVyX2Jhc2UrMHgxYi8weDNjDQogWzxmZmZmZmZmZjgw
MjFjYjVlPl0gX19tb2RfdGltZXIrMHhiMC8weGJlDQogWzxmZmZmZmZmZjgw
MjYyNzdjPl0gc2NoZWR1bGVfdGltZW91dCsweDhhLzB4YWQNCiBbPGZmZmZm
ZmZmODAyOTExNDE+XSBwcm9jZXNzX3RpbWVvdXQrMHgwLzB4NQ0KIFs8ZmZm
ZmZmZmY4MDI2MjBlZD5dIGlvX3NjaGVkdWxlX3RpbWVvdXQrMHg0Yi8weDc4
DQogWzxmZmZmZmZmZjgwMjNjNTVlPl0gYmxrX2Nvbmdlc3Rpb25fd2FpdCsw
eDY3LzB4ODENCiBbPGZmZmZmZmZmODAyOTlmZWM+XSBhdXRvcmVtb3ZlX3dh
a2VfZnVuY3Rpb24rMHgwLzB4MmUNCiBbPGZmZmZmZmZmODAyNTJjOWY+XSB3
cml0ZWJhY2tfaW5vZGVzKzB4YTgvMHhkOA0KIFs8ZmZmZmZmZmY4MDJiZGU2
MD5dIGJhbGFuY2VfZGlydHlfcGFnZXNfcmF0ZWxpbWl0ZWRfbnIrMHgxN2Qv
MHgxZmENCiBbPGZmZmZmZmZmODAyMTA5MGE+XSBnZW5lcmljX2ZpbGVfYnVm
ZmVyZWRfd3JpdGUrMHg1MjcvMHg2NDUNCiBbPGZmZmZmZmZmODAyMGUzZDI+
XSBjdXJyZW50X2ZzX3RpbWUrMHgzYi8weDQwDQogWzxmZmZmZmZmZjgwMzE1
YzJmPl0gYXZjX2hhc19wZXJtKzB4NDMvMHg1NQ0KIFs8ZmZmZmZmZmY4MDIx
Njg1Yz5dIF9fZ2VuZXJpY19maWxlX2Fpb193cml0ZV9ub2xvY2srMHgzNmMv
MHgzYjgNCiBbPGZmZmZmZmZmODAyMjFhMTQ+XSBnZW5lcmljX2ZpbGVfYWlv
X3dyaXRlKzB4NjUvMHhjMQ0KIFs8ZmZmZmZmZmY4ODA0YjFhMj5dIDpleHQz
OmV4dDNfZmlsZV93cml0ZSsweDE2LzB4OTENCiBbPGZmZmZmZmZmODAyMTgx
ODQ+XSBkb19zeW5jX3dyaXRlKzB4YzcvMHgxMDQNCiBbPGZmZmZmZmZmODAy
OTlmZWM+XSBhdXRvcmVtb3ZlX3dha2VfZnVuY3Rpb24rMHgwLzB4MmUNCiBb
PGZmZmZmZmZmODAyNjI5ZDY+XSBtdXRleF9sb2NrKzB4ZC8weDFkDQogWzxm
ZmZmZmZmZjgwMjE0MjEwPl0gZ2VuZXJpY19maWxlX2xsc2VlaysweDdmLzB4
OGINCiBbPGZmZmZmZmZmODgxYTlkNzY+XSA6bGliYWZzOm9zaV9yZHdyKzB4
ZWIvMHgxNTENCiBbPGZmZmZmZmZmODgxOGM1NzM+XSA6bGliYWZzOmFmc19V
RlNXcml0ZSsweDVkMC8weDg0Yg0KIFs8ZmZmZmZmZmY4ODFhYmMyYz5dIDps
aWJhZnM6YWZzX2xpbnV4X3dyaXRlcGFnZV9zeW5jKzB4MjUzLzB4M2RhDQog
WzxmZmZmZmZmZjg4MWFkYmZhPl0gOmxpYmFmczphZnNfbGludXhfd3JpdGVw
YWdlKzB4NjEvMHg4YQ0KIFs8ZmZmZmZmZmY4MDIxY2U5Yz5dIG1wYWdlX3dy
aXRlcGFnZXMrMHgxYWIvMHgzNGQNCiBbPGZmZmZmZmZmODgxYWRiOTk+XSA6
bGliYWZzOmFmc19saW51eF93cml0ZXBhZ2UrMHgwLzB4OGENCiBbPGZmZmZm
ZmZmODAyNWM5ZmI+XSBkb193cml0ZXBhZ2VzKzB4MjkvMHgyZg0KIFs8ZmZm
ZmZmZmY4MDIzMGM1Yj5dIF9fd3JpdGViYWNrX3NpbmdsZV9pbm9kZSsweDFh
ZS8weDMyOA0KIFs8ZmZmZmZmZmY4MDJiM2E3Mz5dIGRlbGF5YWNjdF9lbmQr
MHg1ZC8weDg2DQogWzxmZmZmZmZmZjgwMjIxMjBkPl0gc3luY19zYl9pbm9k
ZXMrMHgxYTkvMHgyNjcNCiBbPGZmZmZmZmZmODAyOTlkZDQ+XSBrZXZlbnRk
X2NyZWF0ZV9rdGhyZWFkKzB4MC8weGM0DQogWzxmZmZmZmZmZjgwMjUyYzc5
Pl0gd3JpdGViYWNrX2lub2RlcysweDgyLzB4ZDgNCiBbPGZmZmZmZmZmODAy
YmRmNjI+XSBiYWNrZ3JvdW5kX3dyaXRlb3V0KzB4ODUvMHhiOA0KIFs8ZmZm
ZmZmZmY4MDI1ODAxYz5dIHBkZmx1c2grMHgwLzB4MjA3DQogWzxmZmZmZmZm
ZjgwMjU4MTc1Pl0gcGRmbHVzaCsweDE1OS8weDIwNw0KIFs8ZmZmZmZmZmY4
MDJiZGVkZD5dIGJhY2tncm91bmRfd3JpdGVvdXQrMHgwLzB4YjgNCiBbPGZm
ZmZmZmZmODAyMzM0ODM+XSBrdGhyZWFkKzB4ZmUvMHgxMzINCiBbPGZmZmZm
ZmZmODAyNWZiMmM+XSBjaGlsZF9yaXArMHhhLzB4MTINCiBbPGZmZmZmZmZm
ODAyOTlkZDQ+XSBrZXZlbnRkX2NyZWF0ZV9rdGhyZWFkKzB4MC8weGM0DQog
WzxmZmZmZmZmZjgwMjZkZjAyPl0gbW9ub3RvbmljX2Nsb2NrKzB4MzUvMHg3
Yg0KIFs8ZmZmZmZmZmY4MDIzMzM4NT5dIGt0aHJlYWQrMHgwLzB4MTMyDQog
WzxmZmZmZmZmZjgwMjVmYjIyPl0gY2hpbGRfcmlwKzB4MC8weDEyDQoNCmFm
c2QgICAgICAgICAgRCBmZmZmODgwMDAyMGM1NDYwICAgICAwICAxNTM3ICAg
ICAgMSAgICAgICAgICAxNTM5ICAxNTM1IChMLVRMQikNCiBmZmZmODgwMDMx
MDEzODgwICAwMDAwMDAwMDAwMDAwMjQ2ICAwMDAwMDAwMDAwMDJiZTVkICAw
MDAwMDAwMDAwMDAwMjQ2IA0KIDAwMDAwMDAwMDAwMDAwMGEgIGZmZmY4ODAw
M2ZhMjAwNDAgIGZmZmY4ODAwM2ZhMzI3ZTAgIDAwMDAwMDAwMDAwMDBkYmUg
DQogZmZmZjg4MDAzZmEyMDIyOCAgZmZmZmZmZmY4MDRlMGE4MCANCkNhbGwg
VHJhY2U6DQogWzxmZmZmZmZmZjgwMjYzOWY5Pl0gX3NwaW5fbG9ja19pcnFz
YXZlKzB4OS8weDE0DQogWzxmZmZmZmZmZjgwMjZlNzMzPl0gZG9fZ2V0dGlt
ZW9mZGF5KzB4MWY4LzB4MjEzDQogWzxmZmZmZmZmZjgwMjYzOWY5Pl0gX3Nw
aW5fbG9ja19pcnFzYXZlKzB4OS8weDE0DQogWzxmZmZmZmZmZjgwMjNmMDBj
Pl0gbG9ja190aW1lcl9iYXNlKzB4MWIvMHgzYw0KIFs8ZmZmZmZmZmY4MDIx
Y2I1ZT5dIF9fbW9kX3RpbWVyKzB4YjAvMHhiZQ0KIFs8ZmZmZmZmZmY4MDI2
Mjc3Yz5dIHNjaGVkdWxlX3RpbWVvdXQrMHg4YS8weGFkDQogWzxmZmZmZmZm
ZjgwMjkxMTQxPl0gcHJvY2Vzc190aW1lb3V0KzB4MC8weDUNCiBbPGZmZmZm
ZmZmODAyNjIwZWQ+XSBpb19zY2hlZHVsZV90aW1lb3V0KzB4NGIvMHg3OA0K
IFs8ZmZmZmZmZmY4MDIzYzU1ZT5dIGJsa19jb25nZXN0aW9uX3dhaXQrMHg2
Ny8weDgxDQogWzxmZmZmZmZmZjgwMjk5ZmVjPl0gYXV0b3JlbW92ZV93YWtl
X2Z1bmN0aW9uKzB4MC8weDJlDQogWzxmZmZmZmZmZjgwMjUyYzlmPl0gd3Jp
dGViYWNrX2lub2RlcysweGE4LzB4ZDgNCiBbPGZmZmZmZmZmODAyYmRlNjA+
XSBiYWxhbmNlX2RpcnR5X3BhZ2VzX3JhdGVsaW1pdGVkX25yKzB4MTdkLzB4
MWZhDQogWzxmZmZmZmZmZjgwMjEwOTBhPl0gZ2VuZXJpY19maWxlX2J1ZmZl
cmVkX3dyaXRlKzB4NTI3LzB4NjQ1DQogWzxmZmZmZmZmZjgwMjJlY2EwPl0g
X193YWtlX3VwKzB4MzgvMHg0Zg0KIFs8ZmZmZmZmZmY4MDIwNzE0MT5dIGtt
ZW1fY2FjaGVfZnJlZSsweDgwLzB4ZDMNCiBbPGZmZmZmZmZmODgwMzE3YWU+
XSA6amJkOmpvdXJuYWxfc3RvcCsweDFmMy8weDFmZg0KIFs8ZmZmZmZmZmY4
MDIwZTNkMj5dIGN1cnJlbnRfZnNfdGltZSsweDNiLzB4NDANCiBbPGZmZmZm
ZmZmODgwNTQ5Nzc+XSA6ZXh0MzpfX2V4dDNfam91cm5hbF9zdG9wKzB4MWYv
MHgzZA0KIFs8ZmZmZmZmZmY4MDIxNjg1Yz5dIF9fZ2VuZXJpY19maWxlX2Fp
b193cml0ZV9ub2xvY2srMHgzNmMvMHgzYjgNCiBbPGZmZmZmZmZmODAyMjFh
MWY+XSBnZW5lcmljX2ZpbGVfYWlvX3dyaXRlKzB4NzAvMHhjMQ0KIFs8ZmZm
ZmZmZmY4MDIyMWExND5dIGdlbmVyaWNfZmlsZV9haW9fd3JpdGUrMHg2NS8w
eGMxDQogWzxmZmZmZmZmZjg4MDRiMWEyPl0gOmV4dDM6ZXh0M19maWxlX3dy
aXRlKzB4MTYvMHg5MQ0KIFs8ZmZmZmZmZmY4MDIxODE4ND5dIGRvX3N5bmNf
d3JpdGUrMHhjNy8weDEwNA0KIFs8ZmZmZmZmZmY4MDI5OWZlYz5dIGF1dG9y
ZW1vdmVfd2FrZV9mdW5jdGlvbisweDAvMHgyZQ0KIFs8ZmZmZmZmZmY4MDI2
MjlkNj5dIG11dGV4X2xvY2srMHhkLzB4MWQNCiBbPGZmZmZmZmZmODAyMTQy
MTA+XSBnZW5lcmljX2ZpbGVfbGxzZWVrKzB4N2YvMHg4Yg0KIFs8ZmZmZmZm
ZmY4ODFhOWQ3Nj5dIDpsaWJhZnM6b3NpX3Jkd3IrMHhlYi8weDE1MQ0KIFs8
ZmZmZmZmZmY4ODFhOTcxYj5dIDpsaWJhZnM6YWZzX29zaV9Xcml0ZSsweGUy
LzB4MTZmDQogWzxmZmZmZmZmZjg4MTY5MGNmPl0gOmxpYmFmczphZnNfV3Jp
dGVEQ2FjaGUrMHg5Mi8weGE3DQogWzxmZmZmZmZmZjg4MTZhYjJjPl0gOmxp
YmFmczphZnNfV3JpdGVUaHJvdWdoRFNsb3RzKzB4MWUyLzB4MzA5DQogWzxm
ZmZmZmZmZjg4MTY4MTcyPl0gOmxpYmFmczphZnNfRGFlbW9uKzB4MThhLzB4
NDc0DQogWzxmZmZmZmZmZjg4MWIyODFjPl0gOmxpYmFmczphZnNkX2xhdW5j
aGVyKzB4MC8weDJjDQogWzxmZmZmZmZmZjg4MWIyYTU2Pl0gOmxpYmFmczph
ZnNkX3RocmVhZCsweDIwZS8weDZmNw0KIFs8ZmZmZmZmZmY4MDI1ZmIyYz5d
IGNoaWxkX3JpcCsweGEvMHgxMg0KIFs8ZmZmZmZmZmY4ODFiMjgxYz5dIDps
aWJhZnM6YWZzZF9sYXVuY2hlcisweDAvMHgyYw0KIFs8ZmZmZmZmZmY4MDI2
ZGYwMj5dIG1vbm90b25pY19jbG9jaysweDM1LzB4N2INCiBbPGZmZmZmZmZm
ODgxYjI4NDg+XSA6bGliYWZzOmFmc2RfdGhyZWFkKzB4MC8weDZmNw0KIFs8
ZmZmZmZmZmY4MDI1ZmIyMj5dIGNoaWxkX3JpcCsweDAvMHgxMg0KDQptbWFw
X3Rlc3RfdGVtIEQgZmZmZjg4MDAwMjBjNTQ2MCAgICAgMCAgMjA0MiAgIDIw
NDEgICAgICAgICAgICAgICAgICAgICAoTk9UTEIpDQogZmZmZjg4MDAzNjE3
ZGJlOCAgMDAwMDAwMDAwMDAwMDI4MiAgMDAwMDAwMDAwMDAwMDI0NiAgMDAw
MDAwMDAwMDAwMDAwYSANCiAwMDAwMDAwMDAwMDAwMDA5ICBmZmZmODgwMDNm
ZjIwODYwICBmZmZmZmZmZjgwNGUwYTgwICAwMDAwMDAwMDAwMDAwYWFmIA0K
IGZmZmY4ODAwM2ZmMjBhNDggIGZmZmY4ODAwM2ZhMzI3ZTAgDQpDYWxsIFRy
YWNlOg0KIFs8ZmZmZmZmZmY4MDI2MzlmOT5dIF9zcGluX2xvY2tfaXJxc2F2
ZSsweDkvMHgxNA0KIFs8ZmZmZmZmZmY4MDI2ZTczMz5dIGRvX2dldHRpbWVv
ZmRheSsweDFmOC8weDIxMw0KIFs8ZmZmZmZmZmY4MDI2MzlmOT5dIF9zcGlu
X2xvY2tfaXJxc2F2ZSsweDkvMHgxNA0KIFs8ZmZmZmZmZmY4MDIzZjAwYz5d
IGxvY2tfdGltZXJfYmFzZSsweDFiLzB4M2MNCiBbPGZmZmZmZmZmODAyMWNi
NWU+XSBfX21vZF90aW1lcisweGIwLzB4YmUNCiBbPGZmZmZmZmZmODAyNjI3
N2M+XSBzY2hlZHVsZV90aW1lb3V0KzB4OGEvMHhhZA0KIFs8ZmZmZmZmZmY4
MDI5MTE0MT5dIHByb2Nlc3NfdGltZW91dCsweDAvMHg1DQogWzxmZmZmZmZm
ZjgwMjYyMGVkPl0gaW9fc2NoZWR1bGVfdGltZW91dCsweDRiLzB4NzgNCiBb
PGZmZmZmZmZmODAyM2M1NWU+XSBibGtfY29uZ2VzdGlvbl93YWl0KzB4Njcv
MHg4MQ0KIFs8ZmZmZmZmZmY4MDI5OWZlYz5dIGF1dG9yZW1vdmVfd2FrZV9m
dW5jdGlvbisweDAvMHgyZQ0KIFs8ZmZmZmZmZmY4MDI1MmM5Zj5dIHdyaXRl
YmFja19pbm9kZXMrMHhhOC8weGQ4DQogWzxmZmZmZmZmZjgwMmJkZTYwPl0g
YmFsYW5jZV9kaXJ0eV9wYWdlc19yYXRlbGltaXRlZF9ucisweDE3ZC8weDFm
YQ0KIFs8ZmZmZmZmZmY4MDIxMWFkMj5dIGRvX3dwX3BhZ2UrMHg2NmYvMHg2
YTMNCiBbPGZmZmZmZmZmODAyMDlhYzQ+XSBfX2hhbmRsZV9tbV9mYXVsdCsw
eDExNGIvMHgxMWY2DQogWzxmZmZmZmZmZjgwMjA2MjJhPl0gaHlwZXJjYWxs
X3BhZ2UrMHgyMmEvMHgxMDAwDQogWzxmZmZmZmZmZjgwMjYzOWY5Pl0gX3Nw
aW5fbG9ja19pcnFzYXZlKzB4OS8weDE0DQogWzxmZmZmZmZmZjgwMjY2NmVm
Pl0gZG9fcGFnZV9mYXVsdCsweGY3Yi8weDEyZTANCiBbPGZmZmZmZmZmODAy
NmRmMDI+XSBtb25vdG9uaWNfY2xvY2srMHgzNS8weDdiDQogWzxmZmZmZmZm
ZjgwMjYxZTgzPl0gdGhyZWFkX3JldHVybisweDZjLzB4MTEzDQogWzxmZmZm
ZmZmZjgwMjVmODJiPl0gZXJyb3JfZXhpdCsweDAvMHg2ZQ0KIFs8ZmZmZmZm
ZmY4MDI1ZjgyYj5dIGVycm9yX2V4aXQrMHgwLzB4NmUNCg==

--579669762-1810716674-1241452355=:11630--