[OpenAFS-devel] current cvs kernel module crash on 2.6.13-rc6
Martin MOKREJŠ
mmokrejs@ribosome.natur.cuni.cz
Wed, 17 Aug 2005 14:10:37 +0200
Martin MOKREJŠ wrote:
> Hi,
> I'm testing current cvs version on linux-2.6.13-rc6 machine with 40GB cache
> partition, default "afsd -nosettime" (but with patches from Niklas and Jim) and
> # cat /usr/vice/etc/cacheinfo
> /afs:/usr/vice/cache:50000
> #
Yes, this cache was meant to be used for memcache, so now I have:
# cat /usr/vice/etc/cacheinfo
/afs:/usr/vice/cache:36533485
#
>
> I tried to copy 17GB large file from local xfs partition to the ext2 based /vicepa.
> Cache is also ext2, btw. The machine has serious problems with interactivity,
> the mouse movements stop once in a second or so, probably as a result of the
> high context switching? How can I improve the performance?
When using the maximum 85% of the cachepartition size as cachesize, I got nice numbers
out:
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 2001004 88144 121404 0 0 1344 281 434 400 2 64 27 7
0 0 0 2001004 88144 121404 0 0 0 0 312 91 0 0 100 0
0 0 0 2001004 88144 121404 0 0 0 0 582 265 0 0 100 0
0 0 0 2001004 88144 121404 0 0 0 0 498 237 0 0 100 0
0 1 0 2000128 88444 121464 0 0 360 0 390 272 1 1 57 41
0 1 0 1998516 89152 121464 0 0 708 5 491 442 0 2 0 98
0 1 0 2000212 89852 121464 0 0 700 8 490 442 0 5 0 95
0 1 0 1998600 90560 121464 0 0 708 0 488 439 0 2 0 98
0 1 0 1999572 91128 121464 0 0 696 0 486 433 0 3 0 97
1 0 0 1906688 91300 215324 0 0 31504 0 526 521 1 61 0 38
2 0 0 1792884 91452 328608 0 0 37764 0 554 563 0 71 0 29
1 0 0 1683780 91592 435792 0 0 35712 0 526 502 2 66 0 32
1 1 0 1579488 91600 540952 0 0 35072 0 525 509 1 67 0 32
1 0 0 1472600 91740 647952 0 0 35584 0 536 525 0 69 0 31
1 0 0 1366096 91884 755312 0 0 35844 0 537 528 1 67 0 32
1 0 0 1252480 92032 868760 0 0 37760 0 545 542 0 71 0 29
0 1 0 1146880 92040 974872 0 0 35456 0 528 510 2 68 0 30
1 0 0 1037472 92188 1085296 0 0 36740 8240 560 538 0 71 0 29
1 0 0 928724 92328 1191584 0 0 35456 37008 613 492 2 67 0 31
0 1 0 828840 92408 1292056 0 0 33536 32896 606 482 0 67 0 33
0 1 0 722392 92420 1398808 0 0 35584 37016 621 512 0 69 0 31
1 0 0 635088 92540 1486840 0 0 29316 28784 569 432 1 59 0 40
1 0 0 528324 92676 1591248 0 0 34816 37008 613 499 0 68 0 32
2 0 0 427404 92684 1694704 0 0 34432 32896 613 489 1 68 0 31
1 0 0 321384 92816 1798560 0 0 34688 37008 619 505 0 68 0 32
0 1 0 242196 92928 1880348 0 0 27272 25112 643 434 1 58 0 41
1 0 0 132332 93068 1987708 0 0 35712 36864 596 509 0 70 0 30
1 0 0 24260 93208 2095540 0 0 35968 36864 597 523 1 69 0 30
1 0 0 57772 86368 2109936 0 0 23424 20492 527 892 0 82 0 18
1 0 0 79368 79172 2127128 0 0 24580 24652 533 913 1 74 0 25
1 0 0 79600 69952 2162700 0 0 29824 32900 584 1003 1 81 0 18
2 0 0 79836 59452 2199092 0 0 29184 28784 566 920 1 76 0 23
and stably during the some part of the copy process (although not finished yet).
The system was fully responsive. Later on, I see
0 1 216 84616 9472 2643144 0 0 13496 16592 479 8239 5 57 0 38
0 1 216 84492 9496 2642948 0 0 12496 12944 489 7958 6 58 0 36
0 1 216 84492 9520 2643036 0 0 14108 12444 499 8418 10 60 0 30
0 1 216 84616 9548 2642692 0 0 14396 16196 505 9315 8 62 0 30
0 1 216 84496 9552 2642968 0 0 12624 12560 478 7901 4 58 0 38
1 0 216 84248 9608 2643144 0 0 15008 12696 506 8984 4 65 0 31
2 0 216 84496 9652 2642776 0 0 13492 16636 481 8058 7 59 0 34
1 1 216 84372 9644 2642356 0 0 11596 15844 484 6968 9 55 0 36
1 0 216 84248 9660 2642688 0 0 14396 8536 492 9112 9 61 0 30
2 1 216 84356 9676 2641940 0 0 15448 16628 569 9253 5 67 0 28
1 0 216 84232 9724 2641828 0 0 13364 12508 509 8175 8 58 0 34
1 0 216 84852 9748 2641080 0 0 11308 11520 483 7119 9 50 0 41
1 1 216 84728 9768 2640456 0 0 3856 50332 445 2372 2 18 0 80
1 0 216 84356 9764 2641484 0 0 2828 36 413 2026 4 13 0 83
1 1 216 84604 9784 2641460 0 0 9516 19996 464 6275 6 43 0 51
1 0 216 84356 9824 2641436 0 0 18792 116 519 11410 10 78 0 13
2 0 216 84480 9860 2641216 0 0 20148 40 520 11664 10 83 0 7
1 2 216 84232 9880 2640652 0 0 4240 79508 446 2783 5 24 0 71
0 2 216 84480 9884 2640472 0 0 1316 27712 407 1138 4 6 0 90
0 1 216 84356 9892 2640416 0 0 996 27236 412 794 2 6 0 92
0 1 216 84604 9896 2640784 0 0 1060 0 392 899 1 9 0 90
2 1 216 84480 9900 2640756 0 0 10312 604 461 6284 4 45 0 51
0 2 216 84232 9896 2640460 0 0 2380 79340 428 1576 4 15 0 81
0 1 216 84356 9900 2640316 0 0 1412 23220 397 1117 3 8 0 89
0 1 216 84356 9900 2639980 0 0 288 59500 414 525 3 4 0 93
2 1 216 84728 9900 2639980 0 0 0 6112 437 206 5 2 0 93
1 0 216 84604 9932 2640500 0 0 8328 0 439 5576 5 38 0 57
0 1 216 84604 9964 2640472 0 0 6040 15200 449 4050 4 29 0 67
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 216 84480 10008 2640416 0 0 16064 0 489 10658 5 71 0 24
0 1 216 84604 10080 2639804 0 0 20180 0 521 13161 6 89 0 5
0 1 216 83860 10132 2639856 0 0 15196 65608 497 10012 6 65 0 29
1 0 216 84232 10132 2639856 0 0 0 4 431 105 0 4 0 96
1 1 216 84608 10200 2639768 0 0 18740 796 505 11053 4 80 0 16
0 1 216 84732 10268 2639308 0 0 20368 0 524 12574 7 87 0 6
0 1 216 84236 10352 2639620 0 0 19728 0 522 11593 9 85 0 6
0 1 216 83864 10380 2639108 0 0 6968 65608 471 4154 4 36 0 60
1 0 216 84608 10416 2639060 0 0 6012 4 454 4294 7 29 0 64
but interactive resposniveness is still ok. At the very moment I see:
# fs getcacheparms
AFS using 16691389 of the cache's available 36533485 1K byte blocks.
aquarius ~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 261G 210G 51G 81% /
udev 1.5G 216K 1.5G 1% /dev
/dev/sda3 12G 2.8G 8.7G 25% /usr/portage
/dev/sdb1 37G 17G 19G 47% /usr/vice/cache
/dev/sdb2 37G 2.9G 32G 9% /vicepa
none 1.5G 0 1.5G 0% /dev/shm
AFS 8.6G 0 8.6G 0% /afs
#
I don't see a reason why the change of behaviour. There is enough space in the cache.
Does afsd check that the file to be copied can fit into the cache at all? I know,
when reading from pipe one has no clue of the size of the dataset, but when
reading a file ... we could take advatange of that and skip using the cache
for the process.
> Anyway, so here's what was going on when copying the 17GB file.
>
> # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 1 0 2135700 8276 680648 0 0 499 25 373 236 2 2 93 3
> 1 0 0 2082504 8352 732800 0 0 28288 21 484 462 1 54 0 45
> 1 0 0 2041948 8384 772112 0 0 4736 0 502 22645 4 92 0 4
> 1 0 0 1939648 8416 872464 0 0 33280 0 923 23195 1 87 0 12
> 1 0 0 1837100 8448 972816 0 0 33280 0 905 23292 4 86 0 10
> 1 0 0 1734552 8480 1073168 0 0 33280 0 914 23345 4 85 0 11
> 0 1 0 1632872 8516 1172520 0 0 33284 24856 972 22643 3 87 0 11
> 1 0 0 1529208 8552 1273872 0 0 33280 37340 1020 24338 3 86 0 10
> 1 0 0 1426784 8584 1374224 0 0 33280 33192 990 22727 3 87 0 11
> 1 0 0 1324236 8616 1474576 0 0 33280 37348 1010 23562 2 88 0 10
> 0 1 0 1226896 8648 1569700 0 0 33284 24892 908 19647 3 85 0 11
> 1 0 0 1164524 8652 1631048 0 0 29568 12452 538 4215 1 82 0 17
> 1 0 0 1118644 8684 1675280 0 0 3712 34200 740 23079 5 94 0 1
> 1 0 0 1016220 8716 1775632 0 0 33280 53884 1039 23539 3 86 0 11
> 1 1 0 913176 8748 1875984 0 0 33280 52188 1021 23543 3 86 0 11
> 1 0 0 864444 8748 1924368 0 0 24320 1000 665 419 0 50 0 50
> 0 1 0 820920 8776 1966884 0 0 8964 12436 462 16368 2 93 0 5
> 2 1 0 770800 8788 2015048 0 0 21120 20728 488 7154 5 81 0 14
> 1 0 0 708180 8820 2076688 0 0 12160 33160 682 23184 4 92 0 4
> 1 0 0 605384 8852 2177040 0 0 33280 33812 1166 23540 3 87 0 11
> ...
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 0 85160 7340 2669712 0 0 16672 785 759 1160 21 50 0 29
> 3 0 124 85284 7368 2668900 0 0 16644 33160 1093 23053 3 93 0 4
> 0 1 192 85408 7336 2669040 0 0 26496 8292 518 765 20 57 0 23
> 3 1 196 85316 7340 2669176 0 0 6784 29584 1219 22213 3 95 0 2
> 1 0 196 85564 7312 2669468 0 0 32256 200 816 975 3 64 0 33
> 6 0 196 85564 7344 2669364 0 0 1024 33160 773 22118 4 95 0 0
> 1 0 196 85284 7308 2670060 0 0 25304 4 473 834 21 53 0 26
> 4 0 196 85160 7328 2669984 0 0 8064 33160 1369 22442 3 94 0 3
> 2 0 196 85440 7296 2669644 0 0 32768 0 611 1113 6 69 0 25
> 3 1 196 85904 7304 2669276 0 0 772 33172 1365 22705 9 90 0 0
> 1 0 196 92724 7300 2662068 0 0 33024 0 675 1054 3 64 0 33
> 4 0 196 85284 7328 2669592 0 0 24 24872 1073 22393 4 96 0 0
> 5 0 196 85160 7300 2669924 0 0 31104 8292 612 978 5 63 0 32
> 4 0 196 85532 7328 2669528 0 0 2416 29025 1325 22574 5 92 0 3
> 1 1 196 85532 7304 2669328 0 0 13460 4148 693 732 38 34 0 28
> 2 1 196 83828 7324 2669924 0 0 20236 59703 1112 21674 3 90 0 7
> 2 1 196 85688 7304 2669516 0 0 18432 8 541 2049 2 52 0 46
> 2 0 196 85472 7332 2669592 0 0 14848 29048 926 22828 2 94 0 4
> 2 1 196 86960 7324 2668268 0 0 33280 12444 539 772 1 66 0 33
> 1 0 196 85100 7352 2669812 0 0 8 25412 721 22495 3 97 0 0