[OpenAFS-devel] current cvs kernel module crash on 2.6.13-rc6

Martin MOKREJŠ mmokrejs@ribosome.natur.cuni.cz
Wed, 17 Aug 2005 14:10:37 +0200


Martin MOKREJŠ wrote:
> Hi,
>   I'm testing current cvs version on linux-2.6.13-rc6 machine with 40GB cache
> partition, default "afsd -nosettime" (but with patches from Niklas and Jim) and
> # cat /usr/vice/etc/cacheinfo
> /afs:/usr/vice/cache:50000
> #

Yes, this cache was meant to be used for memcache, so now I have:

# cat /usr/vice/etc/cacheinfo
/afs:/usr/vice/cache:36533485
#

> 
> I tried to copy 17GB large file from local xfs partition to the ext2 based /vicepa.
> Cache is also ext2, btw. The machine has serious problems with interactivity,
> the mouse movements stop once in a second or so, probably as a result of the
> high context switching? How can I improve the performance?

When using the maximum 85% of the cachepartition size as cachesize, I got nice numbers
out:

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 2001004  88144 121404    0    0  1344   281  434   400  2 64 27  7
 0  0      0 2001004  88144 121404    0    0     0     0  312    91  0  0 100  0
 0  0      0 2001004  88144 121404    0    0     0     0  582   265  0  0 100  0
 0  0      0 2001004  88144 121404    0    0     0     0  498   237  0  0 100  0
 0  1      0 2000128  88444 121464    0    0   360     0  390   272  1  1 57 41
 0  1      0 1998516  89152 121464    0    0   708     5  491   442  0  2  0 98
 0  1      0 2000212  89852 121464    0    0   700     8  490   442  0  5  0 95
 0  1      0 1998600  90560 121464    0    0   708     0  488   439  0  2  0 98
 0  1      0 1999572  91128 121464    0    0   696     0  486   433  0  3  0 97
 1  0      0 1906688  91300 215324    0    0 31504     0  526   521  1 61  0 38
 2  0      0 1792884  91452 328608    0    0 37764     0  554   563  0 71  0 29
 1  0      0 1683780  91592 435792    0    0 35712     0  526   502  2 66  0 32
 1  1      0 1579488  91600 540952    0    0 35072     0  525   509  1 67  0 32
 1  0      0 1472600  91740 647952    0    0 35584     0  536   525  0 69  0 31
 1  0      0 1366096  91884 755312    0    0 35844     0  537   528  1 67  0 32
 1  0      0 1252480  92032 868760    0    0 37760     0  545   542  0 71  0 29
 0  1      0 1146880  92040 974872    0    0 35456     0  528   510  2 68  0 30
 1  0      0 1037472  92188 1085296    0    0 36740  8240  560   538  0 71  0 29
 1  0      0 928724  92328 1191584    0    0 35456 37008  613   492  2 67  0 31
 0  1      0 828840  92408 1292056    0    0 33536 32896  606   482  0 67  0 33
 0  1      0 722392  92420 1398808    0    0 35584 37016  621   512  0 69  0 31
 1  0      0 635088  92540 1486840    0    0 29316 28784  569   432  1 59  0 40
 1  0      0 528324  92676 1591248    0    0 34816 37008  613   499  0 68  0 32
 2  0      0 427404  92684 1694704    0    0 34432 32896  613   489  1 68  0 31
 1  0      0 321384  92816 1798560    0    0 34688 37008  619   505  0 68  0 32
 0  1      0 242196  92928 1880348    0    0 27272 25112  643   434  1 58  0 41
 1  0      0 132332  93068 1987708    0    0 35712 36864  596   509  0 70  0 30
 1  0      0  24260  93208 2095540    0    0 35968 36864  597   523  1 69  0 30
 1  0      0  57772  86368 2109936    0    0 23424 20492  527   892  0 82  0 18
 1  0      0  79368  79172 2127128    0    0 24580 24652  533   913  1 74  0 25
 1  0      0  79600  69952 2162700    0    0 29824 32900  584  1003  1 81  0 18
 2  0      0  79836  59452 2199092    0    0 29184 28784  566   920  1 76  0 23

and stably during the some part of the copy process (although not finished yet).
The system was fully responsive. Later on, I see

 0  1    216  84616   9472 2643144    0    0 13496 16592  479  8239  5 57  0 38
 0  1    216  84492   9496 2642948    0    0 12496 12944  489  7958  6 58  0 36
 0  1    216  84492   9520 2643036    0    0 14108 12444  499  8418 10 60  0 30
 0  1    216  84616   9548 2642692    0    0 14396 16196  505  9315  8 62  0 30
 0  1    216  84496   9552 2642968    0    0 12624 12560  478  7901  4 58  0 38
 1  0    216  84248   9608 2643144    0    0 15008 12696  506  8984  4 65  0 31
 2  0    216  84496   9652 2642776    0    0 13492 16636  481  8058  7 59  0 34
 1  1    216  84372   9644 2642356    0    0 11596 15844  484  6968  9 55  0 36
 1  0    216  84248   9660 2642688    0    0 14396  8536  492  9112  9 61  0 30
 2  1    216  84356   9676 2641940    0    0 15448 16628  569  9253  5 67  0 28
 1  0    216  84232   9724 2641828    0    0 13364 12508  509  8175  8 58  0 34
 1  0    216  84852   9748 2641080    0    0 11308 11520  483  7119  9 50  0 41
 1  1    216  84728   9768 2640456    0    0  3856 50332  445  2372  2 18  0 80
 1  0    216  84356   9764 2641484    0    0  2828    36  413  2026  4 13  0 83
 1  1    216  84604   9784 2641460    0    0  9516 19996  464  6275  6 43  0 51
 1  0    216  84356   9824 2641436    0    0 18792   116  519 11410 10 78  0 13
 2  0    216  84480   9860 2641216    0    0 20148    40  520 11664 10 83  0  7
 1  2    216  84232   9880 2640652    0    0  4240 79508  446  2783  5 24  0 71
 0  2    216  84480   9884 2640472    0    0  1316 27712  407  1138  4  6  0 90
 0  1    216  84356   9892 2640416    0    0   996 27236  412   794  2  6  0 92
 0  1    216  84604   9896 2640784    0    0  1060     0  392   899  1  9  0 90
 2  1    216  84480   9900 2640756    0    0 10312   604  461  6284  4 45  0 51
 0  2    216  84232   9896 2640460    0    0  2380 79340  428  1576  4 15  0 81
 0  1    216  84356   9900 2640316    0    0  1412 23220  397  1117  3  8  0 89
 0  1    216  84356   9900 2639980    0    0   288 59500  414   525  3  4  0 93
 2  1    216  84728   9900 2639980    0    0     0  6112  437   206  5  2  0 93
 1  0    216  84604   9932 2640500    0    0  8328     0  439  5576  5 38  0 57
 0  1    216  84604   9964 2640472    0    0  6040 15200  449  4050  4 29  0 67
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0    216  84480  10008 2640416    0    0 16064     0  489 10658  5 71  0 24
 0  1    216  84604  10080 2639804    0    0 20180     0  521 13161  6 89  0  5
 0  1    216  83860  10132 2639856    0    0 15196 65608  497 10012  6 65  0 29
 1  0    216  84232  10132 2639856    0    0     0     4  431   105  0  4  0 96
 1  1    216  84608  10200 2639768    0    0 18740   796  505 11053  4 80  0 16
 0  1    216  84732  10268 2639308    0    0 20368     0  524 12574  7 87  0  6
 0  1    216  84236  10352 2639620    0    0 19728     0  522 11593  9 85  0  6
 0  1    216  83864  10380 2639108    0    0  6968 65608  471  4154  4 36  0 60
 1  0    216  84608  10416 2639060    0    0  6012     4  454  4294  7 29  0 64

but interactive resposniveness is still ok. At the very moment I see:

# fs getcacheparms
AFS using 16691389 of the cache's available 36533485 1K byte blocks.
aquarius ~ # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             261G  210G   51G  81% /
udev                  1.5G  216K  1.5G   1% /dev
/dev/sda3              12G  2.8G  8.7G  25% /usr/portage
/dev/sdb1              37G   17G   19G  47% /usr/vice/cache
/dev/sdb2              37G  2.9G   32G   9% /vicepa
none                  1.5G     0  1.5G   0% /dev/shm
AFS                   8.6G     0  8.6G   0% /afs
# 

I don't see a reason why the change of behaviour. There is enough space in the cache.
Does afsd check that the file to be copied can fit into the cache at all? I know,
when reading from pipe one has no clue of the size of the dataset, but when
reading a file ... we could take advatange of that and skip using the cache
for the process.

> Anyway, so here's what was going on when copying the 17GB file.
> 
> # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  0  1      0 2135700   8276 680648    0    0   499    25  373   236  2  2 93  3
>  1  0      0 2082504   8352 732800    0    0 28288    21  484   462  1 54  0 45
>  1  0      0 2041948   8384 772112    0    0  4736     0  502 22645  4 92  0  4
>  1  0      0 1939648   8416 872464    0    0 33280     0  923 23195  1 87  0 12
>  1  0      0 1837100   8448 972816    0    0 33280     0  905 23292  4 86  0 10
>  1  0      0 1734552   8480 1073168    0    0 33280     0  914 23345  4 85  0 11
>  0  1      0 1632872   8516 1172520    0    0 33284 24856  972 22643  3 87  0 11
>  1  0      0 1529208   8552 1273872    0    0 33280 37340 1020 24338  3 86  0 10
>  1  0      0 1426784   8584 1374224    0    0 33280 33192  990 22727  3 87  0 11
>  1  0      0 1324236   8616 1474576    0    0 33280 37348 1010 23562  2 88  0 10
>  0  1      0 1226896   8648 1569700    0    0 33284 24892  908 19647  3 85  0 11
>  1  0      0 1164524   8652 1631048    0    0 29568 12452  538  4215  1 82  0 17
>  1  0      0 1118644   8684 1675280    0    0  3712 34200  740 23079  5 94  0  1
>  1  0      0 1016220   8716 1775632    0    0 33280 53884 1039 23539  3 86  0 11
>  1  1      0 913176   8748 1875984    0    0 33280 52188 1021 23543  3 86  0 11
>  1  0      0 864444   8748 1924368    0    0 24320  1000  665   419  0 50  0 50
>  0  1      0 820920   8776 1966884    0    0  8964 12436  462 16368  2 93  0  5
>  2  1      0 770800   8788 2015048    0    0 21120 20728  488  7154  5 81  0 14
>  1  0      0 708180   8820 2076688    0    0 12160 33160  682 23184  4 92  0  4
>  1  0      0 605384   8852 2177040    0    0 33280 33812 1166 23540  3 87  0 11
> ...
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  1  0      0  85160   7340 2669712    0    0 16672   785  759  1160 21 50  0 29
>  3  0    124  85284   7368 2668900    0    0 16644 33160 1093 23053  3 93  0  4
>  0  1    192  85408   7336 2669040    0    0 26496  8292  518   765 20 57  0 23
>  3  1    196  85316   7340 2669176    0    0  6784 29584 1219 22213  3 95  0  2
>  1  0    196  85564   7312 2669468    0    0 32256   200  816   975  3 64  0 33
>  6  0    196  85564   7344 2669364    0    0  1024 33160  773 22118  4 95  0  0
>  1  0    196  85284   7308 2670060    0    0 25304     4  473   834 21 53  0 26
>  4  0    196  85160   7328 2669984    0    0  8064 33160 1369 22442  3 94  0  3
>  2  0    196  85440   7296 2669644    0    0 32768     0  611  1113  6 69  0 25
>  3  1    196  85904   7304 2669276    0    0   772 33172 1365 22705  9 90  0  0
>  1  0    196  92724   7300 2662068    0    0 33024     0  675  1054  3 64  0 33
>  4  0    196  85284   7328 2669592    0    0    24 24872 1073 22393  4 96  0  0
>  5  0    196  85160   7300 2669924    0    0 31104  8292  612   978  5 63  0 32
>  4  0    196  85532   7328 2669528    0    0  2416 29025 1325 22574  5 92  0  3
>  1  1    196  85532   7304 2669328    0    0 13460  4148  693   732 38 34  0 28
>  2  1    196  83828   7324 2669924    0    0 20236 59703 1112 21674  3 90  0  7
>  2  1    196  85688   7304 2669516    0    0 18432     8  541  2049  2 52  0 46
>  2  0    196  85472   7332 2669592    0    0 14848 29048  926 22828  2 94  0  4
>  2  1    196  86960   7324 2668268    0    0 33280 12444  539   772  1 66  0 33
>  1  0    196  85100   7352 2669812    0    0     8 25412  721 22495  3 97  0  0