[OpenAFS] disk cache read error in CacheItems
Martin Flemming
martin.flemming@desy.de
Fri, 26 Oct 2018 14:00:16 +0200 (CEST)
Hi, and thanks for the response!
In the last few days we've had the identical situation with these error messages ...
Sometimes all machines started logging them at the same time ...
Network traffic is not extremely high ...
The filesystem of the AFS cache is ext4 and its size is 8 GB.
Options are: /usr/vice/etc/afsd -afsdb -dynroot -fakestat
The cacheinfo file /usr/vice/etc/cacheinfo contains: /afs:/var/cache/afs:5552000
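
For reference, the three colon-separated cacheinfo fields are the AFS mount point, the cache directory, and the cache size in 1K blocks. A minimal Python sketch that splits the line above and converts the size; the helper name parse_cacheinfo is made up for illustration and is not part of OpenAFS:

```python
from typing import NamedTuple


class CacheInfo(NamedTuple):
    mount_point: str
    cache_dir: str
    blocks_1k: int  # cache size in 1 KiB blocks


def parse_cacheinfo(line: str) -> CacheInfo:
    """Split one cacheinfo line into its three colon-separated fields."""
    mount_point, cache_dir, blocks = line.strip().split(":")
    return CacheInfo(mount_point, cache_dir, int(blocks))


# The line from /usr/vice/etc/cacheinfo quoted above.
info = parse_cacheinfo("/afs:/var/cache/afs:5552000")

# Convert 1 KiB blocks to GiB to compare against the partition size.
cache_gib = info.blocks_1k * 1024 / 2**30
print(f"{info.cache_dir}: {info.blocks_1k} blocks, about {cache_gib:.2f} GiB")
```

So the configured cache is roughly 5.3 GiB on the 8 GB partition.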
[root@bird070 ~]# fs getcacheparms -excessive
AFS using 88% of cache blocks (4908415 of 5552000 1k blocks)
29% of the cache files (49470 of 173500 files)
afs_cacheFiles: 173500
IFFree: 124030
IFEverUsed: 9551
IFDataMod: 3
IFDirtyPages: 0
IFAnyPages: 0
IFDiscarded: 0
DCentries: 9997
0k- 4K: 267
4k- 16k: 229
16k- 64k: 9061
64k- 256k: 212
256k- 1M: 10
>=1M: 218
[root@bird070 ~]# df -i|grep cache |grep afs
/dev/sda3 512064 173599 338465 34% /var/cache/afs
[root@bird070 ~]# df -h|grep cache |grep afs
/dev/sda3 7.6G 4.7G 2.5G 66% /var/cache/afs
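
As a rough cross-check of the bird070 numbers above, the following Python sketch parses the relevant lines of the "fs getcacheparms -excessive" output and verifies that used files plus IFFree add up to afs_cacheFiles. The regular expressions are only an assumption based on the output pasted here, not a documented format:

```python
import re

# Relevant lines copied from the bird070 output above.
report = """\
AFS using 88% of cache blocks (4908415 of 5552000 1k blocks)
29% of the cache files (49470 of 173500 files)
afs_cacheFiles: 173500
IFFree: 124030
"""

blocks_used, blocks_total = map(
    int, re.search(r"\((\d+) of (\d+) 1k blocks\)", report).groups()
)
files_used, files_total = map(
    int, re.search(r"\((\d+) of (\d+) files\)", report).groups()
)
if_free = int(re.search(r"^IFFree:\s*(\d+)", report, re.M).group(1))

block_pct = 100 * blocks_used / blocks_total
file_pct = 100 * files_used / files_total

# Used cache files and free cache files should account for all slots.
files_consistent = files_used + if_free == files_total

print(f"blocks: {block_pct:.1f}%  files: {file_pct:.1f}%  "
      f"consistent: {files_consistent}")
```

The counters are consistent with each other, and together with the df output (2.5G available, 34% of inodes used) they suggest that the cache partition is not out of space or inodes on this node.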
[root@bird058 ~]# fs getcacheparms -excessive
AFS using 86% of cache blocks (4768364 of 5552000 1k blocks)
25% of the cache files (43806 of 173500 files)
afs_cacheFiles: 173500
IFFree: 129694
IFEverUsed: 9929
IFDataMod: 2
IFDirtyPages: 0
IFAnyPages: 0
IFDiscarded: 0
DCentries: 9998
0k- 4K: 5074
4k- 16k: 1639
16k- 64k: 1728
64k- 256k: 440
256k- 1M: 115
>=1M: 1002
[root@bird652 ~]# fs getcacheparms -excessive
AFS using 89% of cache blocks (4917473 of 5552000 1k blocks)
34% of the cache files (58678 of 173500 files)
afs_cacheFiles: 173500
IFFree: 114822
IFEverUsed: 9913
IFDataMod: 0
IFDirtyPages: 0
IFAnyPages: 0
IFDiscarded: 0
DCentries: 9999
0k- 4K: 2372
4k- 16k: 4863
16k- 64k: 2047
64k- 256k: 154
256k- 1M: 78
>=1M: 485
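
One more observation on the log lines quoted further down: the offsets fit a fixed layout of the CacheItems file. The Python sketch below checks this. The 20-byte header and 80-byte entry size are inferred purely from the logged numbers (and from reading the "/80" in "code -5/80" as the expected read length); they are assumptions, not values taken from the OpenAFS source:

```python
import errno

HEADER_BYTES = 20  # assumption: inferred from the logged offsets
ENTRY_BYTES = 80   # assumption: matches the "/80" in "code -5/80"


def slot_offset(slot: int) -> int:
    """Byte offset of a CacheItems slot under the assumed layout."""
    return HEADER_BYTES + slot * ENTRY_BYTES


# (slot, offset) pairs from the kernel messages quoted below.
logged = {25254: 2020340, 25253: 2020260, 25252: 2020180, 25251: 2020100}
all_match = all(slot_offset(slot) == off for slot, off in logged.items())

# With afs_cacheFiles = 173500 slots the file should end at this offset,
# which is the second number in "off 2020340/13880020".
expected_size = slot_offset(173500)

print(all_match, expected_size, errno.errorcode[errno.EIO])
```

Under these assumptions the logged slots and the total size 13880020 are exactly what 173500 cache files would give, so the offsets themselves look sane and the -5 (EIO) comes from the read itself.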
thanks & cheers,
martin
On Tue, 23 Oct 2018, Benjamin Kaduk wrote:
> On Tue, Oct 23, 2018 at 02:14:38PM +0200, Stephan Wiesand wrote:
>>
>>> On 23. Oct 2018, at 12:16, Andreas Ladanyi <andreas.ladanyi@kit.edu> wrote:
>>>
>>>> In the last few days we've observed an increasing number of nodes
>>>> which can no longer be reached and have to be rebooted.
>>>>
>>>> In the /var/log/messages we see a lot of lines with e.g.
>>>>
>>>> Oct 22 18:48:26 bird858 kernel: afs: disk cache read error in
>>>> CacheItems slot 25254 off 2020340/13880020 code -5/80
>>>> Oct 22 18:48:26 bird858 kernel: afs: disk cache read error in
>>>> CacheItems slot 25253 off 2020260/13880020 code -5/80
>>>> Oct 22 18:48:26 bird858 kernel: afs: disk cache read error in
>>>> CacheItems slot 25252 off 2020180/13880020 code -5/80
>>>> Oct 22 18:48:26 bird858 kernel: afs: disk cache read error in
>>>> CacheItems slot 25251 off 2020100/13880020 code -5/80
>>>>
>>>> till nothing happens anymore ...
>>>>
>>>> The clients are CentOS 7.5, 3.10.0-862.14.4.el7.x86_64, OpenAFS
>>>> 1.6.23 built 2018-09-12 (289.sl7.862.11.6@fnal.gov)
>>>>
>>>> Any hints for the possible reason ?
>>>
>>> I have the same constellation with AFS 1.6.23 client from jsbilling repo.
>>>
>>> I can't see these messages in /var/log/messages yet.
>>
>> We're running the same kernel version and the same client build (it's the SL one) on a fair number of SL 7.4 systems, and don't see these issues either.
>>
>> -5 is EIO, meaning an actual I/O error is reported.
>>
>> What's the size and type of the cache filesystems? What does "fs getcache" report? What are the afsd parameters? Could these nodes be out of space or inodes for the cache?
>
> It's also possible that the actual disk is having trouble, and/or got
> remounted RO. dmesg and/or syslog might have some clues.
>
> (Interestingly enough, we had some changes go by recently on master to make
> the error handling for certain cases in this same class more graceful (i.e.,
> fail requests but not panic), though those changes are not in 1.6.23.)
>
> -Ben
Regards
Martin Flemming
______________________________________________________
Martin Flemming
DESY / IT office : Building 2b / 008a
Notkestr. 85 phone : 040 - 8998 - 4667
22603 Hamburg mail : martin.flemming@desy.de
______________________________________________________