[OpenAFS] Solaris AFS client down - why does this happen
Karl Behler
karl.behler@ipp.mpg.de
Mon, 19 Oct 2015 16:05:16 +0200
Dear All,
we are experiencing unwanted "shutdown" events on our OpenAFS 1.6.9
clients under Solaris 10.
We have been running this client since October last year without
problems on ten Solaris desktop servers, which reboot regularly on
weekends. Recently, nearly half of these servers suffered crash-like
failures in the middle of a week.
The log file (/var/adm/messages) contains kernel messages that look
like a shutdown initiated by afsd itself.
(In the following log, the actual event starts at Oct 16 11:54:47.)
Oct 16 11:35:39 sxaug37 genunix: [ID 900631 kern.notice] afs: byte-range lock/unlock ignored; make sure no one else is running this program (pid 23006 (thunderbird-bin), user 13471, fid 1108706165.12934.344145).
Oct 16 11:39:23 sxaug37 genunix: [ID 900631 kern.notice] afs: byte-range lock/unlock ignored; make sure no one else is running this program (pid 22054 (firefox-bin), user 6570, fid 1108604831.175334.13229850).
Oct 16 11:49:23 sxaug37 last message repeated 1 time
Oct 16 11:54:47 sxaug37 genunix: [ID 146023 kern.notice] afs: WARM
Oct 16 11:54:47 sxaug37 genunix: [ID 510892 kern.notice] shutting down of: vcaches...
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x28e2f840
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x2924b960
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x28114c00
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x27d49000
... several hundred similar messages
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x2811dbc0
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x28a53c60
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x27e10460
Oct 16 11:54:47 sxaug37 genunix: [ID 159345 kern.notice] Failed to flush vcache 0x289fad40
Oct 16 11:54:47 sxaug37 genunix: [ID 364168 kern.notice] BkG...
Oct 16 11:54:47 sxaug37 genunix: [ID 338304 kern.notice] CB...
Oct 16 11:54:47 sxaug37 genunix: [ID 543876 kern.notice] afs...
Oct 16 11:54:47 sxaug37 genunix: [ID 229921 kern.notice] CTrunc...
Oct 16 11:54:47 sxaug37 genunix: [ID 916331 kern.notice] AFSDB...
Oct 16 11:54:47 sxaug37 genunix: [ID 196290 kern.notice] RxEvent...
Oct 16 11:54:48 sxaug37 genunix: [ID 687192 kern.notice] UnmaskRxkSignals...
Oct 16 11:54:48 sxaug37 genunix: [ID 346748 kern.notice] RxListener...
Oct 16 11:54:48 sxaug37 genunix: [ID 890369 kern.notice] NetIfPoller...
Oct 16 11:54:48 sxaug37 genunix: [ID 288918 kern.notice] WARNING: not all blocks freed: large 0 small 217
Oct 16 11:54:48 sxaug37 genunix: [ID 646860 kern.notice] ALL allocated tables...
Oct 16 11:54:48 sxaug37 genunix: [ID 773001 kern.notice] done
Oct 16 11:58:24 sxaug37 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_150401-28 64-bit
Oct 16 11:58:24 sxaug37 genunix: [ID 282658 kern.notice] Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
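To compare the affected hosts, a log like the one above could be condensed with a small script. The following is only a rough sketch in Python: it assumes that every such event begins with the "afs: WARM" line seen in this log and that the "Failed to flush vcache" entries following it belong to that event. The function names are made up for illustration and are not part of OpenAFS. It also tolerates entries that were wrapped across several lines, as can happen when a log is pasted into a mail.

```python
import re

# Start of a syslog entry, e.g. "Oct 16 11:54:47 sxaug37 genunix: ..."
ENTRY = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) (?P<host>\S+) (?P<msg>.*)$"
)


def join_wrapped(lines):
    """Rejoin log entries whose text continues on following lines."""
    entries = []
    for line in lines:
        line = line.rstrip("\r\n")
        if ENTRY.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation of the previous entry (no timestamp at the start).
            entries[-1] += " " + line.strip()
    return entries


def summarize_afs_shutdowns(lines):
    """Return (timestamp, host, failed_flush_count) for each shutdown event.

    Assumption: an event starts with an entry containing "afs: WARM"; the
    "Failed to flush vcache" entries after it are counted for that event.
    """
    events = []
    for entry in join_wrapped(lines):
        m = ENTRY.match(entry)
        if m is None:
            continue
        if "afs: WARM" in m.group("msg"):
            events.append([m.group("stamp"), m.group("host"), 0])
        elif "Failed to flush vcache" in m.group("msg") and events:
            events[-1][2] += 1
    return [tuple(e) for e in events]


# Possible use on a Solaris host (path as named in this mail):
#   with open("/var/adm/messages", errors="replace") as f:
#       for stamp, host, count in summarize_afs_shutdowns(f):
#           print(stamp, host, count)
```

Run on each of the ten machines, this would give one line per shutdown event with its time and the number of vcaches that could not be flushed.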
Sometimes the system reboots immediately, and sometimes it stays in a
state where every attempt to access AFS ends with an I/O error.
Any idea what is happening and what to do about it?
Best regards,
Karl
--
Dr. Karl Behler
CODAC & IT services ASDEX Upgrade
phone +49 89 3299-1351 fax 3299-961351