[OpenAFS] fileserver goes down overnight

Harald Barth haba@kth.se
Tue, 24 Mar 2009 18:53:36 +0100 (CET)


In addition to what Russ said, the "fileserver" is in fact more than one process:

# cat BosConfig 
restarttime 16 0 0 0 0
checkbintime 16 0 0 0 0
bnode fs fs 1
parm /usr/openafs/libexec/openafs/fileserver -nojumbo -p 128 -busyat 1200 -rxpck 800 -s 2400 -l 2400 -cb 200000 -b 480 -vc 2400
parm /usr/openafs/libexec/openafs/volserver
parm /usr/openafs/libexec/openafs/salvager -datelogs -parallel all8 -orphans attach
end

fileserver and volserver should be running all the time, salvager only during a salvage.

Check whether the fileserver process is responding (port 7000):

$ rxdebug your-file-server 7000 -rxstats
Trying 130.237.232.204 (port 7000):
Free packets: 2151, packet reclaims: 0, calls: 45072, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
123 threads are idle
rx stats: free packets 2151, allocs 12290974, alloc-failures(rcv 0/0,send 0/0,ack 0)
   greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects 0, sendSelects 0
   packets read: data 10321703 ack 1181163 busy 0 abort 0 ackall 0 challenge 609 response 442 debug 7521 params 0 unused 0 unused 0 unused 0 version 0 
   other read counters: data 10321703, ack 1181140, dup 2 spurious 16 dally 7
   packets sent: data 1753185 ack 5322134 busy 0 abort 6 ackall 0 challenge 442 response 609 debug 0 params 0 unused 0 unused 0 unused 0 version 0 
   other send counters: ack 5322134, data 3506370 (not resends), resends 2688, pushed 0, acked&ignored 4092400
        (these should be small) sendFailed 0, fatalErrors 0
   Average rtt is 0.001, with 818920 samples
   Minimum rtt is 0.000, maximum is 57.448
   50 server connections, 419 client connections, 47 peer structs, 57 call structs, 49 free call structs

Check whether the volserver process is responding (port 7005):

$ rxdebug your-file-server 7005 -rxstats
Trying 130.237.232.204 (port 7005):
Free packets: 159, packet reclaims: 0, calls: 38060, used FDs: 6
not waiting for packets.
0 calls waiting for a thread
11 threads are idle
rx stats: free packets 159, allocs 516008847, alloc-failures(rcv 0/0,send 0/0,ack 0)
   greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects 0, sendSelects 0
   packets read: data 451030767 ack 33304859 busy 0 abort 0 ackall 24 challenge 20 response 7525 debug 7531 params 0 unused 0 unused 0 unused 0 version 0 
   other read counters: data 451030767, ack 33304443, dup 82 spurious 416 dally 0
   packets sent: data 57914458 ack 233459787 busy 0 abort 158 ackall 0 challenge 7525 response 20 debug 0 params 0 unused 0 unused 0 unused 0 version 0 
   other send counters: ack 233459787, data 115828916 (not resends), resends 120702, pushed 0, acked&ignored 341913542
        (these should be small) sendFailed 0, fatalErrors 0
   Average rtt is 0.002, with 27662795 samples
   Minimum rtt is 0.000, maximum is 39.394
   20 server connections, 0 client connections, 10 peer structs, 147 call structs, 131 free call structs
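
If you want to run these two checks from cron instead of by hand, something like the following might be a starting point. It is only a sketch: rx_thread_summary is a name I made up, and it assumes the rxdebug output has the "calls waiting for a thread" and "threads are idle" lines in the form shown above. It prints the two numbers and fails if either line is missing, which is what you would expect if the process does not answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenAFS): read rxdebug output on stdin
# and print "<calls waiting> <idle threads>". Returns non-zero if either
# summary line is missing, e.g. because the server did not answer.
rx_thread_summary() {
    awk '
        /calls waiting for a thread/ { waiting = $1 }
        /threads are idle/           { idle = $1 }
        END {
            if (waiting == "" || idle == "") exit 1
            print waiting, idle
        }
    '
}

# Usage sketch; your-file-server is a placeholder, 7000 is the fileserver
# port and 7005 the volserver port, as in the examples above:
#
# for port in 7000 7005; do
#     if out=$(rxdebug your-file-server "$port" -rxstats | rx_thread_summary); then
#         echo "port $port: waiting/idle = $out"
#     else
#         echo "port $port: no usable answer"
#     fi
# done
```

As a rule of thumb (my reading, not something the tool tells you), 0 waiting calls with plenty of idle threads is what you want to see. A growing number of waiting calls with no idle threads would suggest the process is alive but stuck or overloaded.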

Harald.