[OpenAFS-devel] "Lost contact with file server" problems

Harald Barth haba@pdc.kth.se
Wed, 07 Sep 2005 11:26:54 +0200 (MEST)


The patch that sets the calls to the error state may not be the
complete road to happiness. This happened _with_ the patch that puts
the calls in error when the connection is in error, but I get
"Connection timed out" anyway :-( And no, I don't know how
to trigger this behaviour "on demand".

habarber:~$ /usr/openafs/bin/fs checkv
All volumeID/name mappings checked.

habarber:~$ /usr/openafs/bin/fs checks -c pdc.kth.se
All servers are running.

BUT

habarber:~$ ls /afs
ls: /afs: Connection timed out

habarber:~$ /usr/openafs/sbin/rxdebug localhost 7001         
Trying 127.0.0.1 (port 7001):
Free packets: 130, packet reclaims: 0, calls: 105, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fdf4, error 19270408
  serial 24,  natMTU 1444, security index 2, client conn
  rxkad: level clear
  Received 0 bytes in 0 packets
  Sent 4 bytes in 1 packets
    call 0: # 9, state active, mode: error
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 194.132.192.14, port 7000, Cuid 932145ec/d475fe00, error 19270408
  serial 25,  natMTU 1444, security index 2, client conn
  rxkad: level clear
  Received 0 bytes in 0 packets
  Sent 4 bytes in 1 packets
    call 0: # 9, state active, mode: error
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Done.
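
For anyone who wants to watch for this state automatically, here is a minimal
sketch that picks the connections carrying an error code out of rxdebug
output. It assumes the "Connection from host ..." line format shown above;
the function name connections_in_error and the SAMPLE text are made up for
illustration and are not part of OpenAFS.

```python
import re

# Matches the "Connection from host ..." header lines as they appear in the
# rxdebug output above. The ", error N" suffix is optional.
CONN_RE = re.compile(
    r"^Connection from host (?P<host>[^,\s]+), port (?P<port>\d+), "
    r"Cuid (?P<cuid>[0-9a-f]+/[0-9a-f]+)(?:, error (?P<error>-?\d+))?"
)


def connections_in_error(rxdebug_text):
    """Return (host, cuid, error) for every connection line with an error."""
    result = []
    for line in rxdebug_text.splitlines():
        m = CONN_RE.match(line)
        if m and m.group("error") is not None:
            result.append((m.group("host"), m.group("cuid"), int(m.group("error"))))
    return result


# Hypothetical sample, cut down from the output quoted in this mail.
SAMPLE = """\
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fdf4, error 19270408
  serial 24,  natMTU 1444, security index 2, client conn
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fdd8
  serial 109,  natMTU 1444, flags pktCksum, security index 2, client conn
"""

if __name__ == "__main__":
    for host, cuid, err in connections_in_error(SAMPLE):
        print(host, cuid, err)
```

In practice the text would come from running rxdebug against port 7001 as
above and feeding its stdout to the function.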


Later, when I look at all connections to 130.237.232.194, I have many,
but NONE of them seems to work.

habarber:~$ /usr/openafs/sbin/rxdebug localhost 7001 -all -long -onlyhost 130.237.232.194
Trying 127.0.0.1 (port 7001):
Free packets: 130, packet reclaims: 0, calls: 110, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
Showing only connections from host 130.237.232.194
Connection from host 130.237.232.194, port 7000, Cuid c31cae6c/c7c417c
  serial 0,  natMTU 1444, security index 0, client conn
    call 0: # 0, state not initialized
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 130.237.232.194, port 7000, Cuid bd0d89a5/c5ee998
  serial 6,  natMTU 1444, security index 0, server conn
    call 0: # 6, state dally, mode: eof, flags: receive_done
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fdd8
  serial 109,  natMTU 1444, flags pktCksum, security index 2, client conn
  rxkad: level clear, flags pktCksum
  Received 144 bytes in 18 packets
  Sent 144 bytes in 36 packets
    call 0: # 37, state not initialized
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fdf4, error 19270408
  serial 40,  natMTU 1444, security index 2, client conn
  rxkad: level clear
  Received 0 bytes in 0 packets
  Sent 4 bytes in 1 packets
    call 0: # 13, state not initialized
    call 1: # 2, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 130.237.232.194, port 7000, Cuid 932145ec/d475fe10
  serial 3,  natMTU 1444, flags pktCksum, security index 2, client conn
  rxkad: level clear, flags pktCksum
  Received 120 bytes in 1 packets
  Sent 16 bytes in 1 packets
    call 0: # 2, state not initialized
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Done.

And from cmdebug:

** Cache entry @ 0xc415fdc0 for 20.536870913.1.1 [pdc.kth.se]
    8192 bytes  DV 1331 refcnt 2
    callback 00000000   expires 1126078550
    0 opens     0 writers
    volume root
    states (0x4), read-only

But the entry for my /afs expired 50 minutes ago. So what in rx is
keeping me from getting a fresh, working connection/call?
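
To check the expiry claim, the "expires" value from the cmdebug output can be
converted to wall-clock time. This is a small sketch that assumes the field
is seconds since the Unix epoch; the helper names are made up for
illustration.

```python
from datetime import datetime, timedelta, timezone

# "expires" field from the cmdebug cache entry above.
# Assumption: it is seconds since the Unix epoch.
EXPIRES = 1126078550


def expiry_utc(epoch_seconds):
    """Convert an epoch timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def seconds_since_expiry(epoch_seconds, now):
    """Seconds elapsed between the expiry time and 'now' (negative if not yet expired)."""
    return int((now - expiry_utc(epoch_seconds)).total_seconds())


if __name__ == "__main__":
    when = expiry_utc(EXPIRES)
    print(when.isoformat())  # 2005-09-07T07:35:50+00:00
    # MEST is UTC+2, so this is 09:35:50 local time.
    print(when.astimezone(timezone(timedelta(hours=2))).isoformat())
```

Under that assumption the entry expired at 09:35:50 MEST on 7 Sep 2005.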

Harald.