[OpenAFS] Client connection failure: bos failed to contact host's bosserver (communication failure (-1))

Ximeng (Simon) Guan xmgu@royole.com
Mon, 7 Jan 2019 20:00:27 +0000


We do have NetInfo properly set up to include the one IP address that is in use.

Could the connection failure somehow come from the non-default MTU
settings we are using? That has bitten us repeatedly in the past in
different places. We use "-rxmaxmtu 1344" across the board for all
ptserver, vlserver, davolserver and dafileserver instances. I was told by
the network folks that they cannot support the default MTU of 1500 and
have to use 1400 because of the IPSec requirement...
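
For anyone checking the MTU theory on the network side, the size arithmetic can be sketched as below. This is only an illustration: it assumes plain IPv4 with no IP options (20-byte IP header, 8-byte UDP or ICMP echo header), and the function names are made up for this example. How Rx itself accounts for headers under -rxmaxmtu is not covered here.

```python
# Header sizes for plain IPv4 without options (an assumption of this sketch).
IPV4_HEADER = 20
UDP_HEADER = 8
ICMP_ECHO_HEADER = 8


def max_udp_payload(path_mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return path_mtu - IPV4_HEADER - UDP_HEADER


def ping_payload_for(packet_size: int) -> int:
    """ICMP echo payload that yields an IPv4 packet of packet_size bytes."""
    return packet_size - IPV4_HEADER - ICMP_ECHO_HEADER


if __name__ == "__main__":
    # Sizes mentioned in this thread plus the Ethernet default.
    for size in (1344, 1400, 1500):
        print(size, max_udp_payload(size), ping_payload_for(size))
```

On a Linux client, a ping with the don't-fragment bit set and the computed payload (for example "ping -M do -s 1372 afssrv1" for a 1400-byte packet) would show whether packets of that size actually get through from a laptop.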

Thank you!
Simon

-----Original Message-----
From: openafs-info-admin@openafs.org <openafs-info-admin@openafs.org> On Behalf Of Benjamin Kaduk
Sent: Monday, January 7, 2019 11:44 AM
To: Ximeng (Simon) Guan <xmgu@royole.com>
Cc: OpenAFS-info@openafs.org
Subject: Re: [OpenAFS] Client connection failure: bos failed to contact host's bosserver (communication failure (-1))

On Mon, Jan 07, 2019 at 07:40:36PM +0000, Ximeng (Simon) Guan wrote:
> Hello,
>
> After a power outage on Christmas Eve which forced two database servers
> and all the network switches in one of our offices to reboot, our laptop
> clients in that office can no longer connect to one of the AFS servers
> hosted in the same office.
>
> I am leaning towards the possibility that it is a network problem instead
> of an OpenAFS service problem because:
>
>   1.  Remote offices can access the full AFS space, including those
>       volumes hosted on the rebooted servers.
>   2.  Between the servers there is no access problem. Nothing is wrong
>       with the result of "bos status", "rxdebug" or "udebug".
>       "fs checkservers" shows that all servers are running.
>   3.  On the problematic laptops "fs checkservers" shows that "All servers
>       are running".
>   4.  On the problematic laptops "bos status afssrv1" returns a message:
>
> "bos: failed to contact host's bosserver (communications failure (-1))."
>
> But on the servers both in that office and in the remote offices, the
> same command shows that all services are up:
>
> "Instance ptserver, currently running normally.
> Instance vlserver, currently running normally.
> Instance buserver, currently running normally.
> Instance upserver, currently running normally.
> Instance backupusers, currently running normally.
>     Auxiliary status is: run next at Tue Jan  8 04:00:00 2019.
> Instance dafs, currently running normally.
>     Auxiliary status is: file server running."
>
>   5.  On the problematic laptops "rxdebug afssrv1 -port 7000" returns
>       *normal* output, for example:
>
> "Trying 10.12.8.33 (port 7000):
> Free packets: 2073/6357, packet reclaims: 3, calls: 81, used FDs: 36
> not waiting for packets.
> 0 calls waiting for a thread
> 125 threads are idle
> 1 calls have waited for a thread
> Connection from host 10.9.119.50, port 7001, Cuid ae06e5b3/70fe0104
>   serial 12,  natMTU 1344, security index 0, client conn
>     call 0: # 4, state dally, mode: receiving, flags: receive_done
>     call 1: # 0, state not initialized
>     call 2: # 0, state not initialized
>     call 3: # 0, state not initialized
> Connection from host 10.12.4.74, port 7001, Cuid ae06e5b3/70fe0114
>   serial 21,  natMTU 1344, security index 0, client conn
>     call 0: # 7, state dally, mode: receiving, flags: receive_done
>     call 1: # 0, state not initialized
>     call 2: # 0, state not initialized
>     call 3: # 0, state not initialized
> Done."
>
> I do not administer the network. Can I have some advice on how to
> further debug the connection problem? Which UDP port does the command
> "bos status" use?
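
On the port question: the bosserver conventionally listens on UDP port 7007, which is not the port 7000 (fileserver) that the rxdebug test in the quoted message probed. A firewall or ACL change after the reboot could therefore affect one port and not the other. The usual assignments can be written down as a small lookup table, sketched below. The dictionary and helper names are illustrative only, and the port numbers are the conventional afs3-* assignments, which should be checked against the local /etc/services.

```python
from typing import List

# Conventional UDP ports for AFS services (the usual afs3-* assignments).
AFS_UDP_PORTS = {
    "fileserver": 7000,
    "cachemanager-callback": 7001,
    "ptserver": 7002,
    "vlserver": 7003,
    "volserver": 7005,
    "bosserver": 7007,
    "upserver": 7008,
    "buserver": 7021,
}


def rxdebug_command(host: str, service: str) -> List[str]:
    """Build an rxdebug invocation asking the given service for its version."""
    port = AFS_UDP_PORTS[service]
    return ["rxdebug", "-servers", host, "-port", str(port), "-version"]


if __name__ == "__main__":
    # Print the command to run from an affected laptop.
    print(" ".join(rxdebug_command("afssrv1", "bosserver")))
```

If the generated command times out from a laptop while the same probe on port 7000 answers, that would point at filtering of UDP 7007 on the path from the client network, not at the bosserver itself.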

My instinct would be that there is some multihoming going on and that
http://docs.openafs.org/Reference/5/NetRestrict.html and/or
http://docs.openafs.org/Reference/5/NetInfo.html are not properly
configured.

-Ben