[OpenAFS] fileserver goes down overnight

david l goodrich dlg@dsrw.org
Tue, 24 Mar 2009 12:58:11 -0500


On Tue, Mar 24, 2009 at 06:53:36PM +0100, Harald Barth wrote:
>
> In addition to what Russ said, the "fileserver" is in fact more than one
> process:
>
> # cat BosConfig
[snip]
>
> fileserver and volserver should be running all the time, salvager only
> during salvage.
>
I saw that from the output of bos status -long, but ps showed
volserver was running.

> Check if fileserver process is responding:
>
> $ rxdebug your-server 7000 -rxstats
[snip]

yes, it appears to be responding.

> Check if volserver process is responding:
>
> $ rxdebug your-file-server 7005 -rxstats
[snip]

volserver, not so much:
sprawl# rxdebug localhost 7005 -rxstats
Trying 127.0.0.1 (port 7005):
getstats call failed with code -1
sprawl#

But apparently not running very well <grin>
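
For anyone who wants to script this check, here is a minimal sketch. It assumes rxdebug is on the PATH and that a failed probe prints "call failed with code" the way it did above; the helper names rx_responding and probe are made up for illustration and are not part of OpenAFS.

```python
import subprocess

# Text that rxdebug printed above when volserver did not answer.
FAILURE_MARKER = "call failed with code"


def rx_responding(output: str) -> bool:
    """Return True if captured rxdebug output looks like a successful reply."""
    if FAILURE_MARKER in output:
        return False
    # A successful -rxstats reply contains an "rx stats:" section.
    return "rx stats:" in output


def probe(host: str, port: int, timeout: float = 30.0) -> bool:
    """Run `rxdebug host port -rxstats` and classify what it prints."""
    try:
        result = subprocess.run(
            ["rxdebug", host, str(port), "-rxstats"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # No rxdebug binary, or no answer in time: treat as not responding.
        return False
    return rx_responding(result.stdout + result.stderr)
```

With this, probe("localhost", 7000) would cover the fileserver and probe("localhost", 7005) the volserver.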

Thanks for the tips, Harald.

What's the consensus view on this?  Since there's nothing in
VolserLog, should I just bounce the volserver with debugging enabled
and wait for it to crap out again tonight?
  --david
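
If it helps to know when the volserver stops answering overnight, a rough sketch of a polling loop follows. The watch function and its parameters are hypothetical; the probe argument is any callable that returns True while the server responds (for example, a wrapper around the rxdebug check on port 7005).

```python
import time
from datetime import datetime
from typing import Callable, Optional


def watch(
    probe: Callable[[], bool],
    interval: float = 60.0,
    max_checks: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[datetime]:
    """Poll probe() until it first returns False.

    Returns the local time of the first failed check, or None if
    every one of the max_checks probes succeeded.
    """
    for _ in range(max_checks):
        if not probe():
            return datetime.now()
        # Still responding: wait before the next check.
        sleep(interval)
    return None
```

The returned timestamp can then be compared against cron jobs, backups, or log entries from around the same time.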


> Trying 130.237.232.204 (port 7005):
> Free packets: 159, packet reclaims: 0, calls: 38060, used FDs: 6
> not waiting for packets.
> 0 calls waiting for a thread
> 11 threads are idle
> rx stats: free packets 159, allocs 516008847, alloc-failures(rcv 0/0,send 0/0,ack 0)
>    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects 0, sendSelects 0
>    packets read: data 451030767 ack 33304859 busy 0 abort 0 ackall 24 challenge 20 response 7525 debug 7531 params 0 unused 0 unused 0 unused 0 version 0
>    other read counters: data 451030767, ack 33304443, dup 82 spurious 416 dally 0
>    packets sent: data 57914458 ack 233459787 busy 0 abort 158 ackall 0 challenge 7525 response 20 debug 0 params 0 unused 0 unused 0 unused 0 version 0
>    other send counters: ack 233459787, data 115828916 (not resends), resends 120702, pushed 0, acked&ignored 341913542
>         (these should be small) sendFailed 0, fatalErrors 0
>    Average rtt is 0.002, with 27662795 samples
>    Minimum rtt is 0.000, maximum is 39.394
>    20 server connections, 0 client connections, 10 peer structs, 147 call structs, 131 free call structs
>
> Harald.
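
For comparing a healthy reply like the one quoted above against later runs, a small sketch that pulls a few counters out of saved rxdebug output might look like this. The function name parse_rxdebug and the choice of fields are illustrative assumptions; the patterns are based only on the output format shown in this message.

```python
import re

# Patterns for a handful of fields as they appear in the output above.
PATTERNS = {
    "calls_waiting": re.compile(r"(\d+) calls waiting for a thread"),
    "idle_threads": re.compile(r"(\d+) threads are idle"),
    "data_sent": re.compile(r"packets sent: data (\d+)"),
    "resends": re.compile(r"resends (\d+)"),
}


def parse_rxdebug(text: str) -> dict:
    """Extract selected integer counters from rxdebug -rxstats output."""
    # Drop leading "> " quote markers so quoted mail text parses too.
    cleaned = "\n".join(
        re.sub(r"^(>\s?)+", "", line) for line in text.splitlines()
    )
    stats = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(cleaned)
        if match:
            stats[name] = int(match.group(1))
    return stats
```

Fields that do not appear in the text are simply left out of the result, so a failed probe yields an empty dict.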
