[OpenAFS] Any xstat tools?

Mark Vitale mvitale@sinenomine.net
Mon, 29 Mar 2021 15:08:32 +0000


Chaskiel,

> On Mar 29, 2021, at 9:11 AM, Chaskiel Grundman <cgrundman@gmail.com> wrote:
>
> Does anyone have tools for capturing, aggregating, analyzing or displaying
> info from xstats? Not so much operation timing, though that may also be
> helpful.

There are some patches under review on gerrit to make OpenAFS metrics more
machine-readable:

https://gerrit.openafs.org/#/c/14359/   xstat: Add the xstat_fs_test -format option
https://gerrit.openafs.org/#/c/14358/   rxdebug: Add rxdebug -raw option

I am using these patches to collect OpenAFS cell metrics into collectd and
display charts with graphite.
collectd works fine for the scale I need, but it does seem to drop an
observation once in a while.
It can handle aggregation automatically if you configure it to do that.
graphite works fine too, but it is a little long in the tooth; if I were
doing this today I would try grafana or some other alternatives.

Other sites have had success using Splunk to display their charts.
I've used Splunk to troubleshoot OpenAFS performance problems and it was
extremely useful.  But I have not set it up myself, so I can't give any
guidance on how difficult it is to set up compared to graphite or grafana.

I am currently pulling my stats remotely by running all my xstats and rxdebug
scripts from a central collector machine, but this is not ideal.
I would recommend that if you are doing this at any kind of scale,
you should try to collect the stats locally (e.g. xstat_fs_test localhost)
on each OpenAFS machine - either via cron job or a bos bnode script - and
then push the stats to your collector.
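
A minimal sketch of the "collect locally, then push" pattern, suitable for
running from cron on each fileserver. It assumes the collector accepts the
Graphite plaintext protocol ("<path> <value> <timestamp>", by default on
TCP port 2003). The collection ID 2, the metric prefix, and the parse()
helper are assumptions for illustration: turning xstat_fs_test output into
name/value pairs depends on your OpenAFS version and on whether the -format
patch above is applied, so that step is left to a site-specific function.

```python
import socket
import subprocess
import time


def xstat_once_command(fsname="localhost", coll_id=2):
    """Build a one-shot xstat_fs_test command line for a local fileserver.

    coll_id=2 is an assumed default; pick the collection you actually need.
    """
    return ["xstat_fs_test", "-fsname", fsname,
            "-collID", str(coll_id), "-onceonly"]


def graphite_lines(prefix, metrics, timestamp):
    """Format a dict of metrics as Graphite plaintext lines."""
    return [f"{prefix}.{name} {value} {int(timestamp)}"
            for name, value in sorted(metrics.items())]


def push(lines, host, port=2003):
    """Send the formatted lines to a Graphite-compatible plaintext listener."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload)


def collect_and_push(parse, collector_host, prefix):
    """Run xstat_fs_test once locally, parse its output, and push the result.

    parse() is a hypothetical, site-specific function that maps the command's
    stdout to a {metric_name: numeric_value} dict.
    """
    result = subprocess.run(xstat_once_command(), capture_output=True,
                            text=True, check=True, timeout=60)
    metrics = parse(result.stdout)
    push(graphite_lines(prefix, metrics, time.time()), collector_host)
```

Because each machine only talks to its own fileserver and makes one short
outbound connection to the collector, the central machine no longer has to
reach every server in the cell.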

Be aware that xstat_fs_test uses a new ephemeral port on each invocation;
this in turn results in a new peer on the fileserver.  It's not horrible
overhead, even if you are collecting every 60 seconds; but if you are
concerned about it, you can reduce this impact by allowing xstat_fs_test
to run continuously.
That is, instead of invoking it periodically with the -onceonly option,
invoke it once and specify a -frequency and a -period.  In this way it will
reuse the same ephemeral port over and over, thus creating only a single
peer on the fileserver.
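
A minimal sketch of the continuous mode described above: one long-running
xstat_fs_test process whose output is consumed line by line, rather than a
fresh process per sample. As documented for xstat_fs_test, -frequency is
given in seconds and -period in minutes; the specific values, the
collection ID, and the handle_line callback are assumptions for
illustration.

```python
import subprocess


def xstat_continuous_command(fsname, coll_id, frequency_s, period_min):
    """Build a long-running xstat_fs_test command line (no -onceonly)."""
    return ["xstat_fs_test", "-fsname", fsname, "-collID", str(coll_id),
            "-frequency", str(frequency_s), "-period", str(period_min)]


def stream_output(command, handle_line):
    """Run the command once and pass each line of its stdout to handle_line.

    The single process keeps its ephemeral port for its whole lifetime, so
    the fileserver sees one peer instead of one per sample.
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            handle_line(line.rstrip("\n"))
    return proc.returncode
```

For example, xstat_continuous_command("localhost", 2, 60, 1440) samples
every 60 seconds for 24 hours; a cron job or bos bnode can then restart the
process once per day instead of once per minute.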

The rx_peer issue is moot for the rxdebug command; although it also uses
a new ephemeral port for each invocation, the rxdebug packets are handled
by the rx stack directly, without requiring an rx_call, rx_connection, or
rx_peer.

Regards,
--
Mark Vitale
mvitale@sinenomine.net