[OpenAFS] Best AFS performance/load/hotspot monitoring?

Alf Wachsmann alfw@slac.stanford.edu
Thu, 24 Aug 2006 13:19:49 -0700 (PDT)

On Thu, 24 Aug 2006, Jeff Blaine wrote:
> What are people using for AFS performance monitoring?

That depends on what you mean by "AFS performance".

You could write a big file into AFS space every 10 minutes
or so and measure how long it takes. Make a plot out of it.
Same for reading. We are not doing this.
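
A minimal sketch of such a probe, in Python. This is not something
we run; the function name, file size and chunk size are made up for
illustration. The write time includes fsync and close, since that is
when an AFS client normally pushes the data to the file server. The
read that follows will probably be served from the local AFS cache,
so for a real read measurement you would want to read from a
different client or flush the cache first.

```python
import os
import time


def time_write_read(path, size_bytes=64 * 1024 * 1024, chunk_bytes=1024 * 1024):
    """Write size_bytes to path, read it back, return (write_s, read_s).

    Hypothetical helper: the defaults are arbitrary, not recommendations.
    The probe file is removed afterwards.
    """
    chunk = b"\0" * chunk_bytes

    start = time.monotonic()
    with open(path, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(remaining, chunk_bytes)
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    # Measured after the file is closed, so the close is included.
    write_s = time.monotonic() - start

    start = time.monotonic()
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_bytes)
            if not data:
                break
            total += len(data)
    read_s = time.monotonic() - start

    os.remove(path)
    if total != size_bytes:
        raise IOError("read back %d bytes, expected %d" % (total, size_bytes))
    return write_s, read_s
```

Called from cron every 10 minutes with a path in AFS space (for
example a made-up /afs/your.cell/tmp/probe.dat), the two numbers can
be appended to a log and plotted.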

> Even if it's scout or afsmonitor, please respond.

These are good if you want to monitor AFS internal statistics.
If you have ever used the infamous "meltdown" script, you might
want to monitor some of the parameters it checks for.
This is what we do.

> I would like to hear how people are using these as well (what
> makes the most sense to monitor for general purposes, etc).

I have written a Perl module that provides an interface to all of these
functions ( http://www.slac.stanford.edu/~alfw/AFS-Monitor/ ). It makes
it very easy to add these metrics to Ganglia or Nagios.
We are using Nagios at SLAC because we want the alerting part.
Some of our Nagios scripts are on the above web site.
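
For illustration only, here is a rough Python sketch of how a single
measured value can be turned into a Nagios-style result. It is not
one of the scripts from the web site and does not use the Perl
module; the function name, the metric label and the thresholds are
assumptions. It relies only on the standard Nagios plugin
conventions: exit code 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN and
one line of output with optional performance data after the "|".

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
_LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_threshold(label, value, warn, crit, unit="s"):
    """Return (exit_code, output_line) for a metric where higher is worse.

    Hypothetical helper: label, warn and crit are chosen by the caller.
    """
    if value >= crit:
        code = CRITICAL
    elif value >= warn:
        code = WARNING
    else:
        code = OK
    # Human-readable text, then perfdata as label=value[unit];warn;crit
    line = "%s - %s=%.2f%s | %s=%.2f%s;%s;%s" % (
        _LABELS[code], label, value, unit, label, value, unit, warn, crit)
    return code, line


def run_check(label, value, warn, crit):
    """Print the status line and exit with the matching code."""
    code, line = check_threshold(label, value, warn, crit)
    print(line)
    sys.exit(code)
```

A wrapper script would obtain the value (a timing, or one of the
counters that scout or afsmonitor report) and then call something
like run_check("afs_write", seconds, 5, 10).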

-- Alf.

  Alf Wachsmann                       | e-mail: alfw@slac.stanford.edu
  SLAC - Scientific Computing         | Phone:  +1-650-926-4802
  2575 Sand Hill Road, M/S 97         | FAX:    +1-650-926-3329
  Menlo Park, CA 94025, USA           | Office: Bldg. 50/323
                http://www.slac.stanford.edu/~alfw (PGP)