[AFS3-std] Re: Request for new fields in rx_statistics

Andrew Deason adeason@sinenomine.net
Missing Date


On Thu, 01 Oct 2009 01:32:49 +0200
Jeffrey Altman <jaltman@secure-endpoints.com> wrote:

> Michael Meffie wrote:
> > The number of spares is reduced to 2 integers by this change,
> > but since this is not an RX call, but rather the payload of an
> > rx debug packet, I believe additional integers could be added
> > up to the limit imposed on the packet size.
>
> There is no method of negotiating the size of the rx_debugStats
> structure used in the RX_DEBUGI_GETSTATS debug call.  As a result it
> is not possible to increase the length of this structure and ensure
> that a peer with the new structure can send a RX_DEBUGI_GETSTATS
> response to a peer that only knows the old structure.

If I recall correctly, the code in the OpenAFS codebase just calls
recvfrom (or something similar) with a length of 1500 to get the debug
information. I haven't tried extending the structure, but I don't see
why it wouldn't work as long as you stay under 1500 octets (and are
using UDP).

If the RX standard doesn't guarantee that to work, though, then fair
enough.
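
To make the idea concrete, here is a rough sketch of what I mean. It is
not the OpenAFS code: the field layout (four 32-bit counters, with a
fifth appended in the "new" version) and all of the names are made up
for illustration, and the rx packet header is left out entirely. It only
shows that a receiver which reads up to 1500 octets and decodes the
prefix it knows about is not bothered by extra trailing integers.

```python
import socket
import struct

MAX_DEBUG_PACKET = 1500  # receive buffer size mentioned above

# Hypothetical layouts, NOT the real rx_debugStats: four 32-bit counters
# in the old version, with one extra counter appended in the new version.
OLD_FORMAT = struct.Struct("!4I")
NEW_FORMAT = struct.Struct("!5I")


def parse_known_fields(payload):
    """Decode the fields an old peer knows about, ignoring trailing bytes."""
    if len(payload) < OLD_FORMAT.size:
        raise ValueError("payload shorter than the known structure")
    return OLD_FORMAT.unpack_from(payload)


def receive_stats(sock):
    """Read one datagram into a 1500-octet buffer and decode the prefix."""
    payload, _addr = sock.recvfrom(MAX_DEBUG_PACKET)
    return parse_known_fields(payload)


# Simulated replies from an old peer and from a peer with the longer struct.
old_reply = OLD_FORMAT.pack(1, 2, 3, 4)
new_reply = NEW_FORMAT.pack(1, 2, 3, 4, 99)
print(parse_known_fields(old_reply))
print(parse_known_fields(new_reply))
```

Both calls decode the same four values; the extra integer is simply
ignored. Whether a real implementation checks the length more strictly
than this is a separate question.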

> As for whether or not "the number of ack packets that have been
> delayed due to excessive aborts" should be sent or not, I wonder if
> that number is really a useful number from the perspective of
> debugging the server state.  From my perspective the question that one
> is trying to answer is whether or not a particular rx_connection or
> rx_peer is being throttled.  I don't think reporting the number of
> delays will help address that question.
> 
> To address the question I have posed, statistics would have to be
> collected per connection and reported in the rx_debugConn structure
> which currently has 9 spares.  The rx_debugPeer structure currently
> has 10 spares.

I would say getting a per-server statistic is certainly more useful than
the "nothing" that we have now. But yes, having this with more
granularity would be better if it's feasible. Hmm, per-conn or per-peer?
I'd think conns provide a bit more granularity, but per-peer is probably
sufficient for debugging throttling problems, and easier to wade
through.

I would still argue that a server-wide counter would be useful, though.
To just detect/graph whether throttling is occurring at all (or to what
degree), grabbing a single server stat would seem much easier than
enumerating all connections. And it would let you see that throttling
occurred even after the connection/peer went away.
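
As a toy illustration of that last point (hypothetical names, nothing
here is taken from the rx code), suppose the server keeps both a single
total and per-peer counts of delayed acks:

```python
from collections import Counter


class ThrottleStats:
    """Hypothetical bookkeeping for delayed acks; names are illustrative."""

    def __init__(self):
        self.server_total = 0      # server-wide counter
        self.per_peer = Counter()  # per-peer counters, keyed by address

    def record_delay(self, peer):
        """Count one delayed ack against both the server and the peer."""
        self.server_total += 1
        self.per_peer[peer] += 1

    def forget_peer(self, peer):
        """Drop a peer; its detail goes away, the server total does not."""
        self.per_peer.pop(peer, None)


stats = ThrottleStats()
stats.record_delay("192.0.2.1")
stats.record_delay("192.0.2.1")
stats.record_delay("192.0.2.2")
stats.forget_peer("192.0.2.1")
print(stats.server_total, dict(stats.per_peer))
```

After the first peer is dropped, the per-peer table only accounts for
one delay, but the server total still shows that three occurred, and
reading it is a single lookup rather than a walk over every peer.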

-- 
Andrew Deason
adeason@sinenomine.net