[OpenAFS-devel] Adding an "estimate" function to vlserver

Wed, 2 Apr 2003 12:55:38 -0500 (EST)

Greetings,

We are looking to add a new feature to the volserver but would like
to get some design feedback before final coding and submitting a patch.

This is related to our work building an interface between AFS and the
Amanda backup system.  Part of the Amanda process is to first run an
"estimate" phase before actually running the real backup.  The results
of the estimate phase are then used in a planning phase to decide what
dump levels to run on each filesystem being backed up.

AFS has vos dump, which is useful for doing the actual volume dumps,
but there is no easy way to get an accurate estimate of the dump size,
short of running the dump into /dev/null and counting how many bytes
went by.  We could actually do it this way, but it's time-consuming
and wasteful of resources.  We could also just take the size from vos
examine, but some quick experiments with this showed inaccuracies as
large as 4%, which we feel is unacceptable.
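
To make the brute-force option concrete, here is a minimal sketch in
Python of counting the bytes of a dump stream without storing them,
plus the relative-error arithmetic behind a figure like 4%.  The
function names are hypothetical and the stream below is a stand-in,
not real dump data.

```python
import io

def count_stream_bytes(stream, chunk_size=1 << 20):
    """Read a binary stream to the end and return its length in bytes.

    The data is discarded chunk by chunk, so memory use stays bounded,
    but the full stream still has to be produced and read.
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total

def relative_error(estimate, actual):
    """Fractional error of an estimated size against the measured size."""
    return abs(estimate - actual) / actual

# Stand-in for a dump stream (assumption: a real run would read the
# dump from a pipe instead of an in-memory buffer).
fake_dump = io.BytesIO(b"\0" * 2500)
measured = count_stream_bytes(fake_dump, chunk_size=1024)
```

In real use the stream would be something like sys.stdin.buffer with
the output of vos dump piped in, assuming vos dump writes to standard
output when no -file is given.  The cost is the point: the fileserver
still generates and ships the entire dump just to learn its length.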

In order to get a first-order project designed, coded, and functioning,
we initially wrote a standalone utility that produces an accurate
estimate by parsing the volume format and adding up the appropriate
values.  This utility is very fast and produces an exact result, but
it has a couple of drawbacks.  First, it has to run as root
on the fileserver in order to access the on-disk volume, and second,
it depends on knowing the volume format.  If the format were ever to
change, our utility would produce erroneous results if it were not kept
current with the volume format.

For these reasons we feel the best approach is to add an "estimate"
function to the volserver, and to its vos interface.  This would allow
estimates to be obtained on any client (with appropriate user
credentials), would eliminate the separate root process on the
fileserver, and would put the estimating code in the AFS source tree
where it can most easily be kept in sync with any future volume format
changes.

Moving to implementation details, our first thought was to call this
new function "vos dump -estimate" and we've gone ahead and coded up
a working implementation of this, but in hindsight we think perhaps
this is not the best choice and we should make it "vos estimate"
instead.  The answer to "who really cares which way you do it?" turns
out to lie in the coding details.

Adding a -estimate flag to vos dump required adding an extra parameter
to the existing vos dump RPC call, which breaks backward compatibility
with existing volservers and vos command suites.  This seems like a bad
thing to do when it can be easily avoided.

Adding a new vos command "vos estimate" leaves vos dump as is so it
remains backward compatible.  Instead it requires adding a new RPC
opcode.  We are currently inclined to proceed in this direction, but
would like to pause for a moment and submit some questions and gather
feedback before going any further with this.
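
The compatibility difference between the two options can be sketched
with a toy dispatcher.  This is only a model in Python: it is not the
Rx wire protocol, and the opcode numbers, handler names, and the
placeholder estimate value are all hypothetical.

```python
# Toy model: hypothetical opcode numbers, not real volserver values.
OP_DUMP = 1       # stands in for the existing dump call
OP_ESTIMATE = 2   # stands in for a newly assigned opcode

class UnknownOpcode(Exception):
    """Raised when a server has no handler for the requested opcode."""

def make_server(handlers):
    """Build a dispatch function from an opcode -> handler table."""
    def dispatch(opcode, *args):
        handler = handlers.get(opcode)
        if handler is None:
            raise UnknownOpcode(opcode)
        return handler(*args)
    return dispatch

def dump(volume_id, from_time):
    # Existing call with its original two-argument signature.
    return f"dump of {volume_id} since {from_time}"

def estimate(volume_id, from_time):
    # Placeholder result; a real server would compute the dump size.
    return 12345

old_server = make_server({OP_DUMP: dump})
new_server = make_server({OP_DUMP: dump, OP_ESTIMATE: estimate})

def try_call(server, opcode, *args):
    """Return ("ok", result) or ("error", exception name)."""
    try:
        return ("ok", server(opcode, *args))
    except (UnknownOpcode, TypeError) as exc:
        return ("error", type(exc).__name__)

# Option A: extra parameter on the existing call.  An old server's
# handler no longer matches what a new client sends.
option_a = try_call(old_server, OP_DUMP, 1, 0, True)

# Option B: new opcode.  The existing call is untouched, and an old
# server rejects only the new request, in a way the client can report.
option_b_old_call = try_call(new_server, OP_DUMP, 1, 0)
option_b_new_call = try_call(old_server, OP_ESTIMATE, 1, 0)
```

In this model, option A fails on the existing dump call itself when
client and server versions differ, while option B confines any failure
to the new request and leaves every existing dump call working.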

Questions:

1. Are there any objections to adding this new feature to volserver?

2. If not, is it agreeable that the "vos estimate" approach is the
preferred one to proceed with?

3. If so, is there any preference for which RPC opcode to use next,
or should we just take the next free one and go with it?

Thank you for your time if you've read this far.  Apologies for being
a bit long-winded but it seemed worth describing the details of the
design process in order to inform the ensuing discussion.

-Mitch