[OpenAFS-devel] Java API for AFS Admin -- CORRECTION!

Ted Anderson ota@transarc.com
Fri, 8 Mar 2002 12:17:30 -0500 (EST)


I finally got around to working through the API.  Here are my comments
on AFSLore JavaAdminAPI[1] dated 14-Feb-2002.

Internal links for these attachments don't work.  I guess that is a
problem with Javadoc.

Is Cell.getGroupNames significantly faster than just getting the list of
group objects and asking each for its name?  And if so, is it useful
very often?  Ditto for Cell.getInfoGroups and other similar methods
throughout the API.  The getGroup and getGroupNames methods claim to
return a list of currently cached groups, whereas getInfoGroups returns
all groups in the cell.  What is the significance of this distinction?

In some cells there are a great many groups and users.  I think there
are cells with upwards of 50K users.  Maybe using an iterator would be a
better approach?
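
To make the iterator suggestion concrete, here is a minimal sketch. It
assumes a hypothetical NamePageSource interface standing in for whatever
call fetches names from the server; neither that interface nor
PagedNameIterator is part of the API under review. Only one page of
names is held in memory at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical source of names fetched one page at a time (an assumption). */
interface NamePageSource {
    /** Returns up to pageSize names starting at offset; empty means no more. */
    List<String> fetchPage(int offset, int pageSize);
}

/** Iterates over all names while keeping only the current page in memory. */
class PagedNameIterator implements Iterator<String> {
    private final NamePageSource source;
    private final int pageSize;
    private List<String> page = new ArrayList<>();
    private int indexInPage = 0;
    private int nextOffset = 0;
    private boolean exhausted = false;

    PagedNameIterator(NamePageSource source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) {
            return true;            // still items left in the current page
        }
        if (exhausted) {
            return false;           // a previous fetch already came back empty
        }
        page = source.fetchPage(nextOffset, pageSize);
        indexInPage = 0;
        nextOffset += page.size();
        if (page.isEmpty()) {
            exhausted = true;
            return false;
        }
        return true;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(indexInPage++);
    }
}
```

A caller in a 50K-user cell would then loop with hasNext()/next() and
never materialize the whole list.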

In Cell.getKey, what is the key "name"?  Would this normally be "afs"?
The description of "name" says "the encrypted key String of the key to
retrieve", which isn't clear.  Also the "server" argument mentions
"partition", which might be a cut-and-paste error.  Would a kvno
argument be useful for input (I see it is a property of the returned Key
object)?

Maybe Cell.getKey, getPartition, getProcess would be better as Server
object functions?  Generally, I would think it would be better to push
these functions further down in the object hierarchy.

There is some justification for Cell.getVolume, but it shouldn't require
server and partition parameters.  But then the Volume.getPartition
should return a list of all the partitions hosting this volume.
Probably there should be two related classes: a VLDB volume and
partition volume.  The latter would be a subclass of the former.  The
partition volumes could be obtained from the VLDB volume by asking for
the appropriate type.  The getReadonly method would return a list of
(partition) volumes.

Does the Cell.getProcess "name" argument refer to bosserver instances?

Cell.refreshTokenExpiration doesn't seem very useful.  The expiration
time doesn't change.  Maybe you mean it reauthenticates to extend the
expiration time using the (saved) password.

Group.getCreator should return a User object instead of having separate
getCreatorName and getCreatorUid methods.  Along the same lines, I'd
eliminate getGroupsOwnedNames.  Generally, there are lots of similarly
redundant methods throughout the API.  I'd favor simplifying the API by
removing these extra methods.

Group.getListAdd, for instance, returns a String, but the description
says it returns one of Group.GROUP_xxx_ACCESS, which are declared to be
static int.  I guess this is related to the getListAddHandle method, but
the distinction isn't clear.

The meaning of the Group.refresh method's "force" parameter is not
clear.

It is a little odd to have an ACL setting method in the Group and User
objects, but perhaps not terribly so.  How can I remove an ACL entry:
specify false for all the booleans?  Might it be handy to have a
permission object which conveniently encodes the various bits?  This would
allow setACL to take fewer parameters and allow the Permission class to
export static objects with useful names like "none", "read", "write" and
"all".

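One way such a permission object might look is sketched below. The
Permission class and its Right enum are hypothetical names, not part of
the API under review. The named constants follow the usual AFS
shorthand, where "read" means read plus lookup and "write" means every
right except administer. Whether passing NONE to setACL should remove
the entry is a design choice the API would still have to make.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical immutable set of AFS ACL rights (illustration only). */
final class Permission {
    /** The seven standard AFS directory rights (r, l, i, d, w, k, a). */
    enum Right { READ, LOOKUP, INSERT, DELETE, WRITE, LOCK, ADMINISTER }

    static final Permission NONE =
            new Permission(EnumSet.noneOf(Right.class));
    static final Permission READ =
            new Permission(EnumSet.of(Right.READ, Right.LOOKUP));
    static final Permission WRITE =
            new Permission(EnumSet.complementOf(EnumSet.of(Right.ADMINISTER)));
    static final Permission ALL =
            new Permission(EnumSet.allOf(Right.class));

    private final Set<Right> rights;

    private Permission(EnumSet<Right> rights) {
        // Defensive copy so later changes to the argument cannot leak in.
        this.rights = Collections.unmodifiableSet(EnumSet.copyOf(rights));
    }

    /** Builds a permission from an arbitrary combination of rights. */
    static Permission of(Right... wanted) {
        EnumSet<Right> set = EnumSet.noneOf(Right.class);
        Collections.addAll(set, wanted);
        return new Permission(set);
    }

    boolean allows(Right right) {
        return rights.contains(right);
    }

    boolean isEmpty() {
        return rights.isEmpty();
    }
}
```

With something like this, setACL would need only a directory and one
Permission argument instead of a row of booleans.
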
Key.getVersion and Key.getVersionHandle seem redundant.  Surely it is
easy to get the int from an Integer.  Ditto for lots of other *Handle
methods in the API.

What happens if a key is created without a version number?  It doesn't
look like there is a method for setting it later.

Is it the case that the keyString input to Key.create is the same as
what is returned by the getEncryptionKey method?  Is this a password
which is passed through (one of the) StringToKey function(s) or is this
the raw 56 bit/8 byte DES key?  Can you make provisions for support of
the various types of K5 encryption keys down the road?  Maybe there should
be a key type attribute included here for future expansion.

The Server.getBinaryRestartDay and getBinaryRestartHandle are redundant,
and the disparity doesn't exist for getBinaryRestartHour, Minute, etc.
It seems a little silly to have so many little methods for accessing
these many attributes.  Partly I guess this is a Java problem, but maybe
there's a better way?  Return an array or hash of values?

Sometimes server log files are very large.  Is there a way to return a
stream or a local file name, to avoid bundling an entire log file into a
single huge string?
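
As an illustration of the stream idea: if a log accessor handed back a
java.io.InputStream (an assumption; the reviewed API returns a String),
a caller could process the log line by line without ever holding the
whole file. LogScanner is a hypothetical helper name.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that scans a log delivered as a stream. */
class LogScanner {
    /** Counts the lines containing needle, reading one line at a time. */
    static int countMatches(InputStream log, String needle) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(log, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    count++;
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked to keep the sketch's signature simple.
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```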

How do I set the name of a Server object that was created without one?
Ditto for Process objects.

Is it possible to get the output produced by the Server.salvage method
subsequently, perhaps using something like getLog?  Ditto for the
salvage method of Partition and Volume objects.

There should be a Server.getPartition by name, instead of only a method
to return a list of all partitions.  This would be a proper replacement
for the Cell method.  Ditto for getKeys and getProcesses.  Ditto for
Partition.getVolumes.

Server.getIPAddress returns a string.  Is this in dotted quad notation?
What if a file server has multiple IP addresses?  Is the string typed in
such a way that we can handle IPv6 addresses someday?

What is the format of the time string returned by getGeneralRestartTime?
Maybe a sample string would be useful.

Server.getTotal{,Free,Used}Space seems like it should be a function of a
Partition.  The description also mentions "server" and "partition" at
different points, to further confuse the issue.  Eliminate the various
getUsedPercentage methods and let the caller do the computation himself?

Does Server.getTotalQuota actually iterate through all the volumes, or
does the server actively track this?  If the former, then maybe the
caller should compute this number himself.

I suppose it should be obvious, but maybe it is worth making clear that
the constructors for Partition and Volume only instantiate the object;
the underlying entity is created by the Volume.create method.  I guess
there isn't a Partition.create method; it would be useful in some
scenarios, though it is outside the scope of the existing AFS Admin tools.

Related to my earlier comments about two volume classes, is a Volume
object localized to a server/partition?  I see that there is a moveTo
method which takes a Partition (and implicitly a server), but then why
does the constructor take a cell and server name?  How does one access
VLDB functions like vos delentry or zap?

How do I set the id of a Volume?  This is needed for restore by id,
isn't it?

It would be very nice if the User class would know how to manipulate
users in a K5 environment as well.  Many of the user attributes for
which there are lots of explicit methods, are idiosyncratic to the
KaServer.  It would be nice to abstract some of this detail somehow, so
that mapping to other authentication environments would not be too hard.
Maybe a base class that contains only the basic user object behavior,
and derived classes that handle KaServer, KerberosVMIT, HeimdalKTH, or
ActiveDirectory users.

User.getEncryptionKey claims to return a string "in octal form".  What
does this mean?

User.setPassword should also take a StringToKey type to better handle
the present and future diversity of these functions.

I noticed there is a User.equal, but, for instance, no Group or Volume
equal methods except those inherited from the Object class.  Anything
special going on for users?

Here are some random thoughts on caching.  Each of these objects has a
welter of refresh functions.  It would be nice to have a better scheme
for managing the caching of object attributes.  In particular, it would
be desirable if there was a way to subsequently drop in a more clever
caching mechanism without changing the users of this API.  It would make
sense, I suppose, to have a caching version of each object that
sub-classes the base object but adds caching.  Perhaps this is how you
have implemented it already.  From the stand-point of the API, the
problem is that sometimes a user will have out-of-band reason to want to
refresh or invalidate the cache.  As long as there is no explicit
callback-like mechanism, then any polling interval is sometimes going to
be too short and other times too long.

It would be relatively easy to imagine adding a callback interface to
the vldb, ptserver, bosserver, etc, which would keep a list of contacts
interested in changes.  Perhaps this could even be an add-on server that
runs locally on the server and monitors the local file system for mtime
changes to the database files.  Fine grained callbacks could be
implemented by having this callback server make a (local) call to the
database server to see if the specific object of interest had changed.

An easier strategy would be to have variants of each function that
allows the caller to specify an acceptable age for the information.
Perhaps a better approach would be to have a method for each object that
sets the allowable latency (max age) and whether to keep it up to date
in the background or whether to refetch it on demand if the cached value
is stale.  Setting it to zero would cause all query functions to go
straight to the server.  It could be adjusted up and down as needed by
the API user.
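
A rough sketch of the adjustable max-age idea follows. CachedAttribute
is a hypothetical wrapper, not something in the API under review; the
fetcher stands in for a real server query, and the clock is injected
only so the behavior can be exercised without sleeping. It covers only
the refetch-on-demand variant, not background refresh.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical cached value with a caller-adjustable maximum age. */
class CachedAttribute<T> {
    private final Supplier<T> fetcher;   // stands in for a server query
    private final LongSupplier clock;    // returns current time in millis
    private long maxAgeMillis;
    private T value;
    private long fetchedAt;
    private boolean loaded = false;

    CachedAttribute(Supplier<T> fetcher, long maxAgeMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Zero (or less) means every get() goes straight to the fetcher. */
    synchronized void setMaxAgeMillis(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Returns the cached value, refetching first if it is missing or stale. */
    synchronized T get() {
        long now = clock.getAsLong();
        boolean stale = maxAgeMillis <= 0 || now - fetchedAt > maxAgeMillis;
        if (!loaded || stale) {
            value = fetcher.get();
            fetchedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

Because the staleness policy lives in one wrapper, a smarter mechanism
(callbacks, background refresh) could later replace it without changing
the callers, which is the property asked for above.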

Ted Anderson

[1] http://grand.central.org/twiki/bin/view/AFSLore/JavaAdminAPI