[OpenAFS-devel] fakestat/NeXT hack

Nickolai Zeldovich kolya@MIT.EDU
Sun, 18 Nov 2001 02:42:33 -0500


Transarc once had a NeXT fakestat hack which prevented (or at least
tried to prevent) the usual "things hang stat'ing /afs/*" lossage.
It'd be nice to implement the same functionality in OpenAFS.  However,
I'm having trouble figuring out how it worked.  Could someone familiar
with this hack clarify some things for me?

My main concern is the vcache entry returned upon a lookup call for a
mountpoint, and what Fid was returned in that vcache entry.  If the
mountpoint has not been evaluated, we must avoid performing a VLDB
lookup on the volume name (if it's a dead cell, the VLDB lookup will
block until it times out).  Therefore, we don't know what Fid to put
into the new vcache entry, which means that we can end up creating
multiple vcache entries for the root of the same volume.  This seems
like it could lead to issues with locking and dcaches:
there are probably assumptions in the code about uniqueness of vcache
entries, and the locking order for multiple vcaches is "in increasing
vnode order".
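
To make the locking worry concrete, here is a rough toy sketch, not
OpenAFS code.  Every name in it (lo_fid, lo_vcache, lo_lock_pair) is
made up for illustration, and using volume and unique as tie-breakers
after the vnode number is my own assumption.  It only shows that an
ordering rule based on the Fid has no answer when two distinct entries
carry the same Fid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT the OpenAFS structures. */
struct lo_fid {
    uint32_t volume, vnode, unique;
};

struct lo_vcache {
    struct lo_fid fid;
    int lock_seq;               /* 0 = unlocked, else order of acquisition */
};

static int lo_seq;              /* global acquisition counter for the toy */

/* Compare by vnode first (the rule quoted in the text); volume and
 * unique are used as tie-breakers, which is an assumption. */
int lo_fid_cmp(const struct lo_fid *a, const struct lo_fid *b)
{
    if (a->vnode != b->vnode)
        return a->vnode < b->vnode ? -1 : 1;
    if (a->volume != b->volume)
        return a->volume < b->volume ? -1 : 1;
    if (a->unique != b->unique)
        return a->unique < b->unique ? -1 : 1;
    return 0;
}

/* "Lock" two entries in increasing order.  Returns 0 on success, or -1
 * when two distinct entries have the same Fid and the rule gives no
 * order (the duplicate-volume-root case). */
int lo_lock_pair(struct lo_vcache *a, struct lo_vcache *b)
{
    int c;

    assert(a != NULL && b != NULL);
    if (a == b) {
        a->lock_seq = ++lo_seq;
        return 0;
    }
    c = lo_fid_cmp(&a->fid, &b->fid);
    if (c == 0)
        return -1;              /* duplicates: no defined order */
    if (c > 0) {
        struct lo_vcache *tmp = a;
        a = b;
        b = tmp;
    }
    a->lock_seq = ++lo_seq;     /* lower Fid first */
    b->lock_seq = ++lo_seq;
    return 0;
}
```

With unique vcache entries the -1 branch can never be reached, which
is presumably what existing code relies on.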

Additionally, various vcache hash tables are based on the Fid, so the
entry will have to be rechained once we find the real Fid.  Also, all
VNOPs would have to check if the vnode's Fid has been found yet, and
if not, do EvalMountPoint, using vcache->mvid as the parent.  While
neither of these is impossible, I think this is far more complex
than the NeXT hack, from what I've heard.
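
As a rough illustration of the rechaining step, here is a toy
Fid-keyed hash table.  Nothing here is the real OpenAFS table: the
names (ht_vcache, ht_rechain, and so on), the hash function, and the
placeholder-Fid idea are all assumptions made for the sketch.  It also
shows where a duplicate entry for the same volume root would surface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HT_SIZE 64

/* Toy stand-ins; these are NOT the OpenAFS structures. */
struct ht_fid {
    uint32_t cell, volume, vnode, unique;
};

struct ht_vcache {
    struct ht_fid fid;
    int fid_known;              /* 0 while only a placeholder Fid is set */
    struct ht_vcache *hnext;    /* hash chain link */
};

static struct ht_vcache *ht_table[HT_SIZE];

unsigned ht_hash(const struct ht_fid *f)
{
    return (f->volume ^ f->vnode) % HT_SIZE;
}

int ht_fid_eq(const struct ht_fid *a, const struct ht_fid *b)
{
    return a->cell == b->cell && a->volume == b->volume &&
           a->vnode == b->vnode && a->unique == b->unique;
}

struct ht_vcache *ht_find(const struct ht_fid *fid)
{
    struct ht_vcache *vc = ht_table[ht_hash(fid)];

    while (vc != NULL && !ht_fid_eq(&vc->fid, fid))
        vc = vc->hnext;
    return vc;
}

void ht_insert(struct ht_vcache *vc)
{
    unsigned h;

    assert(vc != NULL);
    h = ht_hash(&vc->fid);
    vc->hnext = ht_table[h];
    ht_table[h] = vc;
}

void ht_unlink(struct ht_vcache *vc)
{
    struct ht_vcache **pp = &ht_table[ht_hash(&vc->fid)];

    while (*pp != NULL && *pp != vc)
        pp = &(*pp)->hnext;
    if (*pp == vc) {
        *pp = vc->hnext;
        vc->hnext = NULL;
    }
}

/* Move vc from its placeholder chain to the chain for the real Fid.
 * If another entry already owns the real Fid, vc is only unlinked and
 * that entry is returned; the caller would have to dispose of vc. */
struct ht_vcache *ht_rechain(struct ht_vcache *vc, const struct ht_fid *real)
{
    struct ht_vcache *dup = ht_find(real);

    ht_unlink(vc);
    if (dup != NULL && dup != vc)
        return dup;
    vc->fid = *real;
    vc->fid_known = 1;
    ht_insert(vc);
    return vc;
}
```

The toy ignores locking entirely; in a real cache manager the unlink
and reinsert would have to happen under whatever lock protects the
hash chains.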

An alternative implementation I've been considering would work like
this:

 * lookup() returns the mountpoint symlink vcache entry, with v_type
   fudged to VDIR.

 * getattr() fakes some reasonable-looking stat values for mvstat=2
   entries whose real volume root hasn't been found yet.

 * all other AFS VNOPs call a common function for mvstat=2 vnodes,
   which would (a) fill in vcache->mvid, like EvalMountPoint, and
   (b) return the real volume root vcache from afs_GetVCache.
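
The three steps above can be sketched as a toy model.  This is not
OpenAFS code: the fs_* names, the integer standing in for the Fid in
vcache->mvid, and the stub evaluator that only knows one volume name
are all assumptions made for illustration.  The point is just that
lookup and getattr never call the step that can block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum fs_vtype { FS_VLNK, FS_VDIR };

struct fs_attr {
    enum fs_vtype type;
    unsigned mode;
    unsigned nlink;
};

/* Toy stand-in; this is NOT the OpenAFS vcache. */
struct fs_vcache {
    enum fs_vtype v_type;
    int mvstat;          /* 2 = goes through the common resolve step */
    int mvid_known;      /* has the target's identity been filled in? */
    int mvid;            /* toy stand-in for the Fid in vcache->mvid */
    const char *target;  /* volume name stored in the mountpoint */
};

/* One fake volume root so the sketch has something to return. */
struct fs_vcache fs_root = { FS_VDIR, 0, 0, 0, NULL };
int fs_eval_calls;       /* counts the potentially blocking evaluations */

/* Stand-in for EvalMountPoint: the only step that could block. */
int fs_eval_mountpoint(const char *target, int *mvid_out)
{
    fs_eval_calls++;
    if (strcmp(target, "root.cell") != 0)
        return -1;       /* pretend the volume is unknown */
    *mvid_out = 1;
    return 0;
}

/* Stand-in for afs_GetVCache: map the toy id to a vcache. */
struct fs_vcache *fs_get_vcache(int mvid)
{
    return mvid == 1 ? &fs_root : NULL;
}

/* lookup(): return the mountpoint entry itself, typed as a directory. */
struct fs_vcache *fs_lookup(struct fs_vcache *mp)
{
    assert(mp != NULL);
    mp->v_type = FS_VDIR;
    mp->mvstat = 2;
    return mp;
}

/* getattr(): never evaluates the mountpoint.  The toy has no real
 * attributes to fetch, so the return value just reports whether the
 * values were faked (1) or would come from a resolved entry (0). */
int fs_getattr(const struct fs_vcache *vc, struct fs_attr *out)
{
    out->type = vc->v_type;
    out->mode = 0755;
    out->nlink = 2;
    return vc->mvstat == 2 && !vc->mvid_known;
}

/* Common helper for all other VNOPs: (a) fill in mvid once, then
 * (b) fetch the real volume root on every call.  NULL on failure. */
struct fs_vcache *fs_real_vcache(struct fs_vcache *vc)
{
    if (vc->mvstat != 2)
        return vc;
    if (!vc->mvid_known) {
        if (fs_eval_mountpoint(vc->target, &vc->mvid) != 0)
            return NULL;
        vc->mvid_known = 1;
    }
    return fs_get_vcache(vc->mvid);
}
```

In this toy, stat'ing the mountpoint costs no evaluation at all, and
the first other operation pays for it once.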

This imposes a slight overhead on any operation on a volume root
(an additional GetVCache call for each VNOP).  I'm also not certain
whether making mountpoint symlink vcaches VDIR instead of VLNK would
break anything.

Any comments on either the NeXT hack or my alternative scheme would
be appreciated.

-- kolya