Bug#143111: [OpenAFS-devel] Testing GNU findutils on AFS.... please!

James Youngman jay@gnu.org
Thu, 20 Mar 2008 00:06:53 +0000


On Wed, Mar 19, 2008 at 8:20 PM, Jeffrey Hutzelman <jhutz@cmu.edu> wrote:

>  > If you are in a mood to test things though, is oldfind's -noleaf
>  > option needed to correctly search AFS filesystems?  (without it, find
>  > assumes that directories with a link count of 2 have no
>  > subdirectories).
>
>  Without testing, yes, it is needed at least some of the time.  That
>  assumption is not valid in AFS, and neither is the assumption that when you
>  run out of link count you've run out of subdirectories.  The problem here
>  is that AFS files can be one of four kinds - files, directories, symlinks,
>  and mount points.  A mount point is a reference to the root of another
>  volume by name or volume ID.  Mount points look to clients like
>  directories, but are not counted in the containing directory's link count,
>  since they do not in fact contain a link to that directory.

This property is normally honoured by Unix filesystems because a
filesystem can only be mounted on an existing directory in any case.
Here's an example:

orbital:/mnt/test# stat .
  File: `.'
  Size: 1024            Blocks: 2          IO Block: 1024   directory
Device: 302h/770d       Inode: 100421      Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2008-03-19 23:35:45.000000000 +0000
Modify: 2008-03-19 23:35:45.000000000 +0000
Change: 2008-03-19 23:35:45.000000000 +0000
orbital:/mnt/test# mkdir /mnt/test/1; mount /dev/scratch/test1 /mnt/test/1
orbital:/mnt/test# stat .
  File: `.'
  Size: 1024            Blocks: 2          IO Block: 1024   directory
Device: 302h/770d       Inode: 100421      Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2008-03-19 23:35:45.000000000 +0000
Modify: 2008-03-19 23:37:11.000000000 +0000
Change: 2008-03-19 23:37:11.000000000 +0000
orbital:/mnt/test# umount /mnt/test/1
orbital:/mnt/test# stat . | grep Links
Device: 302h/770d       Inode: 100421      Links: 3
orbital:/mnt/test#

So here you see that with /mnt/test/1 mounted, the link count of
/mnt/test is 3.  However, the link count is still 3 with /mnt/test/1
unmounted, because the directory /mnt/test/1 (i.e. the mount point)
still exists and its ".." entry contributes 1 toward the link count
of /mnt/test.
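
The arithmetic behind that link count can be sketched in Python.  This
is only an illustration, not code from findutils: the helper names are
hypothetical, and it assumes the traditional Unix convention in which a
directory's link count is 2 plus the number of its subdirectories.

```python
import os
import tempfile


def expected_subdirs(nlink):
    """Subdirectory count implied by a directory's link count.

    Assumes the traditional Unix convention: one link from the parent,
    one from the directory's own "." entry, and one from the ".." entry
    of each subdirectory.  Returns None when nlink < 2, which signals a
    filesystem that does not follow the convention.
    """
    if nlink < 2:
        return None
    return nlink - 2


def actual_subdirs(directory):
    """Count subdirectories by examining every entry (no shortcut)."""
    with os.scandir(directory) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "1"))
        nlink = os.lstat(top).st_nlink
        print("link count:", nlink)
        print("implied subdirectories:", expected_subdirs(nlink))
        print("actual subdirectories:", actual_subdirs(top))
```

On a filesystem that follows the convention the last two numbers
agree; where they differ, the link count cannot be trusted as a
shortcut.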

>  So, to a program like find, there are effectively two kinds of
>  subdirectories -- those that are included in the link count and those that
>  are not.

That's not generally true for the reason illustrated above, but I'll
assume your statement was about AFS specifically.
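
To make the AFS case concrete, here is a small simulation of the leaf
optimisation.  It is a sketch under stated assumptions rather than
find's actual implementation: the function and the sample directory
contents are hypothetical, and the directory is modelled as a list of
names plus a predicate instead of real stat(2) calls.

```python
def scan_for_subdirs(names, is_dir, nlink, noleaf=False):
    """Return (subdirectories found, number of is_dir checks made).

    With the leaf optimisation on, checking stops as soon as nlink - 2
    subdirectories have been seen; the remaining entries are assumed to
    be non-directories.  With noleaf=True every entry is checked.
    """
    remaining = None if noleaf else nlink - 2
    found = []
    checks = 0
    for name in names:
        if remaining is not None and remaining <= 0:
            break  # leaf optimisation: assume no more subdirectories
        checks += 1
        if is_dir(name):
            found.append(name)
            if remaining is not None:
                remaining -= 1
    return found, checks


# Hypothetical directory: "sub" is an ordinary subdirectory and "mnt"
# stands in for an AFS mount point that looks like a directory.
names = ["a", "sub", "b", "mnt", "c"]
directories = {"sub", "mnt"}


def is_dir(name):
    return name in directories


# Conventional filesystem: both subdirectories are counted (nlink == 4).
print(scan_for_subdirs(names, is_dir, nlink=4))
# AFS-like case: the mount point is not counted (nlink == 3), so the
# scan stops early and "mnt" is missed.
print(scan_for_subdirs(names, is_dir, nlink=3))
# The -noleaf behaviour: every entry is checked.
print(scan_for_subdirs(names, is_dir, nlink=3, noleaf=True))
```

The second call shows the failure mode: once the counted subdirectory
has been seen, the uncounted one is never examined.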

> You can't tell in advance which kind you'll see, so a heuristic
>  like checking to see if the rule applies to the first directory you
>  examine, which probably works fairly well for other filesystems, won't work
>  with AFS.

So is it the case that I can mount an AFS filesystem at
/afs/mumble/foo/bar without there previously existing a directory
/afs/mumble/foo?

At the moment, find (oldfind in 4.2.x and 4.3.x) relies on examining
the results of stat(2) to figure out if it should turn off the leaf
optimisation.  It makes this determination for every directory it
searches.  Supposing find knows that AFS may be in use somewhere on
the system, what is the highest-performance way of determining whether
the link-count assumption will hold immediately within a given
directory?

Is it feasible, for example, to assume that directories not
(canonically) beginning with /afs/ (or matching the regex ^/+afs/)
simply cannot be on an AFS filesystem?
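
The kind of check I have in mind would look roughly like this.  It is
a sketch only: the function name is hypothetical, it assumes AFS is
mounted at the conventional /afs location, and it assumes the caller
has already canonicalised the path (so symlinks pointing into /afs and
relative paths are not handled here).

```python
import re

# One or more leading slashes, then "afs", then a slash.
AFS_PREFIX = re.compile(r"/+afs/")


def may_be_on_afs(path):
    """True if an absolute path lies below /afs/.

    re.match anchors at the start of the string, so "/home/afs/x" does
    not qualify.  The bare path "/afs" (no trailing slash) does not
    match either.
    """
    return AFS_PREFIX.match(path) is not None


print(may_be_on_afs("/afs/mumble/foo"))   # True
print(may_be_on_afs("///afs/mumble"))     # True
print(may_be_on_afs("/home/afs/mumble"))  # False
```

Whether a negative answer from such a test is actually safe is exactly
the question above.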

Now that I think about it, it would also be helpful to know what
common Linux AFS clients put in struct dirent.d_type for AFS
filesystem objects (files, directories, ...).  How about other Unix
systems which support both AFS and d_type?  I also understand that AFS
ACLs can sometimes allow readdir() to return a directory entry
without it being possible to lstat(2) said directory item.  Is this
the case?  What goes into d_type for such items?
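
For anyone who wants to check this on an AFS client, a quick probe
could look like the following.  It is a sketch, not part of findutils,
and the function name is hypothetical.  It relies on Python's
os.scandir(), which uses the d_type from readdir() where the platform
provides one and falls back to lstat(2) otherwise, so an entry that can
be listed but not examined shows up as "unknown".

```python
import os


def classify_entries(directory):
    """Map each entry name in a directory to a coarse type string."""
    kinds = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_symlink():
                    kind = "symlink"
                elif entry.is_dir(follow_symlinks=False):
                    kind = "directory"
                elif entry.is_file(follow_symlinks=False):
                    kind = "file"
                else:
                    kind = "other"
            except OSError:
                # The entry was listed, but the fallback lstat failed
                # (for example, permission denied).
                kind = "unknown"
            kinds[entry.name] = kind
    return kinds


if __name__ == "__main__":
    for name, kind in sorted(classify_entries(".").items()):
        print(kind, name)
```

Running it in a directory that contains a mount point, and in one where
the ACL permits listing but not lookup, would answer both questions.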

James.