[OpenAFS] Linux client and find command in AFS

Garance A Drosihn drosih@rpi.edu
Thu, 7 Apr 2005 01:04:49 -0500


At 3:25 PM -0400 4/5/05, Rodney M Dyer wrote:
>At 03:09 PM 4/5/2005, you wrote:
>>gnu find is not the same as solaris find. the -noleaf option is
>>the equivalent of the default options with solaris (well, unix)
>>find. so since gnu find goes out of its way to work this way,
>>when other finds do not, i see no reason why the filesystem
>>should go out of its way to accomodate it. the reason it works
>>on solaris is not related to afs, if you ran gnu find on solaris,
>>it would work the same way.
>
>That sort-of makes sense.  Sorry, I wasn't suggesting that the bug
>was in AFS, or find.  I just think it is kind of silly that it works
>that way.  Why not "fix" the gnu find so that it requires the
>"-noleaf" option in all circumstances, or a stupid user trick will be
>just to add a "dummy" directory in the directory with the mountpoint.
>Too many exceptions to the rules.

On the filesystems where this trick works, it can result in a pretty
significant speedup for some 'find' operations, especially if you
have some directories with many files in them.  The bug is that
gnu find does this by default for *all* filesystems, instead of
picking the behavior based on filesystem-type.

Note that creating one extra dummy directory will not actually fix
the problem.  It may seem to fix it in some circumstances, but it
is not a reliable fix.  How this (problematic) optimization trick
works was not described quite right earlier in this thread.

A more tedious explanation:

The problem is that find looks at the "link-count" on a directory to
determine how many sub-directories it has.  You can see that
link-count by doing an 'ls -ld somedir'.  For most unix filesystems,
every directory starts with two links: one for its name in its
parent directory, and one for its own '.' entry.  And for every
directory *inside* that directory, the link count will be increased
by one (that is because the sub-directory needs a '..' link back to
the parent directory).  If you are scanning through filenames in a
directory, you will always see two directory entries which are not
sub-directories, '.' and '..', so the number of directories you see
in the listing matches the link-count.
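
If you want to check the rule on a filesystem of your own, here is a
minimal sketch in Python.  The helper names (count_subdirs,
expected_link_count) and the scratch directory contents are made up
for illustration.  It compares the link-count that 'ls -ld' would
show against "2 plus the number of sub-directories".  The reported
number depends on the filesystem underneath the temp directory, so I
am not claiming any particular output.

```python
import os
import tempfile


def count_subdirs(path):
    """Count the sub-directories directly inside path (symlinks not followed)."""
    with os.scandir(path) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))


def expected_link_count(num_subdirs):
    """Traditional unix rule: two links to start with, plus one per sub-directory."""
    return 2 + num_subdirs


with tempfile.TemporaryDirectory() as top:
    # Hypothetical contents: three sub-directories and one plain file.
    for name in ("a", "b", "c"):
        os.mkdir(os.path.join(top, name))
    open(os.path.join(top, "plain-file"), "w").close()

    reported = os.stat(top).st_nlink   # the number 'ls -ld' shows
    subdirs = count_subdirs(top)
    print("reported link-count:", reported)
    print("expected link-count:", expected_link_count(subdirs))
```

Where the two numbers agree, the link-count trick is safe for that
directory.  Where they disagree, it is not.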

So gnu-find takes the link-count, and then keeps track of how many
directories it sees while searching through the filenames inside
that directory.  When the count-of-directories equals the link-count
for the directory it is searching, then it "must have" found all
the sub-directories for that directory.  No need to look at the
remaining filenames in the directory, no matter how many thousands
of files there might be.
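
To make the bookkeeping concrete, here is a small simulation of the
counting described above.  It is only a sketch of the idea, not the
actual gnu-find source.  The function name, the (name, is_dir) entry
format, and the file names are all invented for the example, and the
"checked" counter stands in for the per-file work that the trick
avoids.

```python
def find_subdirs(entries, link_count, use_leaf_trick=True):
    """Scan entries in order; return (sub-directories found, entries checked).

    entries is a list of (name, is_dir) pairs in directory order,
    not including '.' and '..'.
    """
    # Two of the links never correspond to sub-directories,
    # so the running count starts at 2.
    dirs_seen = 2
    found = []
    checked = 0
    for name, is_dir in entries:
        if use_leaf_trick and dirs_seen >= link_count:
            break  # everything left "must be" a plain file
        checked += 1  # stands in for checking this entry's type
        if is_dir:
            found.append(name)
            dirs_seen += 1
    return found, checked


# Hypothetical directory: two sub-directories followed by 5000 files.
entries = [("src", True), ("doc", True)]
entries += [("file%d" % i, False) for i in range(5000)]

print(find_subdirs(entries, link_count=4))
# (['src', 'doc'], 2)
print(find_subdirs(entries, link_count=4, use_leaf_trick=False)[1])
# 5002
```

With an honest link-count of 4, the scan stops after two entries
instead of 5002.  That is the speedup, and it only holds as long as
the link-count can be trusted.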

The problem for AFS is that when you do an 'fs mkmount', it does
not increase the link-count for the directory where you added that
mount-point.  For example:

    ls -ld .
    drwxrwxrwx   2 drosehn  root        2048 Apr  6 23:47 .
    fs mkmount -dir GadTestVol -vol garance.tst
    ls -ld .
    drwxrwxrwx   2 drosehn  root        2048 Apr  6 23:47 .

In both of those 'ls' commands, the link-count is '2', even though
you have added a directory.  So now you create the dummy-directory:

    mkdir dummy
    ls -ld .
    drwxrwxrwx   3 drosehn  root        2048 Apr  6 23:59 .

Now the link-count is 3, but now you also have 4 sub-directories
in this directory:

    ls -la
    total 24
    drwxrwxrwx   3 drosehn  root        2048 Apr  6 23:59 .
    drw-r--r--   2 drosehn  root        2048 Apr  5 20:56 ..
    drwxrwxrwx   2 drosehn  root        2048 Apr  6 23:47 GadTestVol
    drwx------   2 drosehn  user        2048 Apr  6 23:59 dummy

So, will the dummy directory fix the problem?  Not really.  The
problem is that gnu-find sees a link-count of 3, so it will stop
searching for directories as soon as it finds "the third directory".
Assuming the above order, it will stop searching after it sees
"GadTestVol", so it will never search "dummy".  Maybe that works
fine for what you care about, since that *is* just a dummy directory.
But if gnu-find ever sees the "dummy" directory before it sees
"GadTestVol", then it will not search "GadTestVol".  So you're right
back to where you started from.  And if someone has a directory where
both AFS-mounts and real-directories are coming-and-going, it is
almost certain that they will be burned by this bug sooner or later,
even with "dummy" directories.
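
The order-dependence can be shown with a few lines of Python.  Again
this is just a sketch of the counting rule described above, not
gnu-find itself.  The function name and "notes.txt" are made up; the
two directory names and the link-count of 3 come from the listing
above.

```python
def dirs_searched(names_in_order, dir_names, link_count):
    """Return the sub-directories a link-count-limited scan descends into."""
    seen = 2  # '.' and '..' are always counted as directories
    searched = []
    for name in names_in_order:
        if seen >= link_count:
            break  # the scan believes every directory has been found
        if name in dir_names:
            searched.append(name)
            seen += 1
    return searched


dirs = {"GadTestVol", "dummy"}
link_count = 3  # the AFS mount point was never counted

order_a = ["GadTestVol", "dummy", "notes.txt"]
order_b = ["dummy", "GadTestVol", "notes.txt"]

print(dirs_searched(order_a, dirs, link_count))  # ['GadTestVol']
print(dirs_searched(order_b, dirs, link_count))  # ['dummy']
```

Same directory, same link-count, and which sub-directory gets
searched depends only on the order in which the names come back.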

Really, the only place to fix this is in gnu-find.  Also note that
there are other filesystems besides AFS where this optimization
trick does not work (some filesystems for CDs, for instance).  When
the trick does work, it really can result in a large performance gain.
But when it doesn't work, it will drive you nuts!!

Maybe the stat() call should return some bit which indicates that
the directory-optimization trick will work, and gnu-find could
check that bit...

Or maybe AFS could do something to fake the link-count, so that the
count that gnu-find sees from a call to stat() will include all AFS
mount points.  I have no idea if that change would make any sense in
the context of AFS, though.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu