[OpenAFS] AFS version of du

Steve Simmons scs@umich.edu
Fri, 30 Apr 2010 09:56:25 -0400


On Apr 30, 2010, at 4:15 AM, Staffan Hämälä wrote:

> Is there a version of du that does not follow AFS mountpoints?
>
> If I try to do a 'du -sh *' in a directory that has some AFS
> mountpoints it inevitably fails after some time. It also takes a lot of
> time when it has to look through things in mounted volumes (e.g. the
> backup volume that I have mounted in many places).
>
> I've tried -P and -x to make it skip mount points, but it doesn't work
> (at least on CentOS 5 Linux).

The short answer is that it might be coming, but it's going to be
a while. Here's the detailed scoop as I understand it, based on some old
mail exchanges I had with the findutils maintainer James Youngman and
discussion on the findutils mailing list.

Any utility that recurses through the file system (du, find, ls, others)
has the same set of core problems: things to include or exclude, whether
to cross mount points, and so on. This is immensely complicated by
the breadth of things that can and do comprise mount points. Is an NFS
mount a mount point? An AFS mount? A Samba share? Repeat the question
for every conceivable file system, and it becomes hellacious.

Literally years ago, the developers of the GNU findutils suite decided
to bite this bullet with a common library. Their goal was to have a
function which could recurse through directories. That function would
have all the smarts described above. It would then become the core of
du, ls, etc. when those things had to do recursive searches.
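
To make the idea concrete, here is a minimal Python sketch of such a shared
recursion function. It is not the findutils code or its API; the names
walk, should_descend and same_device_as are made up for illustration. The
walker knows nothing about file system types, and the "cross or don't
cross" decision lives in a policy function supplied by the caller.

```python
import os

def walk(root, should_descend):
    """Yield every path under root.

    should_descend(path, stat_result) is a caller-supplied policy
    (hypothetical name); a directory is only entered if it returns True.
    """
    yield root
    try:
        with os.scandir(root) as it:
            entries = sorted(it, key=lambda e: e.name)
    except OSError:
        return  # not a directory, unreadable, or vanished
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            st = entry.stat(follow_symlinks=False)
            if should_descend(entry.path, st):
                yield from walk(entry.path, should_descend)
            else:
                yield entry.path  # report the directory, do not enter it
        else:
            yield entry.path

def same_device_as(root):
    """Policy similar in spirit to 'du -x': stay on root's device."""
    root_dev = os.lstat(root).st_dev
    return lambda path, st: st.st_dev == root_dev

if __name__ == "__main__":
    for path in walk(".", same_device_as(".")):
        print(path)
```

A device-number policy like same_device_as only helps where each mount
shows up with its own device number. If AFS volumes in a tree all report the
same device, which would be consistent with -x not helping in the question
above, an AFS-aware policy would have to be plugged in instead.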

They also set some ambitious goals. As was communicated to me at the
time (I was gently whining about the AFS support in find having been
broken for a while), they wanted the addition of AFS functionality to
have little overhead when run in a non-AFS environment. Ditto for SAMBA,
NFS, etc. They asked if I wanted to work on the AFS support. I declined
for a variety of reasons, but none of them was that it was a bad
idea. :-)

Instead I decided to wait and see if they could pull this off for the
core filesystem types they had set out ('native', SAMBA, NFS). If they
could, I'd look at folding in AFS support and possibly adding some
AFS-specific features.

It appears that the initial work is now done. With findutils 4.4.0, much
of the functionality has been moved into a library function, fts(). That
was released in March 2009; since then there have been two large fix
releases followed by a long period of apparent stability, with no further
releases since June 2009.

I built and tested this release just a couple of days ago. When compiled
in an AFS environment, it correctly implements the -fstype afs
detection. At least in my testing, all the annoying messages about leaf
nodes, ./.. confusion, etc. seem to be gone.

With respect to find itself, it now seems possible to pursue some of the
goals I had when initially looking at making find more useful with AFS.
Things I think would be useful:

* An option to cross/not cross/detect and report/etc. AFS volume mount
points
* Take specific actions at specific mountpoints either by dir name or by
volume name, including by regexp. For example, if my home volume and
subvolumes are all named 'user.scs*', don't cross mount points whose
volume name doesn't match that regexp.
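
As a rough illustration of the second item, the Python sketch below decides
whether to cross a mount point by matching the volume name against a regexp.
Everything in it is an assumption for the sake of the example: the function
names, the paths, and especially lookup_volume, which is passed in by the
caller. A real lookup might wrap the OpenAFS 'fs lsmount' command, but that
part is left out here and a plain dict stands in for it. Note that
'user.scs*' above reads like a glob; as a regexp it is written
user\.scs.* below.

```python
import re

def make_volume_filter(pattern, lookup_volume):
    """Build a predicate that says whether a walker may enter a path.

    lookup_volume(path) is a caller-supplied function (hypothetical) that
    returns the volume name if path is a mount point, or None otherwise.
    """
    regex = re.compile(pattern)

    def may_enter(path):
        volume = lookup_volume(path)
        if volume is None:
            return True  # ordinary directory, nothing to decide
        # Cross the mount point only if the whole volume name matches.
        return regex.fullmatch(volume) is not None

    return may_enter

if __name__ == "__main__":
    # Made-up mount table standing in for a real AFS lookup.
    mounts = {
        "/afs/example.org/user/scs": "user.scs",
        "/afs/example.org/user/scs/mail": "user.scs.mail",
        "/afs/example.org/user/scs/shared": "proj.shared",
    }
    may_enter = make_volume_filter(r"user\.scs.*", mounts.get)
    for path in sorted(mounts):
        print(path, "cross" if may_enter(path) else "skip")
```

A predicate like this could be handed to whatever does the recursion, so the
AFS-specific knowledge stays in one small function.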

I was going to do this work about four years ago, and set it aside to
wait for the many issues with fts() to settle out. It appears they've done
so.

Coming back to how this applies to du/ls/etc.: the expectation of the
findutils developers is that when fts() becomes sufficiently stable, any
of a number of things would happen:

* those utilities would adopt use of fts() and expand to use find-like
features
* those utilities might get subsumed into findutils
* find might be implemented with options that make it do ls-like or
du-like things
* standalone utils that do ls-like or du-like things might be
implemented
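
For a feel of what the last option could look like, here is a small du-like
sketch in Python. It is only an illustration under stated assumptions, not
an existing tool: disk_usage and skip_dir are made-up names, it sums
apparent file sizes rather than allocated blocks, and deciding which
directories are AFS mount points is left to the caller-supplied skip_dir.

```python
import os

def disk_usage(root, skip_dir=lambda path: False):
    """Sum the sizes in bytes of files under root.

    skip_dir(path) is a caller-supplied predicate (hypothetical); any
    directory for which it returns True is never entered.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped dirs.
        dirnames[:] = [
            d for d in dirnames
            if not skip_dir(os.path.join(dirpath, d))
        ]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total

if __name__ == "__main__":
    # Example policy: skip anything named 'backup' (purely illustrative).
    print(disk_usage(".", lambda p: os.path.basename(p) == "backup"))
```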

At this point, I can't tell if any of those four possibilities are
in process. Some cursory searches of the findutils mailing lists weren't
enlightening.