[OpenAFS-port-darwin] Mystery Problem with directories on Tiger

Garance A Drosihn drosih@rpi.edu
Sat, 10 Mar 2007 13:25:21 -0500


Recently I have been running into a weird problem with OpenAFS-1.5.15,
which I'm running on MacOS 10.4.8.  I'm pretty sure I started seeing
this before I installed 1.5.15, but I'm not sure what version I was
running when I first noticed it.  I am running with afs.options of
     -afsdb -stat 2000 -dcache 800 -daemons 3 -volumes 70 -dynroot -fakestat

Short description: I have cd'ed into some directory in AFS space, and
am typing in unix commands in a terminal window.  One minute things
will be fine, and the next minute a few specific unix commands will
start to fail consistently, with error msgs such as:
      open[19187] No such file: (null)
And then, at some point later on, everything starts working again.
It might be 5 minutes later, or it might be 15 minutes; I can't really
say for sure.  The amount of time probably varies.

Longer:
I usually notice the problem first when doing an 'open' command, because
the way I edit source files is via an alias that does '/usr/bin/open -a
/Developer/Applications/Xcode.app <filename>'.  Most unix commands
continue to work fine, but another very simple one which fails is
/bin/pwd.  The 'pwd' function which is built into bash does not get
any errors, but the separate /bin/pwd program will say:
      pwd: : No such file or directory

IIRC, all 'pwd' does is get the current working directory, and then
does repeated stat("..")'s until it makes it back to the root ("/").
I'm pretty sure that all the commands which fail are doing the same
kind of cycle through ".." links.
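
To make that guess concrete, here is a minimal Python sketch of the
traditional "walk up through .." approach.  It only illustrates the
technique; it is not the actual /bin/pwd source on Tiger, and the
function name walk_up_pwd is made up for this example.  The point is
that each level needs both a successful stat of ".." and a readable
listing of the parent directory, so a failure at either step for any
ancestor would be enough to make the whole lookup fail.

```python
import os


def walk_up_pwd(start="."):
    """Rebuild the absolute path of `start` by climbing '..' links.

    Illustrative only: identifies each directory by (device, inode)
    and searches the parent's entries for a match.
    """
    parts = []
    here = start
    here_st = os.stat(here)
    root_st = os.stat("/")

    # Climb until the current directory is the root directory.
    while (here_st.st_dev, here_st.st_ino) != (root_st.st_dev, root_st.st_ino):
        parent = os.path.join(here, "..")
        parent_st = os.stat(parent)  # the stat("..") step

        # Find which entry in the parent is the directory we came from.
        name = None
        for entry in os.listdir(parent):  # needs list access on the parent
            try:
                st = os.lstat(os.path.join(parent, entry))
            except OSError:
                continue  # skip entries we are not allowed to stat
            if (st.st_dev, st.st_ino) == (here_st.st_dev, here_st.st_ino):
                name = entry
                break

        if name is None:
            raise FileNotFoundError(f"no entry for {here!r} in its parent")

        parts.append(name)
        here, here_st = parent, parent_st

    return "/" + "/".join(reversed(parts))
```

Running something like this from the affected directory while the
problem is active might show which ancestor directory is the one that
fails, assuming the failure really is in the ".." walk.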

I've been hitting this a lot more recently, and I think it's because
we're starting to set directory ACLs to 'system:authuser rl' instead of
'system:anyuser rl'.  I am authenticated when this problem pops up,
so it's not like my tokens expired.  And things will start working
again without me doing another 'klog'.  I've tried doing things like
'fs checkservers', 'fs checkvolumes' and 'fs flushvolume ...', and
none of those seem to have any effect on the problem.

I know the release page for 1.5.15 says that one known issue is
"When authentication state changes, Finder may cache stale data
(files appear unaccessible or do not appear)", but this problem isn't
happening in the Finder.  In fact, I usually do not have *any* Finder
windows open (none at all, not even ones looking at directories outside
of AFS).  I'm also pretty sure that I had seen this when I was still
running version 1.4.1.

I don't know if it makes any difference, but I'm logged into MacOS
as a local userid 'gad', while my AFS authentication is for userid
'drosehn'.  When I'm working in AFS, I'm always authenticating via
the 'klog' command.

I'm still trying to collect more information on what exactly is
happening, but I thought I'd mention what I've learned so far and
see if this is an issue which people already know about.  Or if anyone
has suggestions on things I should try during the periods where the
problem is active.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu