[OpenAFS] System hangs, OSX 10.6.8, OpenAFS 1.6.5

dorian taylor dorian.taylor.lists@gmail.com
Thu, 28 Nov 2013 20:10:42 -0800


Hello List,

I'm trying to diagnose a problem that has persisted since I upgraded
an old Mac from 10.5 to 10.6 and installed the latest OpenAFS.

Essentially what happens is that when the Kerberos ticket expires,
there's a limited time (seconds to minutes) during which I can renew
both the Kerberos ticket and the AFS token. If that time elapses, the
system hangs and I have to hard power-cycle the machine.

The way the system hangs is peculiar. All existing processes keep
running, but any new process I try to invoke (including kinit, top,
lsof) hangs indefinitely. Moreover, existing processes that use
network resources (e.g. a browser) can't make new network calls.
Connections that are already open (e.g. MP3 streaming) continue, but
can't be restarted once they're stopped. Finally, when I try to close
programs that are using either the AFS filesystem or network sockets,
they can't be killed, even with kill -9. The only recourse is to pull
the plug; even the reboot, shutdown, and init commands don't work.

The only other times I have seen something like this are on an OpenBSD
system where an NFS mount went away while a file handle was open, or
when something had exhausted the user's quota of open file handles.

Has anybody seen anything like this before? How would I go about
diagnosing it when the wherewithal to do so is crippled itself?

Thanks in advance,

-- 
Dorian Taylor
http://doriantaylor.com/