[OpenAFS-port-darwin] 10.4.11 troubles
Jonas Maebe
jonas.maebe@elis.ugent.be
Mon, 10 Dec 2007 12:00:45 +0100
Hello,
Since upgrading to Mac OS X 10.4.11, I have twice experienced
the following problem: suddenly, both the kernel and mds go full
blast (each using close to 100% CPU on my dual G5), and the system
log is deluged by a barrage of the following message:
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! drog events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_eve add_event: event queue is
full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
Dec 10 11:31:38 bigmac kernel[0]: fs_events: add_event: event queue
is full! dropping events.
[etc]
(as you can see, syslogd sometimes can't even keep up)
The first time this happened, it was while running a test program (on
an AFS volume) which first opens and then closes as many files as
possible (I believe with an upper limit of 100 files, but I'm not
certain and currently cannot find it anymore). The second time, it
was at the very end of an "svn update" (again on an AFS volume). It
had apparently fully completed, as after the reboot "svn cleanup" did
not mention anything that had to be cleaned up. Afaik, svn also
closes a lot of locking files at the end of an update.
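For anyone who wants to try to trigger this, here is a rough sketch of that kind of open-then-close test. It is not the original test program (which I cannot find); the function name, the file name pattern and the limit of 100 are assumptions for illustration only. To exercise AFS, the directory would have to be one on an AFS volume; by default the sketch only uses a local temporary directory.

```python
import os
import tempfile


def open_then_close(directory, limit=100):
    """Open up to `limit` new files in `directory`, then close them all.

    Returns the number of files that were successfully opened.
    The created files are left in `directory`.
    """
    handles = []
    try:
        for i in range(limit):
            # Hypothetical file name pattern, chosen for this sketch only.
            path = os.path.join(directory, "fdtest_%03d.tmp" % i)
            try:
                handles.append(open(path, "w"))
            except OSError:
                # Out of file descriptors (or another I/O error): stop opening.
                break
    finally:
        # Close everything in one burst, after all the opens are done.
        for handle in handles:
            handle.close()
    return len(handles)


if __name__ == "__main__":
    # Local scratch directory by default; replace with a directory on an
    # AFS volume to mimic the situation described above.
    with tempfile.TemporaryDirectory() as scratch:
        print(open_then_close(scratch))
```

Whether a burst of closes like this is really what fills the fs_events queue is only a guess on my part.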
I cannot reproduce the problem reliably though.
Once the kernel and mds are in that cycle, there's no way to kill -9
the triggering process, and killing mds doesn't help either.
Rebooting doesn't work either (it hangs, presumably while trying to
kill the hanging process), and a forced reboot is required.
One more thing: both cases were with a prerelease of OpenAFS 1.4.5
(the cvs version in which all the panics had been fixed). I don't
know whether anything committed later on could solve this problem,
so I've now installed the official 1.4.5 release.
Is there anything I can do to further debug this problem?
Jonas