[OpenAFS-devel] 1.4.0-rc4 weirdness

Jim Rees rees@umich.edu
Tue, 15 Nov 2005 14:06:08 -0500


  The interesting thread will probably be the CheckHost thread.

Maybe, and I've got a patch for this that I'm testing now.  But I think what
Christopher is seeing is a different problem.  The bug I'm chasing makes all
the worker threads hang waiting for more space in the callback table.  The
server eventually recovers.

The problem Christopher describes shows many calls waiting for a thread, and
yet the pstack shows many threads waiting for a call.  And the server never
recovers.  Looks like the worker threads aren't waking up, or aren't finding
the calls when they do.
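
One generic way to get calls waiting for threads while threads wait for calls
is a lost wakeup on the queue's condition variable.  Whether that is what
happens here is not established.  The sketch below is illustrative only: it is
Python, not the fileserver's C, it is not the Rx code, and CallQueue and
run_pool are made-up names.  It shows the usual pattern that avoids the
problem: enqueue and signal under the same lock the workers test under, and
have each worker re-check the queue in a loop after every wakeup.

```python
import threading
from collections import deque


class CallQueue:
    """Hypothetical stand-in for a server's incoming-call queue."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._calls = deque()
        self._closed = False

    def put(self, call):
        # Enqueue and signal while holding the lock the waiters test under,
        # so a worker cannot see "empty" and then miss this wakeup.
        with self._cond:
            self._calls.append(call)
            self._cond.notify()

    def close(self):
        # Wake every idle worker so it can notice the shutdown.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        with self._cond:
            # Re-check the predicate after every wakeup; a bare "if" here
            # is one way a thread can wake up and fail to find its call.
            while not self._calls and not self._closed:
                self._cond.wait()
            if self._calls:
                return self._calls.popleft()
            return None  # closed and fully drained


def run_pool(n_workers, n_calls):
    """Push n_calls through n_workers threads; return the handled calls."""
    queue = CallQueue()
    handled = []
    handled_lock = threading.Lock()

    def worker():
        while True:
            call = queue.get()
            if call is None:
                return
            with handled_lock:
                handled.append(call)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for i in range(n_calls):
        queue.put(i)
    queue.close()
    for t in threads:
        t.join()
    return sorted(handled)
```

With this pattern run_pool(4, 100) should hand every queued call to some
worker and then exit.  A pstack that shows idle workers alongside a non-empty
queue would suggest one of the two steps (signal under the lock, or re-check
after wakeup) is being skipped somewhere.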

The CheckHost loop does hog host locks, but only one at a time.