[OpenAFS-devel] Why do afsd daemons loop tightly after receiving a SIGHUP?

Derek Atkins warlord@MIT.EDU
02 Aug 2001 21:42:59 -0400


Jean-Marc Saffroy <saffroy@ri.silicomp.fr> writes:

> Maybe ignoring signals would prevent this behaviour?  If afsd
> processes are not supposed to receive signals, then I guess that this
> solution would be harmless, right?

Unfortunately, no.  There is no way to ignore signals from within the
kernel.  The problem here is that in normal operation syscalls are
relatively short-lived.  An application that ignores signals will
still _receive_ them, but they just get ignored.  In our case, the
syscall never returns, so the scheduler continually tries to wake up
the thread to entice the syscall to exit (which never happens).

The fix, if it is possible, is for the kernel thread to clear the
pending signal on itself so that the scheduler stops trying to wake
it up.  Most likely the problem is that we are not clearing the
signal properly, or we are not clearing it in all the right places.
But the fix is not as simple as you imagine.

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available