[OpenAFS-devel] Why do afsd daemons loop tightly after receiving a SIGHUP?

Daniel Jacobowitz dmj+afs@andrew.cmu.edu
Thu, 2 Aug 2001 19:25:54 -0700


On Thu, Aug 02, 2001 at 09:42:59PM -0400, Derek Atkins wrote:
> Jean-Marc Saffroy <saffroy@ri.silicomp.fr> writes:
> 
> > Maybe ignoring signals would prevent this behaviour? If afsd
> > processes are not supposed to receive signals, then I guess that this
> > solution would be harmless, right?
> 
> Unfortunately, no.  There is no way to ignore signals from within the
> kernel.  The problem here is that in normal operations syscalls are
> relatively short-lived.  An application that ignores signals will
> still _receive_ them, but they just get ignored.  In our case, the
> syscall never returns, so the scheduler continually tries to wake up
> the thread to entice the syscall to exit (which never happens).
> 
> The fix, if it is possible, is for the kernel thread to clear the
> signal from the thread so that the scheduler will no longer try to
> wake it up continually.  Most likely the problem is that we are not
> clearing the signal properly, or we are not clearing it in all the
> right places.  But the fix is not as simple as you imagine.

Have you actually tried ignoring the signal?  The multiple signal
delivery paths in Linux are somewhat convoluted, but the primary one
for receiving signals sent by another user process (send_sig_info)
checks ignored_signal().  If the target has the signal set to be
ignored, it should not actually be delivered at all.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer