[OpenAFS-devel] Very simple patch for libafs CPU hog on signal

Matt Peterson matt@caldera.com
Tue, 29 Jan 2002 10:53:02 -0700


Hi,

I finally received the list email with Nickolai's patch (included below for 
context) 

--- src/afs/LINUX/osi_sleep.c   2002/01/24 20:13:22     1.10
+++ src/afs/LINUX/osi_sleep.c   2002/01/28 23:55:55
@@ -223,9 +223,22 @@
     seq = evp->seq;
 
     while (seq == evp->seq) {
+       sigset_t saved_set;
+
        AFS_ASSERT_GLOCK();
        AFS_GUNLOCK();
+       spin_lock_irq(&current->sigmask_lock);
+       saved_set = current->blocked;
+       sigfillset(&current->blocked);
+       recalc_sigpending(current);
+       spin_unlock_irq(&current->sigmask_lock);
+
        interruptible_sleep_on(&evp->cond);
+
+       spin_lock_irq(&current->sigmask_lock);
+       current->blocked = saved_set;
+       recalc_sigpending(current);
+       spin_unlock_irq(&current->sigmask_lock);
        AFS_GLOCK();
     }
     relevent(evp);

Later, Nickolai also wrote the following:

> But in my patch, the sigset I'm saving is the blocked sigset, not the
> pending sigset.  When the signal is delivered during sleep, the kernel
> doesn't wake us up, because the signal is masked out in the blocked
> sigset.  When we come out of sleep, we restore the blocked set to what
> it was before (probably empty) and recompute the sigpending flag, so
> the signal will be delivered on syscall completion.

I do like Nickolai's patch better.  The main difference is that signals are 
not flushed in the while() loop they are just ignored.  I'm not completely 
familiar with how signals are pended when blocked, but if it works like 
explained above, then I agree that this is a better fix.   I'll verify the 
patch and comment later if I find any problems.

Question: Is there any reason why you could not save off current->blocked and 
restore it outside of the while() loop?

Also, can someone explain what is meant by the following?

Nickolai Zeldovich <kolya@MIT.EDU> wrote:
> Unfortunately, this has the side effect of artificially increasing the 
> system load to ~4 when AFS is idle, because the background daemons 
> block waiting for requests.

Is Nickolai is right on this one?  Tasks with a state of TASK_INTERRUPTIBLE 
do not appear to increase the system load.  If they did, you'd already see an 
increase of 4 as the daemon syscalls currently spend most of their time in 
interruptable_sleep_on() (even without Nickolai's patch).  Basically, I don't 
see how Nickolai's patch would change system load.

Ultimately, the fact remains that there is absolutely NO WAY to get out of 
afs_osi_Sleep() via a signal -- even with Nickolai's patch and regardless of 
whether or not the caller of afs_osi_Sleep() has the capability of letting 
the signal be handled by a userland caller.  The reason for this is that 
afs_osi_Sleep() (because of the (while (seq == evp->seq) ) will loop 
endlessly until _some_other_task_ calls afs_osi_Wake(). 

Nickolai's patch eliminates the tight looping problem, but still does not 
allow the caller to handle the signal in the way that you'd expect since 
afs_osi_Sleep() does not return unless something else calls afs_osi_Wake().  

If my application which is write()ing a large block of data ever gets stuck 
in afs_osi_Sleep() then there is _absolutely_no_way_ that a signal will ever 
get me out.  I have to wait until something else interrupts the sleep.  How 
long do I have to wait?

The interactions of the various daemon tasks and the "normal" VFS syscalls 
are fairly complex and I have not traced through the code long enough to 
understand them all.  If I make a call to write() which eventually (via the 
VFS calls) lands in a afs_osi_Sleep() is there any way that I can get the 
behavior that I want -- where the VFS syscall can be immediataly interrupted 
via a SIGINT?  Or do I have to wait until some afsd task wakes my sleeping 
syscall?

From what I've seen of the code (again I've only looked at it for a few 
hours), I get the feeling that in order to make the code signal safe, we will 
have to remove the while(seq == evp->seq) loop from afs_osi_Sleep().  

Each function that calls afs_osi_Sleep() (and there are many) needs to call 
signal_pending() to see if the sleep was interrupted by a signal -- then 
either ignore the signal or return from the syscall.  

Nickolai reports that code was checked into CVS (which I have not yet seen) 
that blocks signals in daemon syscalls.  If this is the case then we won't 
have to worry about checking signal_pending() in those code paths, but we 
would still need to check signal_pending() the all the code paths that are 
directly traversed by tasks that enter libafs via VFS syscalls.

At Nickolai's recommendation I am starting by looking at the afs_UFSRead() 
and afs_UFSWrite() calls.   I am also becoming more familiar with the way 
that VFS calls run through the libafs code since these are the code paths 
that need to be streamlined.  

-- 
Matt Peterson
Sr. Software Engineer
Caldera, Inc
matt@caldera.com