[OpenAFS-devel] Very simple patch for libafs CPU hog on signal

Derek Atkins warlord@MIT.EDU
28 Jan 2002 19:09:40 -0500


Matt,

The issue is exactly what you said:  interrupting read() or write().
A process that calls a read() can wind up blocking in this same
code.  If you send the process a signal (e.g. SIGINT) while it's
sitting in this loop, your proposed patch will, in effect, cause the
process to lose that signal.
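
A user-space sketch of the behavior at stake, for illustration only (this
is not libafs code, and the helper name blocked_read_is_interrupted() is
made up for the example): a process blocked in read() on an empty pipe
receives a signal whose handler was installed without SA_RESTART, so
read() returns -1 with errno set to EINTR instead of staying blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/* Handler only records that the signal arrived. */
static void on_alarm(int signo)
{
    (void)signo;
    got_signal = 1;
}

/*
 * Hypothetical helper: returns 1 if a blocked read() was interrupted
 * with EINTR and the handler ran, 0 if not, -1 on setup failure.
 */
int blocked_read_is_interrupted(void)
{
    int fds[2];
    char buf[1];
    struct sigaction sa, old_sa;
    ssize_t n;
    int saved_errno;

    if (pipe(fds) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: read() must not be restarted */
    if (sigaction(SIGALRM, &sa, &old_sa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    got_signal = 0;
    alarm(1);                   /* deliver SIGALRM in about one second */
    n = read(fds[0], buf, sizeof buf);  /* blocks: pipe is empty, writer still open */
    saved_errno = errno;
    alarm(0);                   /* cancel the alarm if it has not fired */

    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);

    return (n == -1 && saved_errno == EINTR && got_signal == 1) ? 1 : 0;
}
```

If the sleep underneath read() swallowed the signal instead, the call
would simply stay blocked and the caller would never see EINTR.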

This is a Bad Thing (TM).  The only time you want to always ignore the
signal is in the afsd processes.
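
For contrast, a rough user-space analogue of a daemon-style context that
must not be disturbed (again an assumption for illustration, not how
libafs or afsd is implemented; blocked_signal_is_deferred() is a made-up
name): the signal is blocked around the wait, so it stays pending and is
delivered once the mask is restored, rather than being dropped.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handled = 0;

/* Handler only records that the signal was delivered. */
static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/*
 * Hypothetical helper: returns 1 if a signal raised while blocked stayed
 * pending and was delivered after unblocking, 0 if not, -1 on setup failure.
 */
int blocked_signal_is_deferred(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, pending;
    int ran_while_blocked, was_pending, delivered;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
        return -1;

    /* Enter the "do not disturb" region: block SIGUSR1. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0) {
        sigaction(SIGUSR1, &old_sa, NULL);
        return -1;
    }

    handled = 0;
    raise(SIGUSR1);                     /* signal arrives while blocked */
    ran_while_blocked = handled;        /* handler must not have run yet */

    sigemptyset(&pending);
    sigpending(&pending);
    was_pending = sigismember(&pending, SIGUSR1);

    /* Leave the region: the pending signal is delivered now. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    delivered = handled;

    sigaction(SIGUSR1, &old_sa, NULL);
    return (!ran_while_blocked && was_pending == 1 && delivered == 1) ? 1 : 0;
}
```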

See kolya's proposed patch which does have this property.

-derek

Matt Peterson <matt@caldera.com> writes:

> Derek,
> 
> On Monday 28 January 2002 04:28 pm, Derek Atkins wrote:
> > Question: what happens if you really DO want to interrupt the
> > process?  Won't this cause it to fail to interrupt even when the
> > process returns from the AFS Syscall?
> >
> 
> I'm not sure I understand the question...
> 
> I don't see any problems with cases where you really do want to interrupt the 
> task from its call to afs_osi_Sleep().  The proposed patch does absolutely 
> nothing to affect the workings of calls made to afs_osi_Wakeup().  As you'll 
> notice, neither of these calls (afs_osi_Sleep() nor afs_osi_Wakeup()) directly 
> change the state of the "current" task.  Changing the state of the "current" 
> task is left to the kernel scheduler via the calls to 
> interruptible_sleep_on() and wake_up().
> 
> Additionally, the processes that exhibit the tight loop problem are those 
> that are buried in "daemon" syscalls -- they never return back to user space 
> unless a terminating call is made via another syscall (i.e. unmount /afs).
> 
> When a terminating call is made, *that* process grabs the afs lock and sets 
> the data structures that cause the otherwise terminally looping "daemon" 
> syscalls to exit.  Then those processes "sleeping" in the looping "daemon" 
> calls are woken via calls to afs_osi_Wakeup().
> 
> Nowhere in any of this are signals honored or cared about.  Therefore, it 
> seems safe to ignore them.  In fact, there is code elsewhere in libafs that 
> is intended to ignore all signals.  The patch I propose is simply to ignore 
> them in one more place where they are not being ignored.
> 
> Now, if there is interest in actually making the afs syscalls respond 
> appropriately to signals, there appears to be a lot more work to do.  This is 
> what I meant by the following statement:
> 
> > Matt Peterson <matt@caldera.com> writes:
> > Still, signals are not handled the way I'd like them to be, but at least
> > with this fix, processes interrupted in certain libafs syscalls do not
> > completely hog the CPU.
> 
> Specifically, I'd like to be able to interrupt processes that are blocking in 
> calls to read() and write().  This way I will be able to kill a copy (cp) or 
> listing (ls) process that is taking too long because it is blocking in the 
> afs kernel module.
> 
> I'd be happy to discuss the pros and cons of various approaches to proper 
> signal handling in libafs.  Again, I haven't looked at all the code, but it 
> looks like it might take a little work to streamline.  
> 
> I agree that this streamlining *might* mean changing the proposed patch, but 
> for now, I don't think the patch does any harm to the legitimate wakeup code.
> 
> 
> -- 
> Matt Peterson
> Sr. Software Engineer
> Caldera, Inc
> matt@caldera.com

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available