[OpenAFS-devel] Same problem in a different place.

Matt Peterson matt@caldera.com
Tue, 29 Jan 2002 17:14:25 -0700


Nickoali,

I was testing what I think is the latest code (osi_sleep.c rev 1.11) that 
appears to have your change we've discussed on the list.

What I found is that there is still one more place we are seeing cpu hog.  I 
spent some time and tracked it all the way through a ton of code to rx_Read() 
and eventually to CV_WAIT() in rx/LINUX/rx_kmutex.h.  

In rx/rx_rdwr.c:225 there is a while() loop that "waits" for a in-sequence 
datagram.  It looks like the following:

          while (call->flags & RX_CALL_READER_WAIT) {
#ifdef  RX_ENABLE_LOCKS
              CV_WAIT(&call->cv_rq, &call->lock);
#else
              osi_rxSleep(&call->rq);
#endif
          }

This loop produces the same behavior as the one you just fixed in 
LINUX/osi_sleep.c in that any task that receives a signal when waiting for 
rcpt of a packet will loop very tightly.

As you can imagine, this behavior is extremely easy to reproduce and often 
manifests itself when users expect the CTRL-C keystroke (or SIGINT) to 
terminate a cp, or ls command that is taking too long.

As a short term solution I'd recommend that you put the same fix you put in 
LINUX/osi_sleep.c into the CV_WAIT() function in LINUX/rx_kmutex.h.

Still, a long term solution for both "signal during wait" problems needs 
to be put into the code.  I should have a few cycles in the next few days and 
would be happy to help.  

If you make the changes to add the additional parameter to the "sleep/wait" 
functions as you proposed earlier

   int afs_osi_Sleep(char *event, int aintrok);

and now...?

   int CV_WAIT(afs_kcondvar_t *cv, afs_kmutex_t *l, int aintrok)

I can help scour the code for places that the parameters should be used so 
that signals are handled in the appropriate places.  

Actually, since afs_osi_Sleep() and CV_WAIT() are used so frequently, it 
might be easier to make just make two new calls:

   int afs_osi_SleepHonorSig(char *event, int aintrok);
   int CV_WAIT_HONORSIG(afs_kcondvar_t *cv, afs_kmutex_t *l, int aintrok)

This way you could #define them to the existing (non-signal calls) for 
non-linux platforms.  Slowly as time and review permits you can replace the 
old calls with the "HonorSig" calls in places where it is possible to return 
the appropriate error code to userland.

-- 
Matt Peterson
Sr. Software Engineer
Caldera, Inc
matt@caldera.com