[OpenAFS-devel] A few questions about the current Linux implementation of the AFS client

Matt Peterson matt@caldera.com
Mon, 21 Jan 2002 12:04:13 -0700


Derrick,

On Monday 21 January 2002 10:53 am, Derrick J Brashear wrote:
> On Mon, 21 Jan 2002, Matt Peterson wrote:
> > It is common to see the OpenAFS client become tied up in an afs_syscall
> > that consumes 100% of the CPU.
>
> Have you seen this in cases where afsd has not had signals sent to it?

I was not clear in my previous email: it is not only afsd that becomes tied
up in syscalls, but any process that happens to be using AFS.  A very common
example is interrupting a cp command with SIGINT.  Let's say that I am
copying a very large file from a local device to somewhere under /afs.  If I
press ctrl-c in bash, the cp process is sent SIGINT.  As soon as this
happens, cp begins to take 100% of the CPU.

The point is that it is not only afsd that can't handle signals, but any
process that happens to be in the wrong place in libafs when a signal is
received.  On Friday I had half an hour to debug the problem.  I need more
time to look at it, but my initial suspect is afs_osi_sleep().
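
To make that suspicion concrete, here is a minimal userspace sketch of the
kind of loop that could produce this symptom.  It is only a model, not code
from libafs: struct waiter, model_sleep() and both wait functions are
hypothetical names, and it assumes that an interruptible sleep returns
immediately while a signal is pending.  Whether afs_osi_sleep() really
behaves this way is still unconfirmed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a sleeping process; not an OpenAFS structure. */
struct waiter {
    bool event_fired;     /* the condition the sleeper is waiting for */
    bool signal_pending;  /* stands in for "a signal is queued for us" */
};

/*
 * Models an interruptible sleep.  Returns false if it came back at once
 * because a signal is pending, true if it "blocked" until the event fired.
 */
static bool model_sleep(struct waiter *w)
{
    if (w->signal_pending)
        return false;        /* no blocking happens, so no CPU is yielded */
    w->event_fired = true;   /* pretend we slept until the event arrived */
    return true;
}

/*
 * Suspected pattern: keep re-sleeping until the event fires and never look
 * at the pending signal.  The spin count is capped at max_spins only so
 * that the model terminates; the real loop would have no such cap.
 */
static long wait_ignoring_signals(struct waiter *w, long max_spins)
{
    long spins = 0;
    while (!w->event_fired && spins < max_spins) {
        if (!model_sleep(w))
            spins++;         /* woke up instantly: this is the busy loop */
    }
    return spins;
}

/* Alternative: notice the pending signal and give up with EINTR. */
static int wait_checking_signals(struct waiter *w)
{
    while (!w->event_fired) {
        if (w->signal_pending)
            return EINTR;    /* let the caller unwind and handle the signal */
        model_sleep(w);
    }
    return 0;
}
```

With signal_pending set and the event not yet fired,
wait_ignoring_signals() uses up every spin it is allowed, while
wait_checking_signals() returns EINTR on the first pass.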

Later this week I should have a little more free time and I can offer more 
details...

> We have patches on hand which should fix the signal problem but basically
> currently if you signal afsd it will chew up CPU because it never handles
> the signals. The patches

This is great news.  Perhaps I'm wasting time looking for a fix if it is
already fixed?  Should these fixes be in the "daily snapshots"?

-- 
Matt Peterson
Sr. Software Engineer
Caldera, Inc
matt@caldera.com