[OpenAFS-devel] Re: progress... sortof...

Nickolai Zeldovich kolya@MIT.EDU
Wed, 28 Apr 2004 18:24:07 -0700


On Wed, 2004-04-28 at 20:05 -0400, Nathan Neulinger wrote:
> Found where it's spinning... Added some instrumentation to the relevant
> routines... DAMN PutVCache gets called a lot... I see about a dozen
> routines in the kernel code that I _REALLY_ would like to know why they
> aren't inlined...

I'd be very surprised if the call/return overhead you incur by not
inlining these calls amounts to anything noticeable...

> Also seems like there is some serious room for optimization of the
> xcache lock when doing lots of putvcache ops in a row... Another
> time...

Similarly, under GLOCK, obtaining and releasing locks is just a simple
increment or decrement operation.  And in fact, these lock calls _are_
being inlined, although I'd rather they were real functions (this may
actually speed things up on modern CPUs).
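
To illustrate what I mean by "just an increment or decrement", here is a
rough user-space sketch.  The struct and function names are made up for
illustration and are not the actual AFS lock macros; the assumption is
that the caller already holds the global lock, so plain integer
operations are enough:

```c
#include <stdbool.h>

/* Hypothetical lock record for illustration; not the real AFS lock struct. */
struct toy_lock {
    int  readers;     /* number of current read holders */
    bool write_held;  /* true while a writer holds the lock */
};

/* Assumes the caller holds the global lock, so no atomics are needed.
 * Returns false on contention; real code would sleep instead. */
static inline bool toy_obtain_read(struct toy_lock *l)
{
    if (l->write_held)
        return false;
    l->readers++;     /* uncontended fast path: a single increment */
    return true;
}

/* Releasing is a single decrement. */
static inline void toy_release_read(struct toy_lock *l)
{
    l->readers--;
}
```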

> Basically seems to get stuck hitting B repeatedly as fast as it can
> without hitting A or C. Works fine for a while, but then gets stuck here.
> 
> iov_len and uio_offset are unsigned, but uio_resid is not... iov_base
> is a void *. If somehow iov_len underflowed, it would likely cause it
> to loop forever...
> 
> I'm going to strip out my existing instrumentation and add some more to
> this loop, but if you think of anything useful here, fire me a note,
> cause I'm able to reproduce this very reliably now on some machines.
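
On the underflow theory: unsigned arithmetic does wrap silently, so it
is at least plausible.  A minimal user-space sketch of the effect
(the helper names are hypothetical, not from the AFS code):

```c
#include <stddef.h>

/* Hypothetical helper: subtract n consumed bytes from a remaining length.
 * If n > len, the unsigned result wraps around to a huge value. */
static size_t consume_unchecked(size_t len, size_t n)
{
    return len - n;
}

/* Same idea, but clamps to 0 instead of wrapping. */
static size_t consume_checked(size_t len, size_t n)
{
    return (n > len) ? 0 : len - n;
}
```

With the unchecked version, consuming 8 bytes from a remaining length of
4 leaves a length close to SIZE_MAX, so a loop that runs "while length is
nonzero" would keep going for a very long time.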

Maybe you're trying to read past EOF in some files, so that FOP_READ
returns 0 and you loop forever.  Whatever the cause, the robust thing to
do is probably to break out of the loop when you get a 0 return value.
Try that, along with a printk to see whether that workaround is being
hit at all?
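
Roughly what I have in mind, as a user-space sketch rather than a patch.
Here fake_read() is a hypothetical stand-in for FOP_READ, fprintf stands
in for printk, and the 4-byte chunk size is arbitrary:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory "file" used only for this illustration. */
struct fake_file {
    const char *data;
    size_t size;
};

/* Stand-in for FOP_READ: returns bytes copied, or 0 at or past EOF. */
static long fake_read(const struct fake_file *f, size_t off,
                      char *buf, size_t len)
{
    size_t n;

    if (off >= f->size)
        return 0;
    n = f->size - off;
    if (n > len)
        n = len;
    memcpy(buf, f->data + off, n);
    return (long)n;
}

/* Read up to `want` bytes starting at `off`, in small chunks.
 * Returns the number of bytes read; sets *hit_zero to 1 if the loop
 * ended because a read returned 0. */
static size_t read_all(const struct fake_file *f, size_t off,
                       char *buf, size_t want, int *hit_zero)
{
    size_t done = 0;

    *hit_zero = 0;
    while (done < want) {
        size_t chunk = want - done;
        long code;

        if (chunk > 4)
            chunk = 4;
        code = fake_read(f, off + done, buf + done, chunk);
        if (code < 0)
            break;              /* error: give up */
        if (code == 0) {        /* no progress: bail out instead of spinning */
            *hit_zero = 1;
            fprintf(stderr, "read returned 0 at offset %zu\n", off + done);
            break;
        }
        done += (size_t)code;
    }
    return done;
}
```

Without the code == 0 check, asking for more bytes than the file holds
would spin forever, since done never advances.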

-- kolya