[OpenAFS] console messages: "all buffers locked"

Rainer Toebbicke rtb@pclella.cern.ch
Tue, 27 Oct 2009 14:51:24 +0100

Simon Wilkinson wrote:

> All of that is a long winded way of saying I don't really know what's 
> causing your issue. One key diagnostic question is whether the cache 
> manager continues to operate once it's run out of buffers. If we have a 
> reference count imbalance somewhere, then the machine will never 
> recover, and will report a lack of buffers for every operation it 
> performs. If the cache manager does recover, then it may just mean that 
> we need to look at either having a larger number of buffers, or making 
> our buffer allocation dynamic. Both should be pretty straightforward, 
> for Linux at least.
> What happens to your clients once they've hit the error?

In two cases AFS continued to work. In two others, however, all AFS access now 
stops after /afs, and eventually the looong lines with 'all buffers lockedall 
buffers lockedall buffers locked' (you could add a "\n" to your patch while 
you're at it) appear in the syslog.

I'll see if I can crank the 50 up by an order of magnitude and track the 
increases. However, this *is* a stress test with about 100 parallel "jobs" per 
client, so it is not necessarily a leak yet, and even 25 simultaneous 
"FindBlobs" aren't unthinkable.
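
To make the distinction concrete (pool merely too small versus a reference 
count imbalance), here is a minimal sketch of a fixed-size, reference-counted 
buffer pool. It is an illustration only, not the OpenAFS buffer code: the 
names NBUF, struct buf, struct pool, buf_get, buf_release and pool_held are 
all hypothetical, and NBUF is set to 4 rather than 50 to keep it short.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 4  /* hypothetical pool size, not the real cache manager value */

struct buf {
    int refcount;   /* > 0 means some caller still holds ("locks") the buffer */
};

struct pool {
    struct buf bufs[NBUF];
};

/* Hand out a free buffer with refcount 1, or NULL if every buffer is held.
 * The NULL return corresponds to the "all buffers locked" condition. */
struct buf *buf_get(struct pool *p)
{
    for (size_t i = 0; i < NBUF; i++) {
        if (p->bufs[i].refcount == 0) {
            p->bufs[i].refcount = 1;
            return &p->bufs[i];
        }
    }
    return NULL;
}

/* Drop one reference. A code path that forgets to call this is the
 * "reference count imbalance": that buffer never becomes free again. */
void buf_release(struct buf *b)
{
    assert(b->refcount > 0);
    b->refcount--;
}

/* Count buffers currently held; useful for tracking growth over time. */
size_t pool_held(const struct pool *p)
{
    size_t n = 0;
    for (size_t i = 0; i < NBUF; i++)
        if (p->bufs[i].refcount > 0)
            n++;
    return n;
}
```

With balanced get/release calls the pool recovers as soon as the load drops, 
which would match the two clients that kept working. If some path skips 
buf_release, pool_held never returns to zero and every later buf_get fails, 
which would match the clients that stay stuck.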

