[OpenAFS] Re: enobufs, "No buffer space available" messages

Andrew Deason adeason@sinenomine.net
Sat, 12 Jul 2014 22:37:19 -0500

On Fri, 11 Jul 2014 17:13:44 -0700 (PDT)
Renata Maria Dart <renata@slac.stanford.edu> wrote:

> Hi, I found some earlier discussion on openafs about this error
> message popping out, it seems when a transaction has taken "longer"
> than expected.  I couldn't find anything though that described a
> "fix".  Is there a fix?  Or is the fix to get a faster server?  :-)

For the issue where you get the wrong error message, that was fixed in
1.6 as Jeff described.

For the issue where the server takes "too long", there is no single fix,
because the cause differs from one situation to the next. In some
cases, there is no fix at all, because sometimes disk operations just
take that long. Jeff mentioned one general possible cause involving
callbacks, but other things can cause it, too. If it is
disk-performance-related, running the fileserver with '-sync never' can
sometimes help. The only sure-fire way to find the cause is to raise the
debugging level when it happens, or to capture a core or pstack, etc.
Anything that makes the operation take longer than a minute on the
server side triggers the error.
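
As a rough illustration of "raise the debugging level when it happens",
here is a minimal Python sketch. It assumes the behavior documented in
fileserver(8): the fileserver steps its debug level up each time it
receives SIGTSTP and resets it to 0 on SIGHUP, with the extra output
going to FileLog. The function names and the `send` parameter are
hypothetical helpers for this sketch, not part of OpenAFS.

```python
import os
import signal

# Assumption (see fileserver(8)): the OpenAFS fileserver raises its debug
# level by one step per SIGTSTP and resets it to 0 on SIGHUP.


def raise_debug_level(pid, send=os.kill):
    """Send one SIGTSTP to process `pid` to step its debug level up."""
    send(pid, signal.SIGTSTP)


def reset_debug_level(pid, send=os.kill):
    """Send SIGHUP to process `pid` to return its debug level to 0."""
    send(pid, signal.SIGHUP)
```

You would call raise_debug_level() with the fileserver's pid (as root,
on the server) while the slow operation is in progress, then
reset_debug_level() afterwards so the log does not keep growing. The
`send` argument only exists so the helpers can be exercised without
signalling a real process.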

For the issue of the ridiculous behavior where a server kills a call
after it has run for a minute, that is not yet fixed. There is a series
of changes starting with http://gerrit.openafs.org/10773 that is a
proposed fix, but it's not a priority for me at the moment, so it's not
getting attention. Changing timeout-related behavior takes some care, to
make sure nothing else breaks at the same time.

> The discussion that I am referring to took place in October of last year
> with the subject "No buffer space available" reported by Stephan
> Wiesand.
> This is the last of the discussion that I have found:

For reference, I believe this refers to the following thread:

Andrew Deason