[OpenAFS] Can't get this going on Coraid CLN22 (Debian).

Ed L. Cashin ecashin@coraid.com
Fri, 30 Mar 2007 15:36:16 -0400


On Thu, Mar 29, 2007 at 08:00:49PM -0500, Tony Shadwick wrote:
> I won't call it "fixed", but with much help from the guys in #openafs, 
> we did get things working.

That's great!

...
> The stack size is set to 8192.  We had to change that to unlimited,
> then things started working, so ulimit -s unlimited.

I see.

> Ed, if you see this...any thoughts on what might cause this?

Well, an OpenAFS process probably has a large array or similar data
structure on its stack (usually a function-local variable in a C
program).

The ulimit is a per-process resource limit that prevents a process
from using a large amount of memory for stack space.  On the CLN or
another server, especially one without swap space, that limit can help
to prevent a greedy user process from consuming the RAM that the
system needs to perform well.
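
To make the arithmetic concrete, here is a minimal sketch in C that
reads the current stack limit and checks whether a local buffer of a
given size would fit under it.  The function names and the headroom
parameter are made up for illustration and are not taken from OpenAFS.
Note that "ulimit -s" reports the limit in units of 1024 bytes, so
8192 means 8 MiB.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns 1 if a function-local buffer of `bytes`
 * would fit under the soft stack limit `soft`, leaving `headroom` bytes
 * for the rest of the call chain, and 0 otherwise.  Assumes that
 * bytes + headroom does not overflow.
 */
int fits_on_stack(rlim_t soft, unsigned long long bytes,
                  unsigned long long headroom)
{
    if (soft == RLIM_INFINITY)
        return 1;
    return (unsigned long long)soft >= bytes + headroom;
}

/*
 * Print the soft stack limit of the calling process, in KiB, which is
 * the same unit that "ulimit -s" uses.  Returns 0 on success, -1 on
 * error.
 */
int print_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("soft stack limit: unlimited\n");
    else
        printf("soft stack limit: %llu KiB\n",
               (unsigned long long)rl.rlim_cur / 1024);
    return 0;
}
```

With the 8192 KiB limit from your report, fits_on_stack(8192 * 1024,
16ULL * 1024 * 1024, 0) returns 0, so a hypothetical 16 MiB local
array would overrun the stack.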

The setting is a trade-off.  By removing the limit, you give processes
greater freedom while losing the stability that the limit can provide.

In the end, user processes can usually perform a denial of service
attack somehow on the local host, whether it's with the notorious
"fork bomb" or some more insidious exploitation of a weakness in the
kernel.  Still, multi-level security is a good policy.

> I've been instructed to file a bug report on openafs-bugs, and to debian 
> regarding the package, as the /etc/init.d/openafs-fileserver script has 
> to be modified to do ulimit -s unlimited at each startup, as the setting 
> is a per-session thing.  Speculation as to the cause is welcome.

A per-session setting sounds like a good solution.
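
The reason it works per session is that resource limits belong to a
process and are inherited by its children across fork and exec, so
whatever the init script's shell sets is what the daemons it starts
will get.  As a rough sketch of the same idea in C (the function name
is made up, and this only raises the soft limit as far as the existing
hard limit, which is narrower than "ulimit -s unlimited"):

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft stack limit of the calling
 * process to its hard limit.  Children started afterwards inherit the
 * new value.  Raising the hard limit itself would need privilege and
 * is not attempted here.  Returns 0 on success, -1 on error.
 */
int raise_stack_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    rl.rlim_cur = rl.rlim_max;  /* soft limit may go up to the hard limit */
    if (setrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

A launcher would call this once before starting the server process, so
the change stays confined to that process tree instead of applying
system-wide.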

> Please don't think a small thing of this.  I've spent well over 40 
> hours, along with the help of several people to weed this out!

Yes, it sounds like it was quite a lot of work.  I'm glad that the
OpenAFS developers were so helpful and responsive, and I hope that
your solution will be found by others in the mailing list archives.

Congrats on recruiting allies and tracking it down!

-- 
  Ed L Cashin <ecashin@coraid.com>