[OpenAFS] Can't get this going on Coraid CLN22 (Debian).

Ed L. Cashin ecashin@coraid.com
Fri, 30 Mar 2007 16:45:09 -0400


On Fri, Mar 30, 2007 at 03:23:36PM -0500, Tony Shadwick wrote:
> I think what I'm going to try then today is plug a spare SATA hard drive 
> into the SR1520, set it up as a raid0, let the CLN pick it up, and 
> designate it as swap (or I guess I could use LVM and do the same?) and 
> see if that fixes things as well.  If so, then the simplest thing to 
> instruct CLN users to do is make sure they allot swap prior to 
> attempting OpenAFS.

With the Linux kernel as it exists today, I'm not sure it's a good
idea to swap on AoE (or other network-based) storage, because there
isn't yet a good way for the packets that say, "Yes, the data made it
to disk," to get processed without using up memory themselves.

When the kernel flushes dirty pages to network storage in order to
free up memory for other uses, there is a potential for deadlock:
freeing memory requires receiving the write acknowledgements, and
receiving them requires free memory.  Usually you can tell the Linux
VM subsystem to keep enough free pages around to receive network
packets, but that lowers the risk rather than removing it.
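
For anyone who wants to look at that reserve: the knob I'd expect to
matter here is the vm.min_free_kbytes sysctl, though treat that as an
assumption, since nothing above names it.  A minimal Python sketch
that reads the current value and compares it against a target; the
helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: vm.min_free_kbytes is the relevant knob; the post does not name it.
MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")


def read_min_free_kbytes(path=MIN_FREE_PATH):
    """Return the free-memory reserve the kernel tries to keep, in kB."""
    return int(Path(path).read_text().strip())


def needs_raise(current_kb, wanted_kb):
    """True if the current reserve is below the value we want."""
    return current_kb < wanted_kb
```

Calling needs_raise(read_min_free_kbytes(), 65536) on the CLN would
say whether the reserve is under 64 MB.  That figure is only an
example, not a recommendation, and actually changing the value needs
root (sysctl -w vm.min_free_kbytes=<value>).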

A more likely solution would be to find a stack-size ulimit that
works for the OpenAFS process and still catches crazy stack
allocations, or else for the OpenAFS code to switch from
stack-based memory to malloced memory.
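
To experiment with the ulimit half of that, something like the
following Python sketch can start a process under a chosen soft stack
limit without touching the limit of the calling shell.  The function
name is made up, and the right limit for OpenAFS is exactly the
unknown here, so any number passed in is a guess to be tuned.  Note
that RLIMIT_STACK is in bytes, while "ulimit -s" takes kilobytes.

```python
import resource
import subprocess
import sys


def run_with_stack_limit(argv, stack_bytes):
    """Run argv with its soft stack limit (ulimit -s) set to stack_bytes."""
    def lower_stack_limit():
        # Runs in the child between fork and exec, so the parent is untouched.
        _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
        new_soft = stack_bytes
        if hard != resource.RLIM_INFINITY and new_soft > hard:
            new_soft = hard  # the soft limit can never exceed the hard limit
        resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))

    return subprocess.run(argv, preexec_fn=lower_stack_limit,
                          capture_output=True, text=True)
```

Running it against a small probe such as
[sys.executable, "-c", "import resource; print(resource.getrlimit(resource.RLIMIT_STACK)[0])"]
shows the limit the child actually received.  This covers only the
ulimit approach; moving large buffers from the stack to malloc is a
change to the OpenAFS source itself.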

-- 
  Ed L Cashin <ecashin@coraid.com>