[OpenAFS] Cache partition choice still limited to ext2 on Linux?

Troy Benjegerdes hozer@hozed.org
Wed, 7 Nov 2012 15:25:40 -0600


> > Last time I used memcache, I had issues with Java applications (Eclipse,
> > SQLDeveloper). They brought the system to high load until they were finally
> > OOM-killed when run under KDE on a machine with 4G RAM (512M or 1G of which
> > was set apart for the memcache).
> 
> In my (limited) experience with memcache, it doesn't behave very well
> if the system is memory constrained and is under pressure.

Most network filesystems either explode or go really slow if the system
is memory constrained. In HPC systems (Cray, in particular), there is no
disk swap, and lots of effort is expended to ensure that the resident set
size of whatever is running is less than 85% of available memory, 'wasting'
5-10% of RAM. You can in theory overcommit more, and keep all your RAM
busy, but you are likely to slow down (or take out) the network filesystem
in some edge case, which then tends to bring everything to a halt because
you start evicting pages belonging to something like, say, sshd, which then
has to go back to the network filesystem to pull them back in when the
administrator tries to figure out why in the world this thing's slow.

By the time you hit this situation, users and administrators restart the
node because it's 'not responding', even though if you just gave it 15
minutes, the OOM killer might eventually kick in and kill the memory hog
application (or the browser with too many open tabs).

My opinion is that this situation would be better if more applications
handled 'connection timeout' I/O errors gracefully, but most seem to
hammer on the filesystem with retries in that case.
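
As a rough sketch of what "gracefully" could look like on the application
side (Python, purely illustrative): retry a timed-out read a small, bounded
number of times with a growing delay, then give up and report the error
instead of hammering the filesystem. The helper name read_with_backoff, the
attempt count, and the delays are all made up for this example; nothing here
is an OpenAFS API or a recommended setting.

```python
import errno
import time

# Assumption for this sketch: only timeouts are worth retrying.
RETRYABLE = {errno.ETIMEDOUT}


def read_with_backoff(path, attempts=4, base_delay=0.5,
                      opener=open, sleep=time.sleep):
    """Read a file, retrying timeouts a bounded number of times.

    Hypothetical helper. The delay doubles after each failed attempt,
    so a struggling filesystem sees fewer requests, not more.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            with opener(path, "rb") as f:
                return f.read()
        except OSError as exc:
            # Give up on non-timeout errors, or once the budget is spent,
            # and let the caller report the failure to the user.
            if exc.errno not in RETRYABLE or attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
```

With the defaults above, a file that keeps timing out costs four attempts
and 3.5 seconds of waiting in total before the error reaches the caller,
rather than an open-ended retry loop.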