[OpenAFS] What filesystem?

Hendrik Hoeth hendrik.hoeth@cern.ch
Wed, 8 Mar 2006 00:37:45 +0100


Thus spake KELEMEN Peter (Peter.Kelemen@cern.ch):
> > I've seen too many software raids (x86, any linux 2.4; never
> > tried 2.6 anymore) corrupt the files under high load, so I trust
> > them less than I trust a single USB harddisk. [...]
> That's very interesting, what kind of load do you expose the RAID to
> in order to trigger corruption?  I've seen many software RAID arrays
> as well, and not a single corruption (due to the RAID layer itself).

A single saturated gigabit link should be enough to trigger it. The
first time we observed problems with corrupt files on software raids was
on a machine with a single gigabit link when a handful of clients
accessed files simultaneously. After a bit of experimenting we traced it
to high-load situations, and then we could easily reproduce the problem
by copying files from several machines onto the raid.

When we got to test the commercial storage I was talking about, we had
already burned our fingers with software raids, and thus we knew what to
try. This system was supposed to act as an NFS server for a 500 node
cluster and it provided two gigabit interfaces. We connected the
interfaces to two different switches of the cluster, and from 50 nodes
per interface we copied large (1 GB) files to the raid. The load on the
machine was acceptable and the speed was ok, but in the end not a single
file had the right md5sum, and in the kernel log we found hundreds of
error messages about the software raid.
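
For anyone who wants to repeat the check, here is a rough sketch of the
comparison step in Python. It is not the script we used; the function
names md5_of_file and find_corrupt_copies are made up for illustration,
and it assumes the source file and the copies on the raid are both
readable from the machine doing the check.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large (1 GB) files do not need to fit in RAM.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_copies(source, copies):
    """Return the paths in copies whose MD5 digest differs from source's."""
    expected = md5_of_file(source)
    return [str(copy) for copy in copies if md5_of_file(copy) != expected]
```

Call find_corrupt_copies with the original file and the list of copies
written to the raid; an empty list means every copy matched.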

I've never tried it on linux 2.6 though.



Fashion is a form of ugliness so intolerable that we have to alter
it every six months.      -- Oscar Wilde