[OpenAFS] Overview? Linux filesystem choices
Robert Milkowski
milek@task.gda.pl
Thu, 30 Sep 2010 20:48:45 +0100
On 30/09/2010 13:19, Stephan Wiesand wrote:
> Hi Jeff,
>
> On Sep 29, 2010, at 22:18 , Jeffrey Altman wrote:
>
>
>> RAID is not a replacement for ZFS. ZRAID-3 protects against single bit
>> disk corruption errors that RAID cannot. Only ZFS stores a checksum of
>> the data as part of each block and verifies it before delivering the
>> data to the application. If the checksum fails and there are replicas,
>> ZFS will read the data from another copy and fixup the damaged version.
>> That is what makes ZFS so special and so valuable. If you have data
>> that must be correct, you want ZFS.
>>
>
> you're right, of course. This is a very desirable feature, and the main reason why I'd love to see ZFS become available on linux.
>
> I disagree on the "RAID cannot provide this" statement though. RAID-5 has the data to detect single bit corruption, and RAID-6 even has the data to correct it. Alas, verifying/correcting data upon read is not a common feature. I know of just one vendor (DDN) actually providing it. It's a mystery to me why the others don't.
>
>
Most RAID controllers do not check parity on reads unless the RAID
group is degraded.
In the case of RAID-5, checking it would make them very slow (as you would
need to read the entire stripe of data). Not to mention that in ZFS it works
with any RAID configuration, including a plain stripe for metadata (by default).
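
To illustrate the cost argument, here is a rough Python sketch of a RAID-5
style stripe with XOR parity. The class and method names (Raid5Stripe,
read_plain, read_verified) are made up for this example and do not model any
particular controller; the point is only that a parity-verified read of one
block has to touch every block in the stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Raid5Stripe:
    """Toy stripe: N data blocks plus one XOR parity block (hypothetical model)."""

    def __init__(self, data_blocks):
        self.data = [bytes(b) for b in data_blocks]
        self.parity = xor_blocks(self.data)
        self.reads = 0  # number of block reads issued so far

    def read_plain(self, index):
        # What a healthy (non-degraded) group typically does: one block read.
        self.reads += 1
        return self.data[index]

    def read_verified(self, index):
        # Verifying parity means reading all data blocks plus the parity block.
        self.reads += len(self.data) + 1
        if xor_blocks(self.data) != self.parity:
            # Parity alone says "something in this stripe is wrong",
            # not which block is the bad one.
            raise IOError("parity mismatch somewhere in the stripe")
        return self.data[index]
```

With three data blocks, a plain read costs one block read while a verified
read costs four, and a mismatch still does not identify the damaged block.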
ZFS always checks its checksums on reads and transparently fixes any
corruption if it can.
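
A minimal Python sketch of that read path, assuming a two-way mirror: the
expected checksum is kept apart from the data (ZFS keeps it in the parent
block pointer), each read is verified against it, and a bad copy is rewritten
from a good one. MirroredBlock and its fields are names invented for this
illustration, not ZFS code.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class MirroredBlock:
    """Toy model of one block stored on several mirror sides (hypothetical)."""

    def __init__(self, data: bytes, copies: int = 2):
        self.expected = checksum(data)  # stored separately from the copies
        self.copies = [bytearray(data) for _ in range(copies)]

    def read(self) -> bytes:
        bad = []
        for index, copy in enumerate(self.copies):
            candidate = bytes(copy)
            if checksum(candidate) == self.expected:
                # Found a good copy: repair the ones that failed before it.
                for bad_index in bad:
                    self.copies[bad_index] = bytearray(candidate)
                return candidate
            bad.append(index)
        # No copy matches the checksum: report an error, never return bad data.
        raise IOError("checksum mismatch on every copy")
```

Flipping a single bit in copies[0] and then calling read() returns the
original data and leaves copies[0] repaired; if every copy is damaged the
read fails instead of handing corrupt data to the application.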
Additionally, ZFS uses *much* stronger checksums than what you have in
RAID controllers - currently it can even use SHA-256 if you want it. That
means it can detect much more than just single-bit errors.
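
As a small illustration of the difference in strength (a sketch only, with
made-up block contents and a hypothetical helper named
compare_parity_and_sha256): two bit flips at the same position in two
different blocks cancel out in simple XOR parity, while a per-block SHA-256
still flags both blocks.

```python
import hashlib
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compare_parity_and_sha256():
    # Three equal-length blocks; contents are arbitrary example data.
    blocks = [bytearray(b"data-one"), bytearray(b"data-two"), bytearray(b"data-3!!")]
    parity = xor_parity(blocks)
    sums = [hashlib.sha256(bytes(b)).digest() for b in blocks]

    # Flip the same bit in two different blocks.
    blocks[0][3] ^= 0x10
    blocks[1][3] ^= 0x10

    # The two flips cancel, so the stripe parity still matches.
    parity_still_matches = xor_parity(blocks) == parity
    # Per-block checksums identify exactly which blocks changed.
    damaged = [i for i, b in enumerate(blocks)
               if hashlib.sha256(bytes(b)).digest() != sums[i]]
    return parity_still_matches, damaged
```

Here the parity check passes even though two blocks are wrong, and the
SHA-256 comparison reports blocks 0 and 1 as damaged.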
Another good feature is that with ZFS you get so-called end-to-end
checksumming - if data corruption happens anywhere between the disk and
memory (medium errors? driver bugs? SAN? ...) ZFS should be able to
detect it and fix it. Not that long ago I was hit by data corruption
caused by one of the SAN switches... fortunately ZFS dealt with it (and the
other switch was fine). Another time there was something wrong with one of
the SCSI cards, which was corrupting data under load... fortunately we used
a ZFS mirror between two different JBODs via separate SCSI cards, so from
the application's point of view all was fine. Etc...
Not that it happens every day... but when it does, ZFS just rocks :)
And *no* fsck :)
+ compression, deduplication, ... and much more :)
--
Robert Milkowski
http://milek.blogspot.com