[OpenAFS] Overview? Linux filesystem choices

Robert Milkowski milek@task.gda.pl
Thu, 30 Sep 2010 20:48:45 +0100


On 30/09/2010 13:19, Stephan Wiesand wrote:
> Hi Jeff,
>
> On Sep 29, 2010, at 22:18 , Jeffrey Altman wrote:
>
>    
>> RAID is not a replacement for ZFS.  ZRAID-3 protects against single bit
>> disk corruption errors that RAID cannot.  Only ZFS stores a checksum of
>> the data as part of each block and verifies it before delivering the
>> data to the application.  If the checksum fails and there are replicas,
>> ZFS will read the data from another copy and fixup the damaged version.
>> That is what makes ZFS so special and so valuable.  If you have data
>> that must be correct, you want ZFS.
>>      
>
> you're right, of course. This is a very desirable feature, and the main reason why I'd love to see ZFS become available on linux.
>
> I disagree on the "RAID cannot provide this" statement though. RAID-5 has the data to detect single bit corruption, and RAID-6 even has the data to correct it. Alas, verifying/correcting data upon read is not a common feature. I know of just one vendor (DDN) actually providing it. It's a mystery to me why the others don't.
>
>    

Most RAID controllers do not check any parity on reads if the RAID
group is not degraded.
In the case of RAID-5 doing so would make them very slow (you would need
to read the entire stripe just to verify one block). Not to mention that
ZFS checksumming works with any RAID configuration, including a plain
stripe (and metadata is still stored redundantly by default).
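To make that concrete - a toy model in Python, nothing to do with any
real controller firmware, just the arithmetic of XOR parity:

# Toy model of RAID-5 parity: the parity block is the XOR of the data
# blocks. To *verify* a single read you must fetch every block in the
# stripe and recompute parity - and even then you only learn that
# *something* in the stripe is wrong, not which block.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def verify_stripe(data_blocks, parity_block):
    # Needs the whole stripe just to check one block's read.
    return xor_blocks(data_blocks) == parity_block

stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# Flip one bit in one data block *and* the matching bit in another:
bad = [bytes([stripe[0][0] ^ 1]) + stripe[0][1:],
       bytes([stripe[1][0] ^ 1]) + stripe[1][1:],
       stripe[2]]
print(verify_stripe(stripe, parity))  # True
print(verify_stripe(bad, parity))     # True - the two flipped bits cancel out

So even if a controller did verify on every read, simple parity is an
expensive and rather weak check.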

ZFS always verifies its checksums on reads and transparently fixes any
corruption if it can.
Additionally, ZFS uses *much* stronger checksums than what you get in
RAID controllers - it can even use SHA-256 if you want. That means it
can detect much more than single-bit errors.
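Roughly the idea, sketched in Python (not real ZFS code - in ZFS the
checksum actually lives in the parent block pointer; the names below
are made up for the example):

import hashlib, os

def write_block(data):
    # Return a "block pointer": the data location plus a SHA-256 of the data.
    return {"data": data, "checksum": hashlib.sha256(data).digest()}

def read_block(bp):
    # Every read re-checksums the block and compares against the pointer.
    if hashlib.sha256(bp["data"]).digest() != bp["checksum"]:
        raise IOError("checksum mismatch - corruption detected")
    return bp["data"]

bp = write_block(os.urandom(4096))
# Flip a single bit somewhere in the middle of the block ("silent" corruption):
bp["data"] = bp["data"][:100] + bytes([bp["data"][100] ^ 0x01]) + bp["data"][101:]
read_block(bp)  # raises - any change to the 4k block is caught, not just one bit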

Another good feature is that with ZFS you get so-called end-to-end
checksumming - if data corruption happens anywhere on the path from
disk to memory (medium errors? driver bugs? the SAN? ...), ZFS should
be able to detect it and fix it. Not that long ago I was hit by data
corruption caused by one of our SAN switches... fortunately ZFS dealt
with it (and the other switch was fine). Another time there was
something wrong with one of the SCSI cards, which under load was
corrupting data... fortunately we used a ZFS mirror across two
different JBODs via separate SCSI cards, so from the application's
point of view all was fine. Etc...
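The self-healing read path on a mirror, again very simplified
(hypothetical helper names, not the real ZFS pipeline):

import hashlib

def read_with_self_heal(copies, expected_checksum):
    # copies: mutable buffers holding the two mirror sides of one block.
    for data in copies:
        if hashlib.sha256(data).digest() == expected_checksum:
            # Found a good copy - repair any sibling that fails the checksum.
            for j, other in enumerate(copies):
                if hashlib.sha256(other).digest() != expected_checksum:
                    copies[j] = bytearray(data)   # rewrite the damaged copy
            return bytes(data)                    # the application never sees the corruption
    raise IOError("all copies failed the checksum - nothing left to heal from")

good = bytearray(b"payload" * 512)
bad = bytearray(good); bad[0] ^= 0x01             # e.g. mangled by a flaky SAN switch or SCSI card
checksum = hashlib.sha256(good).digest()
result = read_with_self_heal([bad, good], checksum)
assert result == bytes(good)                      # good data returned, bad side repaired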

Not that it happens every day... but when it does, ZFS just rocks :)

And *no* fsck :)

+ compression, deduplication, ... and much more :)

-- 
Robert Milkowski
http://milek.blogspot.com