[OpenAFS] OpenAFS on ZFS (Was: Salvaging user volumes)

Douglas E. Engert deengert@anl.gov
Mon, 17 Jun 2013 08:56:02 -0500


In June of 2010, we were running AFS file servers on Solaris
with ZFS for the partitions on a SATAbeast.

AFS reported I/O errors from read() that turned out to be ZFS checksum failures.

It turned out the hardware logs on that SATAbeast were reporting problems,
but the device would continue to serve up the bad data.

Since ZFS computes a checksum when it writes and verifies it again when it reads,
it was catching intermittent errors that other systems might not catch.
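
To make the write-then-verify idea concrete, here is a minimal sketch in
Python. It is not how ZFS is implemented; the ChecksummedStore class, its
method names, and the choice of SHA-256 are all assumptions made up for
illustration. It only shows the principle: keep the checksum apart from
the data, recompute it on every read, and return an I/O error instead of
the bad data when the two disagree.

```python
import errno
import hashlib


class ChecksummedStore:
    """Toy block store (hypothetical): checksum kept apart from the data."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes, stands in for the disk array
        self.checksums = {}  # block id -> digest, stored separately from the data

    def write(self, block_id, data):
        # Compute the checksum at write time and remember it.
        self.blocks[block_id] = bytes(data)
        self.checksums[block_id] = hashlib.sha256(data).digest()

    def read(self, block_id):
        # Recompute at read time; a mismatch means the storage returned bad data.
        data = self.blocks[block_id]
        if hashlib.sha256(data).digest() != self.checksums[block_id]:
            raise OSError(errno.EIO, "checksum mismatch on block %r" % (block_id,))
        return data


store = ChecksummedStore()
store.write(7, b"volume header")
assert store.read(7) == b"volume header"

# Simulate the array silently handing back altered data.
store.blocks[7] = b"volume headeR"
try:
    store.read(7)
except OSError as exc:
    print("read failed with EIO:", exc.errno == errno.EIO)
```

A store without the verify step in read() would have returned the altered
bytes to the caller with no error at all, which is the failure mode
described above.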

Here is a nice explanation of how and why ZFS does checksums.
It also points out other sources of corruption that can occur
on a SAN.

http://blogs.sun.com/bonwick/entry/zfs_end_to_end_data

And this one sounds a lot like our problem!

http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta

>> And this is one of the reasons why ZFS is so cool :)

Yes, it is cool!


-- 

  Douglas E. Engert  <DEEngert@anl.gov>
  Argonne National Laboratory
  9700 South Cass Avenue
  Argonne, Illinois  60439
  (630) 252-5444