[OpenAFS] hard drive reliability

John N. Riley III john3@teradactyl.com
Mon, 23 May 2016 13:34:33 -0600


Hard drive quality control has drawn attention from a number of industry 
analysts in the last few years as production levels drop and margins 
continue to feel pressure.  Western Digital and Seagate control over 80% 
of the global market for platter-based HDDs.  Toshiba is reported to be 
exploring options to divest or sell HDD production.  TechSpot recently 
reported a class action lawsuit against Seagate over "extremely high 
hard drive failure rates."  HDDs are a declining commodity, and consumers 
need to take steps to protect their intellectual property.

As previously noted, drives tend to fail in batches and more 
frequently at the beginning and end of their useful life.  That said, we 
do see a difference in enterprise class drive quality.  We also note 
that QC issues seem to be transient among manufacturers with aberrations 
that will arise and then be corrected in all major brands.  We monitor 
these issues and change manufacturers as necessary in our appliances. 
We strongly urge the community to take appropriate steps to back up 
data and to employ methods that accurately identify new and changed 
data, so that valid backup copies are not overwritten with recently 
corrupted volumes.
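
To make the "new and changed data" point concrete, here is a minimal 
Python sketch of one way to separate legitimate changes from possible 
silent corruption before a copy is allowed to replace an older one.  It 
is not how our product works; the manifest layout and the names 
sha256_of and classify are assumptions made up for this example.  The 
idea is that a file whose contents differ from the last recorded digest 
while its size and modification time are unchanged deserves a second 
look rather than a blind copy.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(path, manifest):
    """Compare a file against a manifest from the previous backup run.

    manifest is a hypothetical dict: path -> (size, mtime_ns, digest).
    Returns (status, entry) where status is one of
    "new", "unchanged", "changed", or "suspect".
    """
    st = os.stat(path)
    entry = (st.st_size, st.st_mtime_ns, sha256_of(path))
    prev = manifest.get(path)
    if prev is None:
        return "new", entry
    prev_size, prev_mtime_ns, prev_digest = prev
    if entry[2] == prev_digest:
        return "unchanged", entry
    # Contents differ.  If size and mtime still match the last run,
    # nothing is known to have written the file, so treat it as
    # possible corruption and keep the older backup copy.
    if (entry[0], entry[1]) == (prev_size, prev_mtime_ns):
        return "suspect", entry
    return "changed", entry
```

A backup script built on this would copy "new" and "changed" files, 
skip "unchanged" ones, and hold "suspect" files for review instead of 
overwriting the last good copy.  Hashing every file on every run is 
expensive, so treat this as an illustration of the check rather than a 
tuned design.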

On a related note, contaminants are also found in new magnetic tape. 
This industry also has a very limited number of media manufacturers.  In 
this instance, however, at least one vendor, Spectra Logic Corporation, 
will un-package and "carbide" clean the media prior to shipment.  This 
can reduce damage to tape drive heads resulting from particulate 
accumulation and abrasion associated with the foreign material. I will 
have to check to determine if they do this for TS11xx and T10K media in 
addition to the LTO cartridges.  They now support all three media 
technologies.

Teradactyl takes many precautions to protect a cell.  If you plan to 
simply copy data, please employ a strategy that provides protection 
against silent corruption that can be inadvertently placed into your 
"backup" environment.  Recognize that RAID protection is a good first 
step in your primary storage environment but is not a substitute for 
backup.  Finally, consider using multiple media types and employing a 
media verification system to alert you to potential failure modes before 
your situation becomes critical.  The old adage of trust but verify 
definitely holds true with hardware.
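
As a small illustration of "trust but verify" applied to a simple copy, 
the Python sketch below records a digest of the source, copies the 
file, and re-reads the destination to confirm it matches.  The same 
recorded digest can be rechecked later as a basic media verification 
pass.  The function names copy_and_verify and verify_copy are 
hypothetical, and this is only a sketch under those assumptions, not a 
description of any vendor's verification system.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and confirm the copy by re-reading dst.

    Returns the digest so the caller can store it for later checks.
    Note: an immediate re-read may be served from the OS page cache
    rather than the media itself, so a later verify_copy pass is
    still worthwhile.
    """
    expected = sha256_of(src)
    shutil.copyfile(src, dst)
    actual = sha256_of(dst)
    if actual != expected:
        raise OSError(f"verification failed for {dst}")
    return expected


def verify_copy(path, recorded_digest):
    """Return True if the file still matches its recorded digest."""
    return sha256_of(path) == recorded_digest
```

Running verify_copy on a schedule against each stored copy gives early 
warning when a copy no longer matches what was written, before that 
copy is the only one left.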

John Riley III
-- 
Teradactyl LLC. - Global Storage Solutions