[OpenAFS] AFS namei file servers, SAN, any issues elsewhere?
We've had some. Can AFS _cause_ SAN issues?
Thu, 13 Mar 2008 23:13:12 -0400
Kim Kimball wrote:
> We're using Hitachi USP and Hitachi 9585 SAN devices, and have had a
> series of incidents that, after two years of success, significantly
> affected AFS reliability for a period of six months.
From the perspective of a SAN-based file system, AFS is just a client
application. There is nothing AFS can do that could cause breakage in a
SAN file system, provided that the SAN, the hardware, and the drivers
connecting the machine hosting the AFS services to the SAN are not buggy.
AFS is, however, a very stressful application for a file system. If there
are bugs in the SAN, AFS is more likely to find them than other
applications are.
> For the record, here's what I've been experiencing. The worst of the
> experience, as detailed below, was the impact on creation of move and
> release clones but not backup clones.
> AFS IMPACT
> We were running 1.4.1 with some patches. (Upgrading to 1.4.6 has been
> part of a thus far definitive fix for the 9585 issues.)
The primary difference between 1.4.1 and 1.4.6 is the bundling of
fsync calls, which significantly reduces the load on the underlying
file system. (Robert Banz gave a good description of the impact.)
If this change is permitting the SAN to perform its operations with
a reduced incident rate, that would imply that there is still a
problem in the SAN (or in the connections between the host machine
and the SAN), but that it is not being tickled as often.
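That difference is easy to feel on the storage itself. Not the
fileserver code, of course, but with GNU dd you can compare syncing
after every block against a single sync at the end; the scratch path
below is made up, so point it at a throwaway file on one of the
affected LUNs:

    TESTFILE=/mnt/santest/ddtest   # hypothetical scratch file on the SAN

    # sync after every block (roughly one fsync per operation)
    time dd if=/dev/zero of=$TESTFILE bs=64k count=1024 oflag=dsync

    # write everything, then a single fsync before dd exits
    time dd if=/dev/zero of=$TESTFILE bs=64k count=1024 conv=fsync

    rm -f $TESTFILE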
> The worst of the six-month stretch occurred when the primary and
> secondary controller roles (9585 only thus far) were reversed as a
> consequence of SAN fabric rebuilds. For whatever reason, the time
> required to create volume clones for AFS 'vos release' and 'vos move'
> (using 'vos status' to audit clone time) increased from a typical
> several seconds to minutes, ten minutes, and in one case four hours.
> The RW volume is of course unwritable during the clone operation.
The secondary controller, the cabling, or something else along
that data path is defective.
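If you want to keep auditing clone times while you chase that down,
something along these lines works from bash (the volume and server
names below are made up; substitute your own):

    # time a 'vos release' and watch volserver transactions while it runs
    SECONDS=0
    vos release user.example -verbose &
    relpid=$!
    while kill -0 $relpid 2>/dev/null; do
        vos status fs1.example.com    # lists active volserver transactions
        sleep 5
    done
    wait $relpid
    echo "release took $SECONDS seconds"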
> 'vos remove' times on afflicted partitions were also affected, with
> increased time required to remove a volume.
> I don't know why the creation of .backup clones was not similarly
> affected. For a given volume the create time/refresh time for a move
> clone or release clone might have been fifteen minutes, while the
> .backup clone created quickly and took only slightly longer than usual.
The data is not copied for a .backup clone until it actually changes;
creating the clone only duplicates the volume metadata.
> With 'vos move' out of the picture I moved volumes with dump/restore
> (for volumes not frequently or recently updated), or with dump/restore
> followed by a synchronization tool, Unison, to create a new RW volume;
> I then changed the mount point to the name of the new volume and waited
> until the previous RW volume showed no updates for a few days.
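For anyone wanting to follow the same path, the outline is roughly the
following; volume, server, and path names are made up, and the Unison
pass to pick up late changes is left out:

    # full dump of the volume, restore under a new name, repoint the mount
    vos dump -id user.example -time 0 -file /var/tmp/user.example.dump
    vos restore -server fs2.example.com -partition vicepb \
        -name user.example.new -file /var/tmp/user.example.dump
    fs rmmount /afs/.example.com/user/example
    fs mkmount /afs/.example.com/user/example user.example.new
    vos release user    # if the parent directory's volume is replicated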
> (If anyone is interested in Unison let me know. I'm thinking of talking
> about it at Best Practices this year.)
The deadline for submissions is approaching fast. Please submit your
proposal soon.
> The USP continues to spew SCSI command timeouts.
Bad controller? Bad cable? Bad disk?
SCSI command timeouts are at a level far below AFS. If an AFS service
requests a disk operation and that operation results in SCSI command
timeouts, there is something seriously wrong somewhere between the
SCSI controller and the disk.
No wonder you are getting lousy performance.
> I'm seeing SCSI command timeouts and UFS log timeouts (on vice
> partitions using the SAN for storage) on LUNs used for vicep's on the
> Hitachi USP, and was seeing them also on the 9585 until a recent
> configuration change.
UFS log timeouts are more evidence that the problem is somewhere
between UFS and the disk.
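If it helps to correlate the timeouts with the slow clone times, they
are easy to watch from syslog. Assuming a Solaris host (implied by the
UFS logging), something like:

    # follow the system log for SCSI timeouts/retries and UFS log events
    tail -f /var/adm/messages | egrep -i 'scsi|timeout|retry|ufs log'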
> At first I thought this was load related, so wrote scripts to generate a
> goodly load. It turns out that even with a one second sleep between
> file create/write/close operations and between rm operations the SCSI
> command timeouts still occur, and that it's not load but simply activity
> that turns up the timeouts.
And I bet the SAN admins are telling you that there is nothing wrong.
They are badly mistaken.
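For anyone who wants to reproduce that kind of test, a sketch of the
sort of script Kim describes is below. The mount point is hypothetical
and should be a scratch filesystem on the same SAN, not a production
/vicep partition:

    #!/bin/sh
    # low-rate activity generator: create/write/close, pause, remove, pause
    dir=/mnt/santest/load         # hypothetical test filesystem on the SAN
    mkdir -p "$dir"
    i=0
    while :; do
        f="$dir/file.$i"
        dd if=/dev/zero of="$f" bs=64k count=16 2>/dev/null  # create, write, close
        sleep 1
        rm -f "$f"
        sleep 1
        i=`expr $i + 1`
    done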