[OpenAFS] AFS namei file servers, SAN, any issues elsewhere? We've had some. Can AFS _cause_ SAN issues?

Jeffrey Altman jaltman@secure-endpoints.com
Thu, 13 Mar 2008 23:13:12 -0400


Kim Kimball wrote:
> We're using Hitachi USP and Hitachi 9585 SAN devices, and have had a 
> series of incidents that, after two years of success, significantly 
> affected AFS reliability for a period of six months.

From the perspective of a SAN-based file system, AFS is just a client
application.  There is nothing AFS can do that could cause
breakage in a SAN file system, provided that the SAN, the hardware,
and the drivers connecting the machine hosting the AFS services
to the SAN are not buggy.

AFS is a very stressful application for a file system.  If there are
bugs in the SAN, AFS is more likely to find them than other
applications.

> ========================================
> For the record, here's what I've been experiencing.  The worst of the 
> experience, as detailed below, was the impact on creation of move and 
> release clones but not backup clones
> 
> AFS IMPACT
> 
> We were running 1.4.1 with some patches.  (Upgrading to 1.4.6 has been 
> part of a thus far definitive fix for the 9585 issues.)

The primary difference between 1.4.1 and 1.4.6 is the bundling of
FSync calls, which significantly reduces the load on the
underlying file system.  (Robert Banz gave a good description of
the impact.)  If this change lets the SAN perform its operations
with a reduced incident rate, that implies there is still a problem
in the SAN (or in the connections between the host machine and the
SAN), but that it is not being tickled as often.
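
To make the load difference concrete, here is a minimal sketch of the
general idea of bundling sync calls.  It is not the OpenAFS fileserver
code; the function name write_records and the sync_every parameter are
made up for illustration.  It simply counts how many fsync calls reach
the file system when every write is synced versus when syncs are issued
once per batch.

```python
import os
import tempfile


def write_records(path, records, sync_every):
    """Append records to path, calling fsync once per `sync_every` records.

    Hypothetical helper for illustration only. Returns the number of
    fsync calls issued.
    """
    syncs = 0
    with open(path, "ab") as f:
        for i, rec in enumerate(records, start=1):
            f.write(rec)
            if i % sync_every == 0:
                f.flush()
                os.fsync(f.fileno())
                syncs += 1
        # Sync any trailing records that did not fill a whole batch.
        if len(records) % sync_every != 0:
            f.flush()
            os.fsync(f.fileno())
            syncs += 1
    return syncs


def demo():
    """Compare per-write syncing with bundled syncing on 100 records."""
    records = [b"x" * 512 for _ in range(100)]
    with tempfile.TemporaryDirectory() as d:
        per_write = write_records(os.path.join(d, "a"), records, sync_every=1)
        bundled = write_records(os.path.join(d, "b"), records, sync_every=25)
    return per_write, bundled
```

With the same 100 records, the first call issues 100 syncs and the
second issues 4.  Fewer syncs means fewer forced trips down the
storage path, which would hide a flaky path without fixing it.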

> The worst of the six month stretch occurred when the primary and 
> secondary controller roles (9585 only thus far) were reversed as a 
> consequence of SAN fabric rebuilds.  For whatever reason, the time 
> required to create volume clones for AFS 'vos release' and 'vos move' 
> (using 'vos status' to audit clone time) increased from a typical 
> several seconds to minutes, ten minutes, and in one case four hours.  
> The RW volume is of course unwritable during the clone operation.

My conclusion:
The secondary controller, the cabling, or something else along
that data path is defective.

> 'vos remove' times on afflicted partitions were also affected, with 
> increased time required to remove a volume.
> 
> I don't know why the creation of .backup clones was not similarly 
> affected.  For a given volume the create time/refresh time for a move 
> clone or release clone might have been fifteen minutes, while the 
> .backup clone created quickly and took only slightly longer than usual.

The data is not copied for a .backup until the data actually changes.
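
As a toy model of that behaviour (not how the volume server or the
namei backend is implemented; the class ToyVolume and its methods are
invented for this sketch), a clone can share the existing data and
defer the copy until the read/write side is modified:

```python
class ToyVolume:
    """Toy read/write volume whose clones share data until it is written.

    Hypothetical illustration only; assumes writes target existing files.
    """

    def __init__(self, files):
        # name -> mutable buffer holding that file's data
        self.files = {name: bytearray(data) for name, data in files.items()}
        self.shared = set()    # names whose buffer is shared with a clone
        self.bytes_copied = 0  # data actually copied because of the clone

    def clone(self):
        # Copy only the name -> buffer mapping; the buffers are shared.
        self.shared = set(self.files)
        return dict(self.files)

    def write(self, name, offset, data):
        if name in self.shared:
            # First write since the clone: make a private copy so the
            # clone keeps seeing the old contents.
            self.bytes_copied += len(self.files[name])
            self.files[name] = bytearray(self.files[name])
            self.shared.discard(name)
        self.files[name][offset:offset + len(data)] = data
```

Creating the clone copies no file data at all; the cost is paid later,
one file at a time, as each file is first written.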

> With 'vos move' out of the picture I moved volumes with dump/restore, 
> for volumes not frequently or recently updated, and dump/restore 
> followed by use of a synchronization tool, Unison, to create a new RW 
> volume, followed by changing the mount point to point to the name of the 
> new volume, followed by waiting until the previous RW volume no longer 
> showed any updates for a few days.
> 
> (If anyone is interested in Unison let me know.  I'm thinking of talking 
> about it at Best Practices this year.)

The deadline for submissions is approaching fast.  Please submit your
talk.

> The USP continues to spew SCSI command timeouts.

Bad controller?  Bad cable?  Bad disk?

SCSI command timeouts are at a level far below AFS.  If an AFS service
requests a disk operation and that operation results in SCSI command
timeouts, there is something seriously wrong somewhere between the
SCSI controller and the disk.

No wonder you are getting lousy performance.

> I'm seeing SCSI command timeouts and UFS log timeouts (on vice 
> partitions using the SAN for storage) on LUNS used for vicep's on the 
> Hitachi USP, and was seeing them also on the 9585 until a recent 
> configuration change.

UFS log timeouts are more evidence that the problem is somewhere
between UFS and the disk.

> At first I thought this was load related, so wrote scripts to generate a 
> goodly load.  It turns out that even with a one second sleep between 
> file create/write/close operations and between rm operations the SCSI 
> command timeouts still occur, and that it's not load but simply activity 
> that turns up the timeouts.

And I bet the SAN admins are telling you that there is nothing wrong.
They are badly mistaken.
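
For anyone who wants to reproduce the low-rate test described in the
quoted text, here is a rough sketch.  It is not Kim's script: the
function name trickle_io, the file naming, the 4 KB payload, and the
explicit fsync are all assumptions of mine.  It creates, writes, and
closes a file, sleeps, removes the file, sleeps again, and records how
long each operation took so the slow ones can be lined up against the
timeouts in the system log.

```python
import os
import tempfile
import time


def trickle_io(directory, count, pause=1.0, payload=b"x" * 4096):
    """Create and remove `count` files in `directory`, pausing between ops.

    Hypothetical helper for illustration only. Returns a list of
    (operation, seconds) tuples, two per file.
    """
    timings = []
    for i in range(count):
        path = os.path.join(directory, f"trickle-{i}.dat")

        start = time.monotonic()
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # assumption: force the write to storage
        timings.append(("create", time.monotonic() - start))
        time.sleep(pause)

        start = time.monotonic()
        os.remove(path)
        timings.append(("remove", time.monotonic() - start))
        time.sleep(pause)
    return timings
```

Pointing directory at a scratch directory on the affected partition
and sorting the result by duration should show whether the stalls
appear even at this trickle of activity.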


Jeffrey Altman

