[OpenAFS-devel] client-idledeadtime-support-20080430 causes connection time outs

Jeffrey Altman jaltman@secure-endpoints.com
Tue, 09 Feb 2010 17:03:18 -0500


This is a cryptographically signed message in MIME format.

--------------ms030805020001020005080907
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 9/4/2009 5:53 AM, Rainer Toebbicke wrote:
> I believe client-idledeadtime-support-20080430 is a bit unfair, at leas=
t
> it introduced the following problem, possibly on a wrong assumption:
>=20
> time 502.579567, pid 2195: Analyze RPC op 6 conn 0x4ae0b240 code
> 0xfffffffd user 0x0
> time 502.579581, pid 2195: afs_Analyze out shouldRetry 0
> time 502.579608, pid 2195: Returning code -3 from 21
>=20
> in words: an RXAFS_RemoveFile received an RX_CALL_TIMEOUT (-3), there i=
s
> no alternative server, and hence the operation is not retried. It would=

> have been retried in the RX_CALL_DEAD (-1) case (and plenty of other
> cases).
>=20
> The comment in the code justifies that special treatment for
> RX_CALL_TIMEOUT on the grounds that the call has timed out while server=

> was still responding to other calls. From the RX code I fail to see how=

> this is necessarily correct: rxi_CheckCall can be called from the
> keepalive mechanism and stop the call without any interaction from the
> server, you can get RX_CALL_TIMEOUT and not e.g. RX_CALL_DEAD even if
> the server is completely hosed.
>=20
> In this particular case the server was stopped, the "hard mount"
> functionality should have ensured that I/Os stay pending until the
> server was restarted.
>=20
> Now, I don't know how to solve the yoyo up-down-up problem in case a
> server times out calls selectively. When it happens, for
> single-server-volumes the call should be retried in most cases.
>=20
> Would it be reasonable to blacklist only up to the last server and then=

> go the usual path which ensures a fair retry?
>=20
> (Actually, I wonder whether you can ensure you never get RX_CALL_TIMEOU=
T
> if a server is simply restarted, which certainly deserves a retry).

There was a bug in the idle dead timeout processing in the rx library.
A timeout would occur if the send window was full for longer than the
timeout period.  This is fixed by

  http://gerrit.openafs.org/#change,1183

Jeffrey Altman


--------------ms030805020001020005080907
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: S/MIME Cryptographic Signature

MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJeTCC
AxcwggKAoAMCAQICEAMF9RTCGOz151fTpHLih+cwDQYJKoZIhvcNAQEFBQAwYjELMAkGA1UE
BhMCWkExJTAjBgNVBAoTHFRoYXd0ZSBDb25zdWx0aW5nIChQdHkpIEx0ZC4xLDAqBgNVBAMT
I1RoYXd0ZSBQZXJzb25hbCBGcmVlbWFpbCBJc3N1aW5nIENBMB4XDTA5MDgyODA0MDExOVoX
DTEwMDgyODA0MDExOVowczEPMA0GA1UEBBMGQWx0bWFuMRUwEwYDVQQqEwxKZWZmcmV5IEVy
aWMxHDAaBgNVBAMTE0plZmZyZXkgRXJpYyBBbHRtYW4xKzApBgkqhkiG9w0BCQEWHGphbHRt
YW5Ac2VjdXJlLWVuZHBvaW50cy5jb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIB
AQDZNscYIvF6xzGSAfa/QUIqiElyn0EUxL2b86eKiYqe91bj0gLr/MJoErLnb+OmokxqSAH6
y0zlFqSbiFwgNM8m69K6m/6YO+x3+5zBc+u6snwTWMEWygnhx3rQ/lMhoQOgArraL+/k9aWL
kNdaXQKk6EZVW9pfV2A4Lk4DoZGFjY8tJRWWDLlFkYnxDuIEpLYwJpwakv3QHOaq/G8KW0iE
jVhVzPobuZzwD2tuepY/bsClwqxz/gfAEpUvAn/lYTqnoT7RYljZlCIdbrgcG/HSYMxAy1Zp
Yh8Fx+9cqsG8O4nqo26SVfYZvrYhh8m6OqW8Vakdt7vBLCTa/QhIdJ4hAgMBAAGjOTA3MCcG
A1UdEQQgMB6BHGphbHRtYW5Ac2VjdXJlLWVuZHBvaW50cy5jb20wDAYDVR0TAQH/BAIwADAN
BgkqhkiG9w0BAQUFAAOBgQBvbvJNXUJ4atv1CExIe0J38jZqoEUTttkXOfCDT9e3mSmVboOK
ifHDyLZQC4qSsCUfP7vdwAXjKtjak22HbfX2sEKCUgtnOkxRqXMM2V/NW/ESNVQZF0TO7L/Z
cW3icObO9FIZCSmgFMt2Al7VPfMQmaJNlqu9SLmXSwbRFJ5b4zCCAxcwggKAoAMCAQICEAMF
9RTCGOz151fTpHLih+cwDQYJKoZIhvcNAQEFBQAwYjELMAkGA1UEBhMCWkExJTAjBgNVBAoT
HFRoYXd0ZSBDb25zdWx0aW5nIChQdHkpIEx0ZC4xLDAqBgNVBAMTI1RoYXd0ZSBQZXJzb25h
bCBGcmVlbWFpbCBJc3N1aW5nIENBMB4XDTA5MDgyODA0MDExOVoXDTEwMDgyODA0MDExOVow
czEPMA0GA1UEBBMGQWx0bWFuMRUwEwYDVQQqEwxKZWZmcmV5IEVyaWMxHDAaBgNVBAMTE0pl
ZmZyZXkgRXJpYyBBbHRtYW4xKzApBgkqhkiG9w0BCQEWHGphbHRtYW5Ac2VjdXJlLWVuZHBv
aW50cy5jb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDZNscYIvF6xzGSAfa/
QUIqiElyn0EUxL2b86eKiYqe91bj0gLr/MJoErLnb+OmokxqSAH6y0zlFqSbiFwgNM8m69K6
m/6YO+x3+5zBc+u6snwTWMEWygnhx3rQ/lMhoQOgArraL+/k9aWLkNdaXQKk6EZVW9pfV2A4
Lk4DoZGFjY8tJRWWDLlFkYnxDuIEpLYwJpwakv3QHOaq/G8KW0iEjVhVzPobuZzwD2tuepY/
bsClwqxz/gfAEpUvAn/lYTqnoT7RYljZlCIdbrgcG/HSYMxAy1ZpYh8Fx+9cqsG8O4nqo26S
VfYZvrYhh8m6OqW8Vakdt7vBLCTa/QhIdJ4hAgMBAAGjOTA3MCcGA1UdEQQgMB6BHGphbHRt
YW5Ac2VjdXJlLWVuZHBvaW50cy5jb20wDAYDVR0TAQH/BAIwADANBgkqhkiG9w0BAQUFAAOB
gQBvbvJNXUJ4atv1CExIe0J38jZqoEUTttkXOfCDT9e3mSmVboOKifHDyLZQC4qSsCUfP7vd
wAXjKtjak22HbfX2sEKCUgtnOkxRqXMM2V/NW/ESNVQZF0TO7L/ZcW3icObO9FIZCSmgFMt2
Al7VPfMQmaJNlqu9SLmXSwbRFJ5b4zCCAz8wggKooAMCAQICAQ0wDQYJKoZIhvcNAQEFBQAw
gdExCzAJBgNVBAYTAlpBMRUwEwYDVQQIEwxXZXN0ZXJuIENhcGUxEjAQBgNVBAcTCUNhcGUg
VG93bjEaMBgGA1UEChMRVGhhd3RlIENvbnN1bHRpbmcxKDAmBgNVBAsTH0NlcnRpZmljYXRp
b24gU2VydmljZXMgRGl2aXNpb24xJDAiBgNVBAMTG1RoYXd0ZSBQZXJzb25hbCBGcmVlbWFp
bCBDQTErMCkGCSqGSIb3DQEJARYccGVyc29uYWwtZnJlZW1haWxAdGhhd3RlLmNvbTAeFw0w
MzA3MTcwMDAwMDBaFw0xMzA3MTYyMzU5NTlaMGIxCzAJBgNVBAYTAlpBMSUwIwYDVQQKExxU
aGF3dGUgQ29uc3VsdGluZyAoUHR5KSBMdGQuMSwwKgYDVQQDEyNUaGF3dGUgUGVyc29uYWwg
RnJlZW1haWwgSXNzdWluZyBDQTCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAxKY8VXNV
+065yplaHmjAdQRwnd/p/6Me7L3N9VvyGna9fww6YfK/Uc4B1OVQCjDXAmNaLIkVcI7dyfAr
hVqqP3FWy688Cwfn8R+RNiQqE88r1fOCdz0Dviv+uxg+B79AgAJk16emu59l0cUqVIUPSAR/
p7bRPGEEQB5kGXJgt/sCAwEAAaOBlDCBkTASBgNVHRMBAf8ECDAGAQH/AgEAMEMGA1UdHwQ8
MDowOKA2oDSGMmh0dHA6Ly9jcmwudGhhd3RlLmNvbS9UaGF3dGVQZXJzb25hbEZyZWVtYWls
Q0EuY3JsMAsGA1UdDwQEAwIBBjApBgNVHREEIjAgpB4wHDEaMBgGA1UEAxMRUHJpdmF0ZUxh
YmVsMi0xMzgwDQYJKoZIhvcNAQEFBQADgYEASIzRUIPqCy7MDaNmrGcPf6+svsIXoUOWlJ1/
TCG4+DYfqi2fNi/A9BxQIJNwPP2t4WFiw9k6GX6EsZkbAMUaC4J0niVQlGLH2ydxVyWN3amc
OY6MIE9lX5Xa9/eH1sYITq726jTlEBpbNU1341YheILcIRk13iSx0x1G/11fZU8xggNxMIID
bQIBATB2MGIxCzAJBgNVBAYTAlpBMSUwIwYDVQQKExxUaGF3dGUgQ29uc3VsdGluZyAoUHR5
KSBMdGQuMSwwKgYDVQQDEyNUaGF3dGUgUGVyc29uYWwgRnJlZW1haWwgSXNzdWluZyBDQQIQ
AwX1FMIY7PXnV9OkcuKH5zAJBgUrDgMCGgUAoIIB0DAYBgkqhkiG9w0BCQMxCwYJKoZIhvcN
AQcBMBwGCSqGSIb3DQEJBTEPFw0xMDAyMDkyMjAzMThaMCMGCSqGSIb3DQEJBDEWBBQnf+jW
RZSCthpeLiTEG0rBn/n0RTBfBgkqhkiG9w0BCQ8xUjBQMAsGCWCGSAFlAwQBAjAKBggqhkiG
9w0DBzAOBggqhkiG9w0DAgICAIAwDQYIKoZIhvcNAwICAUAwBwYFKw4DAgcwDQYIKoZIhvcN
AwICASgwgYUGCSsGAQQBgjcQBDF4MHYwYjELMAkGA1UEBhMCWkExJTAjBgNVBAoTHFRoYXd0
ZSBDb25zdWx0aW5nIChQdHkpIEx0ZC4xLDAqBgNVBAMTI1RoYXd0ZSBQZXJzb25hbCBGcmVl
bWFpbCBJc3N1aW5nIENBAhADBfUUwhjs9edX06Ry4ofnMIGHBgsqhkiG9w0BCRACCzF4oHYw
YjELMAkGA1UEBhMCWkExJTAjBgNVBAoTHFRoYXd0ZSBDb25zdWx0aW5nIChQdHkpIEx0ZC4x
LDAqBgNVBAMTI1RoYXd0ZSBQZXJzb25hbCBGcmVlbWFpbCBJc3N1aW5nIENBAhADBfUUwhjs
9edX06Ry4ofnMA0GCSqGSIb3DQEBAQUABIIBACKZTMACNahFKUbmkle0YP68a0X02b5hMC1b
e+RwokdGafLT5nLKbB2yBp5DYtHvDrGbcOsBkwDxzTsVMU/ZM4N/e5v5jG85DcOcK6dvn5rf
612ClJ3ESlWSE9Glxt+PyY+lVR6eJ50RrN9cl8eSmWrRJtcOwhR9tmjPvrXekXBu4/u50F0Z
81RdCIMWHACWCYsoAop70JFuFty1kCPTot+wvs0DsXZ/QVt7RVAlzfURrNljty7MrFz9zNV3
PylOkaU+ikMjH5VcWfumlLmKkST9xN3CU8W6GGrfng2vKGoP90m0rCLa4Bzw4zN2Wl8GzGhW
CdgXnRgEv8QqDTTsj88AAAAAAAA=
--------------ms030805020001020005080907--