[OpenAFS] Anyone seen this weirdness...

Jeffrey Altman jaltman@secure-endpoints.com
Thu, 13 Apr 2006 16:08:53 -0400


This is a cryptographically signed message in MIME format.

--------------ms060607070302030107040402
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Robert Banz wrote:
>>
>> Could you do some rxdebug calls to the fileserver next time? So we
>> know why it's getting unresponsive.
>> It could be running out of threads. I don't expect that, but it could
>> be ...
> 
> The 'symptoms' seem to be, for the most part, volume-specific.  Slow
> response to accessing that volume, followed by the clients seeing a
> timeout on it.  So, guts-o-the-fileserver folk, is there a volume-wide
> lock that gets set by a particular fileserver thread when it's being
> acted upon?  Since deleting a whole-bunch-of-files (a /bin/rm -fr <dir>)
> is happening, that's a whole lot of requests coming in in-series to that
> volume, being taken care of on (probably) a first-come, first-served
> basis, leaving little room for other clients to get an op in on that
> volume?

Lets say that there are N clients who are all attempting to use
the contents of directory "D" in the read-write volume "V".  Client 1
is making changes to the contents of the directory and clients 2 to N
are reading the directory.

In order for each of the N clients to read the directory, they need
to perform a FetchData RPC which registers a callback with the file
server on "D".  Now each time that client 1 makes a change to the
contents of "D", each of the callbacks that are currently registered
with the file server must be broken in order to notify clients 2 to N
that the data they have cached is no longer valid.  If clients 2 to N
are actively using "D", then when the callback break is received from
the file server they will in turn attempt to perform a new FetchData
operation to obtain the latest data value.  This in turn registers a
new callback.

Now if client 1 is performing 30,000 individual RemoveFile RPCs it is
going to extremely hard for any of the other clients to maintain to be
able to maintain a callback until all 30,000 RPCs are completed.  As
soon as a FetchData operation completes, the callback will be broken
and the cache contents will be invalidated.

I don't believe there is a bug here its just a negative side effect
of client side caching and the fact that the file system is only given
one file name at a time to act on.

Jeffrey Altman

--------------ms060607070302030107040402
Content-Type: application/x-pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: S/MIME Cryptographic Signature

MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJXzCC
AwowggJzoAMCAQICAw7NrTANBgkqhkiG9w0BAQQFADBiMQswCQYDVQQGEwJaQTElMCMGA1UE
ChMcVGhhd3RlIENvbnN1bHRpbmcgKFB0eSkgTHRkLjEsMCoGA1UEAxMjVGhhd3RlIFBlcnNv
bmFsIEZyZWVtYWlsIElzc3VpbmcgQ0EwHhcNMDUwNTI3MTc0NzU3WhcNMDYwNTI3MTc0NzU3
WjBzMQ8wDQYDVQQEEwZBbHRtYW4xFTATBgNVBCoTDEplZmZyZXkgRXJpYzEcMBoGA1UEAxMT
SmVmZnJleSBFcmljIEFsdG1hbjErMCkGCSqGSIb3DQEJARYcamFsdG1hbkBzZWN1cmUtZW5k
cG9pbnRzLmNvbTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKjPyrF+rdjOUSK/
bWwZHdx5p1+y6iiCd4vvYEVDxouYFp5C/fZEWm5n45ubBUbMSUI1MAZN6ooEoH09UTj6BXhM
S8B987ls81dKOIUphTF2jOzq8gsFmeA15yHMRAD20LqUWeLyvYk8FCNQw+dsKMMhX+WdsxOm
RY/1jPkJL6oN8kEwoUFkOX9/OfWWh6oFnV6faiEHUKDMFubsb9X0KVD8iIeR7Cxz7i4kXqRX
wMlp2fyoxcDIJrBaTY8nA++g3p34IkWt1a5po6g683nIgSnGpwYIwuJheBqSEZfLYWa+1KdD
6Sn27Ud94GqUvPVG5jC6zVC5EJ2aWuoAu+nNuV8CAwEAAaM5MDcwJwYDVR0RBCAwHoEcamFs
dG1hbkBzZWN1cmUtZW5kcG9pbnRzLmNvbTAMBgNVHRMBAf8EAjAAMA0GCSqGSIb3DQEBBAUA
A4GBADtvO//tjiAV6VJGtoNtrl34mB5jGyGTiotzw8riB6zz0GvY11bcWDmp6JKif+pVG+8L
IySDosbuva13qu2HwYUxBmWc7CoNd2k9kRlcrfbDUTTrGOZK8qyqNqT3gQZTAa9ZnUI0su9G
y/n2o5bQcaYdqR3htNrpvdLSPOWhILOXMIIDCjCCAnOgAwIBAgIDDs2tMA0GCSqGSIb3DQEB
BAUAMGIxCzAJBgNVBAYTAlpBMSUwIwYDVQQKExxUaGF3dGUgQ29uc3VsdGluZyAoUHR5KSBM
dGQuMSwwKgYDVQQDEyNUaGF3dGUgUGVyc29uYWwgRnJlZW1haWwgSXNzdWluZyBDQTAeFw0w
NTA1MjcxNzQ3NTdaFw0wNjA1MjcxNzQ3NTdaMHMxDzANBgNVBAQTBkFsdG1hbjEVMBMGA1UE
KhMMSmVmZnJleSBFcmljMRwwGgYDVQQDExNKZWZmcmV5IEVyaWMgQWx0bWFuMSswKQYJKoZI
hvcNAQkBFhxqYWx0bWFuQHNlY3VyZS1lbmRwb2ludHMuY29tMIIBIjANBgkqhkiG9w0BAQEF
AAOCAQ8AMIIBCgKCAQEAqM/KsX6t2M5RIr9tbBkd3HmnX7LqKIJ3i+9gRUPGi5gWnkL99kRa
bmfjm5sFRsxJQjUwBk3qigSgfT1ROPoFeExLwH3zuWzzV0o4hSmFMXaM7OryCwWZ4DXnIcxE
APbQupRZ4vK9iTwUI1DD52wowyFf5Z2zE6ZFj/WM+Qkvqg3yQTChQWQ5f3859ZaHqgWdXp9q
IQdQoMwW5uxv1fQpUPyIh5HsLHPuLiRepFfAyWnZ/KjFwMgmsFpNjycD76DenfgiRa3Vrmmj
qDrzeciBKcanBgjC4mF4GpIRl8thZr7Up0PpKfbtR33gapS89UbmMLrNULkQnZpa6gC76c25
XwIDAQABozkwNzAnBgNVHREEIDAegRxqYWx0bWFuQHNlY3VyZS1lbmRwb2ludHMuY29tMAwG
A1UdEwEB/wQCMAAwDQYJKoZIhvcNAQEEBQADgYEAO287/+2OIBXpUka2g22uXfiYHmMbIZOK
i3PDyuIHrPPQa9jXVtxYOanokqJ/6lUb7wsjJIOixu69rXeq7YfBhTEGZZzsKg13aT2RGVyt
9sNRNOsY5kryrKo2pPeBBlMBr1mdQjSy70bL+fajltBxph2pHeG02um90tI85aEgs5cwggM/
MIICqKADAgECAgENMA0GCSqGSIb3DQEBBQUAMIHRMQswCQYDVQQGEwJaQTEVMBMGA1UECBMM
V2VzdGVybiBDYXBlMRIwEAYDVQQHEwlDYXBlIFRvd24xGjAYBgNVBAoTEVRoYXd0ZSBDb25z
dWx0aW5nMSgwJgYDVQQLEx9DZXJ0aWZpY2F0aW9uIFNlcnZpY2VzIERpdmlzaW9uMSQwIgYD
VQQDExtUaGF3dGUgUGVyc29uYWwgRnJlZW1haWwgQ0ExKzApBgkqhkiG9w0BCQEWHHBlcnNv
bmFsLWZyZWVtYWlsQHRoYXd0ZS5jb20wHhcNMDMwNzE3MDAwMDAwWhcNMTMwNzE2MjM1OTU5
WjBiMQswCQYDVQQGEwJaQTElMCMGA1UEChMcVGhhd3RlIENvbnN1bHRpbmcgKFB0eSkgTHRk
LjEsMCoGA1UEAxMjVGhhd3RlIFBlcnNvbmFsIEZyZWVtYWlsIElzc3VpbmcgQ0EwgZ8wDQYJ
KoZIhvcNAQEBBQADgY0AMIGJAoGBAMSmPFVzVftOucqZWh5owHUEcJ3f6f+jHuy9zfVb8hp2
vX8MOmHyv1HOAdTlUAow1wJjWiyJFXCO3cnwK4Vaqj9xVsuvPAsH5/EfkTYkKhPPK9Xzgnc9
A74r/rsYPge/QIACZNenprufZdHFKlSFD0gEf6e20TxhBEAeZBlyYLf7AgMBAAGjgZQwgZEw
EgYDVR0TAQH/BAgwBgEB/wIBADBDBgNVHR8EPDA6MDigNqA0hjJodHRwOi8vY3JsLnRoYXd0
ZS5jb20vVGhhd3RlUGVyc29uYWxGcmVlbWFpbENBLmNybDALBgNVHQ8EBAMCAQYwKQYDVR0R
BCIwIKQeMBwxGjAYBgNVBAMTEVByaXZhdGVMYWJlbDItMTM4MA0GCSqGSIb3DQEBBQUAA4GB
AEiM0VCD6gsuzA2jZqxnD3+vrL7CF6FDlpSdf0whuPg2H6otnzYvwPQcUCCTcDz9reFhYsPZ
Ohl+hLGZGwDFGguCdJ4lUJRix9sncVcljd2pnDmOjCBPZV+V2vf3h9bGCE6u9uo05RAaWzVN
d+NWIXiC3CEZNd4ksdMdRv9dX2VPMYIDOzCCAzcCAQEwaTBiMQswCQYDVQQGEwJaQTElMCMG
A1UEChMcVGhhd3RlIENvbnN1bHRpbmcgKFB0eSkgTHRkLjEsMCoGA1UEAxMjVGhhd3RlIFBl
cnNvbmFsIEZyZWVtYWlsIElzc3VpbmcgQ0ECAw7NrTAJBgUrDgMCGgUAoIIBpzAYBgkqhkiG
9w0BCQMxCwYJKoZIhvcNAQcBMBwGCSqGSIb3DQEJBTEPFw0wNjA0MTMyMDA4NTNaMCMGCSqG
SIb3DQEJBDEWBBQqCYibUbGWffVtLeQ09TWG8ckBPTBSBgkqhkiG9w0BCQ8xRTBDMAoGCCqG
SIb3DQMHMA4GCCqGSIb3DQMCAgIAgDANBggqhkiG9w0DAgIBQDAHBgUrDgMCBzANBggqhkiG
9w0DAgIBKDB4BgkrBgEEAYI3EAQxazBpMGIxCzAJBgNVBAYTAlpBMSUwIwYDVQQKExxUaGF3
dGUgQ29uc3VsdGluZyAoUHR5KSBMdGQuMSwwKgYDVQQDEyNUaGF3dGUgUGVyc29uYWwgRnJl
ZW1haWwgSXNzdWluZyBDQQIDDs2tMHoGCyqGSIb3DQEJEAILMWugaTBiMQswCQYDVQQGEwJa
QTElMCMGA1UEChMcVGhhd3RlIENvbnN1bHRpbmcgKFB0eSkgTHRkLjEsMCoGA1UEAxMjVGhh
d3RlIFBlcnNvbmFsIEZyZWVtYWlsIElzc3VpbmcgQ0ECAw7NrTANBgkqhkiG9w0BAQEFAASC
AQBxSQtHODahd+rM6WyCFRyYgj/rbyj247N7ZphJwbbKMCDah774bThBrfFBlwUbbFBx8SZI
IL6wqb70oYT1C8uzj4bFlryJWfuiEgAbZvM0jJ3zhRMQ4hE4peQz/BMtfjkWgYG/wbDM8Q3Z
h9jVQZlFzuKsO3zkGSS2iFcgpqKSEMHMtwo6ZbP14X2Q+YBQQPIzqTgFHGUU69gt8SHXiV8p
rOayOj2BxQqBXSJKQ/dDleYV1ZOpaUFZbKit22tz2tRy/UkOT41yjO4vHvhY2+iQYRi0kLqy
dP78YSDcp+8v1hGD0Xm74sfU95BzgwV9bz+V3Egnkx9z856uXlMp+8RXAAAAAAAA
--------------ms060607070302030107040402--