[OpenAFS] OpenAFS client cache overrun?

Jeffrey Altman jaltman@your-file-system.com
Thu, 06 Mar 2014 17:20:45 -0500


Andrew previously explained the steps you can take to collect
information regarding the state of the AFS cache manager.  I never saw
any follow-up e-mails containing the requested state information.  If
the problem is a deadlock in the AFS cache manager, the only way to fix
it is to identify where the deadlock resides.

I wish to provide some advice regarding the use of Outlook Personal
Folders (.pst) over a network file system.  DON'T!  A quick Google
search of "outlook pst network file system" will show you that Microsoft
says not to do so, as do many other web sites.  The Outlook PST is an
indexed database that acts as a cache of data stored on mail servers.
It is not meant to be stored in a redirected portion of the user
profile.  It is meant to be local and small.

Accessing Outlook PST files over a network increases (not decreases) the
amount of network traffic generated by Outlook.  Outlook assumes the
file is on local disk.  The .pst file is the equivalent of a Microsoft
Access Database and access to it makes heavy use of byte range locks.
Byte range locking is not supported by OpenAFS.  The native Windows AFS
client goes to great lengths to simulate byte range lock semantics and
Windows share modes to ensure that data that is modified is flushed to
the file servers at the appropriate times.  I cannot say the same for
Samba sitting on top of a UNIX AFS cache manager.  The interactions
between the SMB protocol and the AFS protocol are not pretty.

You might want to try enabling AFS cache bypass

  http://docs.openafs.org/Reference/1/fs_bypassthreshold.html

for files over a couple of GBs.  That way when the PST files are
accessed they won't force the cached data for other users out of the
cache.
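
As a rough sketch of what setting that threshold could look like, the
snippet below only builds the `fs bypassthreshold` command line for a
2 GiB threshold; it does not run it.  The 2 GiB figure is just one
reading of "a couple of GBs", the helper name is hypothetical, and the
sketch assumes `-size` takes a byte count as described on the reference
page above.  Check the man page shipped with your client before using it.

```python
import shlex

GIB = 1024 ** 3  # bytes in one GiB


def bypass_threshold_command(threshold_bytes):
    """Hypothetical helper: build the argv for `fs bypassthreshold -size N`.

    Assumption: -size is expressed in bytes.
    """
    if threshold_bytes < 0:
        raise ValueError("threshold must be a non-negative byte count")
    return ["fs", "bypassthreshold", "-size", str(threshold_bytes)]


# Illustrative value only: files larger than 2 GiB would bypass the cache.
cmd = bypass_threshold_command(2 * GIB)
print(shlex.join(cmd))
```

The command is printed rather than executed so it can be reviewed (or
pasted into a root shell on one gateway) before it is rolled out to all
four hosts.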

In the end though, I would strongly recommend instructing your user
not to store Outlook PST files in /afs if they are going to be accessed
via an SMB share.

Jeffrey Altman



On 3/6/2014 4:32 PM, Eric Chris Garrison wrote:
> We upgraded the gateways mentioned in the original email to
> openafs-client-1.6.6-0.pre1 a while back, since there was a bugfix for
> cache overrun in it (thanks for the help, Derrick). And for a while it
> seemed like it had worked, our AFS clients on the gateway hosts weren't
> locking up.
>
> But the problem is back. We've had many lockups, requiring reboot, over
> the past month, usually happening in clusters, like a user locking up
> one host, then moving to another to lock it up.
>
> After taking "smbstatus" snapshots each time it locks up, I've finally
> found a common factor: large Outlook .pst files being locked:
>
> One host:
>
> 4229   46516   DENY_ALL   0x7019f   RDWR   NONE
>   /afs/iu.edu/home/b/e/xxxxxx
>   xxxxxx@ads.iu.edu/BL-ECON-WY214-1/Data/C/Users/xxxxxx/Documents/Outlook Files/Xxxx Xxxxxx E-Mail Archive (2006-2011) (2014_02_27 18_02_49 UTC).pst
>   Mon Mar  3 13:57:23 2014
>
> 12868   46516   DENY_ALL   0x7019f   RDWR   NONE
>   /afs/iu.edu/home/b/e/xxxxxx
>   xxxxxx@ads.iu.edu/BL-ECON-WY214-1/Data/C/Users/xxxxxx/Documents/Outlook Files/Xxxx Xxxxxx E-Mail Archive (2006-2011) (2014_02_27 18_02_49 UTC).pst
>   Mon Mar  3 16:30:32 2014
>
> 30686   46516   DENY_ALL   0x7019f   RDWR   NONE
>   /afs/iu.edu/home/b/e/xxxxxx
>   xxxxxx@ads.iu.edu/BL-ECON-WY214-1/Data/C/Users/xxxxxx/Documents/Outlook Files/Xxxx Xxxxxx E-Mail Archive (2006-2011) (2014_02_27 18_02_49 UTC).pst
>   Thu Mar  6 14:53:39 2014
>
> On another host:
>
> /home/b/e/xxxxxx
>   xxxxxx@ads.iu.edu/BL-ECON-WY214-1/Data/C/Users/xxxxxx/Documents/Outlook Files/Xxxx Xxxxxx E-Mail Archive (2006-2011) (2014_02_27 18_02_49 UTC).pst
>   Mon Mar  3 15:21:53 2014
>
> ecg-ss2:24849   46516   DENY_ALL   0x7019f   RDWR   NONE
>   /afs/iu.edu/home/b/e/xxxxxx
>   xxxxxx@ads.iu.edu/BL-ECON-WY214-1/Data/C/Users/xxxxxx/Documents/Outlook Files/Xxxx Xxxxxx E-Mail Archive (2006-2011) (2014_03_06 18_11_27 UTC).pst
>   Thu Mar  6 14:44:26 2014
>
> These are always present on each host that's locked up. Same .pst file,
> even. It is a 6.5 GB file. Our AFS client cache is 7GB in size on a 9GB
> partition.
>
> I'm writing to the user to see if he's doing anything extraordinary.
>
> Still looking for ideas. I haven't tried Kim Kaball's idea of lowering
> the cache size to 2.5GB, I may try that next, but I worry that it'll
> impact performance too much.
>
> Thanks!!!
>
> Chris Garrison
> Indiana University
> UITS Research Storage
>
> From: Chris Garrison <ecgarris@iu.edu>
> Date: Wednesday, November 20, 2013 4:47 PM
> To: openafs-info@openafs.org
> Subject: [OpenAFS] OpenAFS client cache overrun?
>
> Hello,
>
> We have some RHEL 5.5 servers with openafs-client-1.6.1-1 running. There
> are 4 of them in a round-robin DNS, with Apache and Samba sitting on top
> of the OpenAFS filesystem.
>
> The hosts' /etc/sysconfig/openafs files look like this:
>
>   # OpenAFS Client Configuration
>   AFSD_ARGS="-dynroot -fakestat-all -daemons 8 -chunksize 22"
>
> The hosts' /usr/vice/etc/cacheinfo files look like this:
>
>   /afs:/usr/vice/cache:7500000
>
> I realize it's better for users to all use the openafs client for their
> own OS, but we have a large base of users who insist on wanting to just
> map a drive without installing a client. We have been running like this
> for 8+ years now, it's not a new setup.
>
> Something has been locking up the openafs client in the past month or
> so.  The cache will show as more and more full in "df" and then at some
> point, AFS stops answering, and any attempt to do a directory listing or
> to access a file results in a zombie process.
>
> The zombie processes mount up fast, the load on the machine skyrockets,
> and the only solution seems to be to reboot.
>
> What could cause that lockup? It's usually only on one host at a time,
> and seems like it will "move" from host to host, even returning to the
> same host in the same day after reboot once in a while.
>
> I doubled the cache size on these hosts, and it seemed to slow things
> down, but we had another lockup today after a restart of all the clients
> on Sunday during a hardware upgrade on the SAN, so no host had been
> running more than 3 days.
>
> To me, it feels like maybe someone is forcing a huge file through and
> running the machine out of cache. Though if that's so, I wonder why it
> only just started happening after all these years. If nothing else, it
> seems like something new is going on with the user end that's causing it.
>
> Any help would be appreciated, anything from a fix by limiting something
> in the openafs client or the cache or ideas as to what someone could be
> doing. Because at this point, it's like a denial of service attack
> that's making lots of problems for us.
>
> Thank you,
>
> Chris Garrison
> Indiana University Research Storage
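
For a sense of scale, the numbers quoted above can be worked through
with a few lines of arithmetic.  This is a back-of-the-envelope sketch,
not a model of the cache manager: it assumes the third cacheinfo field
counts 1 KiB blocks, that `-chunksize 22` means chunks of 2**22 bytes,
and that "6.5 GB" is decimal gigabytes.  It also ignores any separate
limit on the number of cache files.  The function and variable names
are illustrative only.

```python
import math


def parse_cacheinfo(line):
    """Split a cacheinfo line into (mount point, cache dir, size in bytes)."""
    mount, cache_dir, blocks = line.strip().split(":")
    # Assumption: the third field is a count of 1 KiB blocks.
    return mount, cache_dir, int(blocks) * 1024


# Values taken from the quoted configuration.
mount, cache_dir, cache_bytes = parse_cacheinfo("/afs:/usr/vice/cache:7500000")
chunk_bytes = 2 ** 22            # from "-chunksize 22" (log2 of the chunk size)
pst_bytes = 6_500_000_000        # the "6.5 GB" .pst file, assumed decimal GB

chunks_in_cache = cache_bytes // chunk_bytes
chunks_for_pst = math.ceil(pst_bytes / chunk_bytes)
fraction = pst_bytes / cache_bytes

print(f"cache: {cache_bytes / 2**30:.2f} GiB, about {chunks_in_cache} chunks")
print(f"one .pst: about {chunks_for_pst} chunks, {fraction:.0%} of the cache")
```

Under those assumptions a single fully read .pst file accounts for
roughly 85% of the configured cache, which is consistent with the
observation that the cache fills up shortly before a host stops
responding.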

