[OpenAFS-devel] reproducible problem during cache flush

Neulinger, Nathan nneul@umr.edu
Tue, 30 Jul 2002 10:49:56 -0500


Running iozone -a on a recent protos branch build (probably trunk as
well), with a 25 MB cache, I usually see an AFS lockup at 32768/8192;
with a 10 MB cache, it happens at 16384/16384.

After that hang, kdump -global shows:

        afs_mariner = 0x0
        afs_freeVCList = 0xe0b08c50 XXX
        freeDCList = 0x978
        freeDCCount = 0x92a (2346)
        discardDCList = 0xffffffff
        discardDCCount = 0x0 (0)
        freeDSList = 0xe105b370 XXXX
        cacheInode = 0xc (12)
        volumeInode = 0xd (13)
        cacheDiskType = 0x0 (0)
        afs_indexCounter = 0x0.BA80B (0.763915)
        afs_cacheFiles = 0x9c4 (2500)
        afs_cacheBlocks = 0x2710 (10000)
        afs_cacheStats = 0x2710 (10000)
        afs_blocksUsed = 0x2680 (9856)
        afs_blocksDiscarded = 0x0 (0)
        afs_fsfragsize = 0xfff
        afs_WaitForCacheDrain = 0x1 (1)
        afs_CacheTooFull = 0x1 (1)
        pagCounter = 0x2 (2)
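
As a rough cross-check of those numbers, here is a minimal Python sketch
that pulls a few values out of kdump-style lines and works out how full
the cache is. It is not part of kdump or OpenAFS: the helper name
parse_kdump is hypothetical, and reading freeDCCount as "free entries
out of afs_cacheFiles" is an assumption about what the counters mean.

```python
import re

# A few lines copied from the dump above.
KDUMP = """\
afs_cacheFiles = 0x9c4 (2500)
afs_cacheBlocks = 0x2710 (10000)
afs_blocksUsed = 0x2680 (9856)
freeDCCount = 0x92a (2346)
"""


def parse_kdump(text):
    """Return {name: int} for lines shaped like 'name = 0xHEX ...'.

    Hypothetical helper for illustration only.
    """
    values = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+)\s*=\s*(0x[0-9a-fA-F]+)\b", line)
        if m:
            values[m.group(1)] = int(m.group(2), 16)
    return values


vals = parse_kdump(KDUMP)

# Fraction of cache blocks in use.
block_fill = vals["afs_blocksUsed"] / vals["afs_cacheBlocks"]

# Fraction of dcache entries still free (assumes freeDCCount is
# counted against afs_cacheFiles).
free_dcache = vals["freeDCCount"] / vals["afs_cacheFiles"]

print(f"blocks used: {block_fill:.2%}")
print(f"dcache entries free: {free_dcache:.2%}")
```

Under those assumptions the cache is almost full by block count (9856 of
10000) while most dcache entries are still free (2346 of 2500), which
seems consistent with afs_CacheTooFull and afs_WaitForCacheDrain both
being set.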

So it appears that something happening during a cache flush is
resulting in a deadlock. It does not recover from this hang. The rest
of the machine appears fine, but any access to AFS blocks forever.

I do not know if this is related to the other problems I was seeing with
our samba servers, but the lockup feel is the same.

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216