[OpenAFS-devel] Cache inconsistency in client 1.4.8 and above

Felix Frank Felix.Frank@Desy.de
Mon, 20 Apr 2009 11:43:05 +0200 (CEST)


Hi again,

Sorry for posting so much, but I think this is interesting. This is how
mmap_test now manages to deadlock itself (with the patched antirecursion),
without even recursing into osi_VM_StoreAllSegments.

For the fun of it, I removed the call to posix_fadvise from mmap_test.
Writing then works for a 300MB file against a 50MB cache, but this seems to
lead to pdflush deadlocking again.
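
For readers without mmap_test at hand, here is a minimal sketch of the kind of
test being discussed: dirty a file through a shared mapping, then optionally
call posix_fadvise on it. This is not the actual mmap_test source. The function
name mmap_write_test, its parameters, and the choice of POSIX_FADV_DONTNEED are
assumptions; the advice value is guessed only because the trace below enters
writeback from sys_fadvise64_64.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for mmap_test (not the real program).
 * Dirties `size` bytes of `path` through a MAP_SHARED mapping and, if
 * `use_fadvise` is non-zero, calls posix_fadvise() on the file afterwards.
 * Returns 0 on success, -1 on any error.
 */
int mmap_write_test(const char *path, size_t size, int use_fadvise)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Extend the file first: storing past EOF in a mapping raises SIGBUS. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *map = mmap(NULL, size, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Dirty every page of the mapping. */
    memset(map, 0xAB, size);

    int rc = 0;

    /* Assumed advice value; posix_fadvise() returns an error number, not -1. */
    if (use_fadvise && posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) != 0)
        rc = -1;

    if (munmap(map, size) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Under these assumptions, mmap_write_test(path, 300 * 1024 * 1024, 0) with a
path inside AFS would correspond to the run without posix_fadvise described
above, and passing 1 as the last argument to the run that produced the trace.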

Regards
  - Felix

mmap_test     D ffff88002cfc9938     0  2019   1984 (NOTLB)
  ffff88002cfc98b8  0000000000000282  0000000100000001  ffff88003a514210
  0000000000000009  ffff88003fadc100  ffff88003fac4080  000000000081eabc
  ffff88003fadc2e8  ffffffff802639f9
Call Trace:
  [<ffffffff802639f9>] _spin_lock_irqsave+0x9/0x14
  [<ffffffff80228ae5>] sync_page+0x0/0x42
  [<ffffffff8025c5a5>] getnstimeofday+0x10/0x28
  [<ffffffff80228ae5>] sync_page+0x0/0x42
  [<ffffffff802626bb>] io_schedule+0x3f/0x67
  [<ffffffff80228b23>] sync_page+0x3e/0x42
  [<ffffffff802627ff>] __wait_on_bit_lock+0x36/0x66
  [<ffffffff802410ae>] __lock_page+0x5e/0x64
  [<ffffffff8029a01a>] wake_bit_function+0x0/0x23
  [<ffffffff8021ce2c>] mpage_writepages+0x13b/0x34d
  [<ffffffff881adbf6>] :libafs:afs_linux_writepage+0x0/0x8a
  [<ffffffff8025c9fb>] do_writepages+0x29/0x2f
  [<ffffffff80250ee9>] __filemap_fdatawrite_range+0x50/0x5b
  [<ffffffff881ab890>] :libafs:osi_VM_StoreAllSegments+0xbb/0x173
  [<ffffffff88174b28>] :libafs:afs_StoreAllSegments+0xaf/0x18c5
  [<ffffffff8804a945>] :ext3:ext3_discard_reservation+0x53/0x66
  [<ffffffff8020d691>] dput+0x84/0x114
  [<ffffffff80212fd7>] __fput+0x16c/0x198
  [<ffffffff8022cca6>] mntput_no_expire+0x19/0x89
  [<ffffffff80223c6a>] filp_close+0x5c/0x64
  [<ffffffff8818c7c9>] :libafs:afs_UFSWrite+0x82e/0x84b
  [<ffffffff88173e0f>] :libafs:PagInCred+0x30/0xa8
  [<ffffffff881abd1d>] :libafs:afs_linux_writepage_sync+0x300/0x3f3
  [<ffffffff881adc57>] :libafs:afs_linux_writepage+0x61/0x8a
  [<ffffffff8021ce9c>] mpage_writepages+0x1ab/0x34d
  [<ffffffff881adbf6>] :libafs:afs_linux_writepage+0x0/0x8a
  [<ffffffff8025c9fb>] do_writepages+0x29/0x2f
  [<ffffffff80250ee9>] __filemap_fdatawrite_range+0x50/0x5b
  [<ffffffff802bcc78>] sys_fadvise64_64+0x146/0x187
  [<ffffffff8025f106>] system_call+0x86/0x8b
  [<ffffffff8025f080>] system_call+0x0/0x8b
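
One way to read the trace above, from the bottom up: the outer
mpage_writepages locks a page and calls afs_linux_writepage on it; that path
reaches osi_VM_StoreAllSegments, which starts a second writeback pass; the
inner mpage_writepages then sleeps in __lock_page. The page lock has no notion
of an owner, so a task waiting for a page it locked itself never wakes up. The
user-space model below only illustrates that pattern. It is a sketch, not
kernel code: page_model and all function names are made up, and whether the
inner pass really waits on the very page held by the outer pass is an
assumption, not something the trace proves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical model of a page: a lock bit without an owner, and a dirty bit. */
struct page_model {
    bool locked;
    bool dirty;
};

/* Non-recursive lock attempt: fails if anyone, including the caller, holds it. */
static bool try_lock_page(struct page_model *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void unlock_page(struct page_model *p)
{
    assert(p->locked);
    p->locked = false;
}

/*
 * Models the inner writeback pass. Returns the index of the first dirty page
 * that could not be locked (where a real lock_page() would sleep), or -1 if
 * every dirty page was written.
 */
int flush_dirty_pages(struct page_model *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!pages[i].dirty)
            continue;
        if (!try_lock_page(&pages[i]))
            return (int)i;
        pages[i].dirty = false;   /* "write" the page */
        unlock_page(&pages[i]);
    }
    return -1;
}

/*
 * Models the outer pass: lock page `idx`, then run a nested flush while the
 * lock is still held. Returns what the nested flush returned.
 */
int writepage_with_nested_flush(struct page_model *pages, size_t n, size_t idx)
{
    assert(idx < n);
    if (!try_lock_page(&pages[idx]))
        return (int)idx;

    int blocked_on = flush_dirty_pages(pages, n);
    if (blocked_on < 0)
        pages[idx].dirty = false;

    unlock_page(&pages[idx]);
    return blocked_on;
}
```

With four dirty pages, writepage_with_nested_flush(pages, NPAGES, 2) reports
that the nested pass got stuck on page 2, the page the caller itself holds. In
the kernel the equivalent wait would not return, which matches a task sitting
in state D.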


On Mon, 20 Apr 2009, Felix Frank wrote:

> Last week, I posted a patch to RT #124627, only to notice this morning that
> I'd been testing with a large-ish cache.
> Also, there appears to be some regression in my own fiddlings, as #124627
> isn't solved at all. For what it's worth, the deadlock as reported in
> #120491 is apparently prevented, although mmap_test still gets blocked in
> some I/O-wait state.
>
> Cache consistency with mmap'ed writes can apparently only be achieved when
> no calls to afs_linux_writepage_sync are aborted at all.
>
> I think I still haven't truly understood the exact idea of the antirecursion
> patch. To me, it seems to omit writing some precious data.
>
> Regards
> - Felix