[OpenAFS-devel] OpenAFS, Linux and truncate_inode_pages()
chas williams - CONTRACTOR
chas@cmf.nrl.navy.mil
Wed, 01 Mar 2006 09:36:55 -0500
In message <44056FA3.9050106@pclella.cern.ch>, Rainer Toebbicke writes:
>fails in the sense that the file just copied no longer equals the
>original file, i.e. the compare fails. The data at the place where it
>starts to differ is data of the same file (or one of its other
>copies), meaning this is not a question of unallocated buffers or such.
i don't quite get what you are saying here, but it could be a problem
with the cache manager and not necessarily specific to the linux
afs client. when you get a corrupted file, is it corrupted only
on the local client, or did the corrupted version get sent back
to the server?
>I already have a osi_Assert(down_trylock(&ip->i_sem) == 0) in
>osi_VM_FlushPages(). So far no smoke. Not a surprise actually.
based on some testing: osi_VM_FlushVCache() can safely take i_sem in
all cases, osi_VM_TryToSmush() already holds i_sem in all cases,
osi_VM_FlushPages() can safely take i_sem in all cases, and
osi_VM_Truncate() already holds i_sem in all cases.
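a check like this can be made with a trylock probe at the top of each function. the following is only a user-space sketch of that idea, not the instrumentation actually used: struct fake_inode and probe_i_sem() are hypothetical names, and a pthread mutex stands in for the kernel semaphore. like down_trylock(), pthread_mutex_trylock() returns 0 when the lock was acquired.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for struct inode; only the lock matters here. */
struct fake_inode {
    pthread_mutex_t i_sem;
};

/*
 * Report whether i_sem is held on entry to `caller`.
 * Returns 1 if it was already locked, 0 if it was free, -1 on any
 * other error. The lock state is left unchanged in every case.
 */
static int probe_i_sem(struct fake_inode *ip, const char *caller)
{
    int rc;

    assert(ip != NULL);

    rc = pthread_mutex_trylock(&ip->i_sem);
    if (rc == 0) {
        /* We acquired it, so nobody held it; give it straight back. */
        pthread_mutex_unlock(&ip->i_sem);
        printf("%s: i_sem was NOT locked\n", caller);
        return 0;
    }
    if (rc == EBUSY) {
        /* Someone (possibly our own caller) already holds it. */
        printf("%s: i_sem was already locked\n", caller);
        return 1;
    }
    return -1;
}
```

note that the probe only says whether somebody holds the lock, not who. in the kernel, pairing the message with dump_stack() shows which code path was taken.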
note that all of these are going to be serialized/protected by
AFS_GLOCK (and other vnode locks). of course, this doesn't protect
against whatever the linux kernel internals might do, but generally
they do run up against AFS_GLOCK at some point.
osi_VM_StoreAllSegments() does have me a bit concerned. it drops
AFS_GLOCK. further, it appears to have three code paths:
i_sem was already locked
[<c0104a78>] dump_stack+0x17/0x19
[<f8d2c1e1>] osi_VM_Truncate+0x68/0x71 [libafs]
[<f8cf282f>] afs_TruncateAllSegments+0xef/0x6e2 [libafs]
[<f8cfc9a9>] afs_setattr+0x162/0x3b9 [libafs]
[<f8d2ba4a>] afs_notify_change+0x86/0x12e [libafs]
[<c017ec5a>] notify_change+0x2de/0x3ad
[<c01628c7>] do_truncate+0x63/0x7e
[<c0173668>] may_open+0x1b6/0x222
[<c017375c>] open_namei+0x88/0x632
[<c0163986>] filp_open+0x1e/0x39
[<c0163c66>] do_sys_open+0x35/0xb1
[<c0163cf3>] sys_open+0x11/0x13
[<c0103b59>] syscall_call+0x7/0xb
i_sem was NOT locked
[<c0104a78>] dump_stack+0x17/0x19
[<f8d2c116>] osi_VM_StoreAllSegments+0x13c/0x1a1 [libafs]
[<f8cf0307>] afs_StoreAllSegments+0xa9/0x1ff9 [libafs]
[<f8d2d797>] afs_linux_flush+0xf1/0x204 [libafs]
[<c0163d3d>] filp_close+0x26/0x67
[<c0163df1>] sys_close+0x73/0x95
[<c0103b59>] syscall_call+0x7/0xb
i_sem was NOT locked
[<c0104a78>] dump_stack+0x17/0x19
[<f8d2c116>] osi_VM_StoreAllSegments+0x13c/0x1a1 [libafs]
[<f8cf0307>] afs_StoreAllSegments+0xa9/0x1ff9 [libafs]
[<f8d0b6ed>] afs_StoreOnLastReference+0xed/0x11f [libafs]
[<f8ce1a9e>] BStore+0x60/0xe0 [libafs]
[<f8ce211f>] afs_BackgroundDaemon+0x340/0x3aa [libafs]
[<f8d3003d>] afsd_thread+0x432/0x780 [libafs]
[<c0101f11>] kernel_thread_helper+0x5/0xb
i suspect sys_open/sys_close are fairly well synchronized. however,
the afs background daemon is a little different. i haven't been
able to determine whether i_sem should be held during the filemap
operations to protect against other page operations.
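if it did turn out that i_sem should be held there, the lock ordering might look roughly like the user-space sketch below. this is an assumption for illustration, not a proposed patch: glock stands in for AFS_GLOCK, struct sketch_vnode and store_all_segments_sketch() are hypothetical, and zeroing a counter stands in for the filemap operations.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for AFS_GLOCK. */
static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical vnode/inode pair reduced to what the sketch needs. */
struct sketch_vnode {
    pthread_mutex_t i_sem;   /* stand-in for inode->i_sem */
    int dirty_pages;         /* stand-in for pages awaiting writeback */
};

/*
 * Caller must hold glock and must NOT hold i_sem.
 * Returns the number of pages "written"; glock is held again on return.
 */
static int store_all_segments_sketch(struct sketch_vnode *vp)
{
    int written;

    assert(vp != NULL);

    pthread_mutex_unlock(&glock);     /* drop the global lock first */

    pthread_mutex_lock(&vp->i_sem);   /* ASSUMPTION: hold i_sem here */
    written = vp->dirty_pages;        /* stand-in for the filemap ops */
    vp->dirty_pages = 0;
    pthread_mutex_unlock(&vp->i_sem);

    pthread_mutex_lock(&glock);       /* retake before returning */
    return written;
}
```

this shape only fits the paths where i_sem was not locked on entry (the close and background daemon traces). a caller that already holds i_sem would deadlock on the second lock.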