[OpenAFS-devel] OpenAFS, Linux and truncate_inode_pages()

Rainer Toebbicke rtb@pclella.cern.ch
Fri, 03 Mar 2006 10:04:54 +0100


chas williams - CONTRACTOR wrote:

>>>+    if (down_trylock(&ip->i_sem))
>>>+	need_unlock = 0;
>>>+
>>
>>This doesn't look sane to me! If you manage to down the i_sem then 
>>you're ok if the i_sem is really needed for filemap_fdatasync, however 
>>if you don't then you don't really know whether it's because you 
>>already did it yourself higher up or if "somebody else" (not 
>>necessarily AFS code) did and is then dangerously close to messing 
>>around with the structures.
> 
> 
> if you read my previous mail, there are three paths into this code
> section.  one already holds i_sem (the kernel takes it), and two
> others dont.  i could take i_sem outside this routine for the others
> (afs_linux_close/afs_linux_flush and BStore), but that's just changing
> more files than i would like.  for a test its simple enough to see if we
> should take the lock.
> 
> if you dont like the patch, dont try it.

Please do not misunderstand: I brought up this subject because the AFS 
kernel code misbehaves with respect to i_sem calling 
truncate_inode_pages. When asking for for a second (and third...) 
opinion I already pointed out that taking or not taking the Linux 
locks is just a matter of how tight AFS's control over the underlying 
Linux structures is.

I hadn't yet even noticed the issue involving filemap_fdatasync. It is 
a valuable piece of information that there are only three code paths 
from inside AFS to that routine. But it does not exclude that there 
are other places in the system that feel like they could do the same 
thing at the same time - hence my criticism on the way you propose to 
take the lock. Perhaps you already know that this can't be the case, 
but I didn't understand that.

Alas, I cannot currently just "try" this patch. I'm running on an 
heavily instrumented kernel code to track a corruption which does not 
just appear within a couple of hours. All I can say is that the i_sem 
lock in osi_VM_FlushPages() does not harm. I'll pick up the other 
locks at the next opportunity the way you suggest.

With respect to osi_VM_StoreAllSegments(): I'm reasonably sure that 
this does not solve the problem as I'm already able to reproduce it 
in a version of the client code that bypasses the VM layer completely.


-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985       Fax: +41 22 767 7155