[OpenAFS-devel] OpenAFS, Linux and truncate_inode_pages()
Rainer Toebbicke
rtb@pclella.cern.ch
Tue, 21 Feb 2006 11:25:12 +0100
In Linux kernel 2.6 the comments for truncate_inode_pages() say you
need to hold the i_sem lock before to call this, in kernel 2.4 there
is no mention but tracing back things a bit you find that de facto the
i_sem is always "downed" when truncate_inode_pages() is called.
In the AFS client code however this funtion is called in many places,
directly or through wrappers, and only osi_UFSTruncate() cares about
the i_sem.
One of the prominent occasions where this looks particular careless is
in afs_linux_read() (osi_FlushPages) prior to calling
generic_file_read(). With a printf() in osi_VM_FlushPages and a little
mickey-mousing you can show that at least through this code path
truncate_inode_pages() is called without the i_sem lock. My local
Linux guru frowns at this.
As long as there are sufficient AFS locks it might be ok to ignore the
i_sem lock for AFS inodes as nobody else is likely to mess around with
them, in the afs_linux_read case however all AFS locks which come
immediately to my mind (global lock, vcache lock) are dropped and
leave truncate_inode_pages() exposed as far as I can see.
The trivial fix will be to hack a down/up i_sem around this particular
instance, but I wonder whether anybody has given the locking some
deeper thought already?
Introducing a down/up wrap "down" in osi_VM_FlushPages requires
checking all levels through which this called to avoid a deadlock on
the i_sem. Similar suspects like osi_VM_Truncate and
osi_VM_FlushVCache have the same problem - a fast growing tree to
trace back.
(the background: I can show that for sufficiently large N and
concurrency the script "for i (1..N) { cp file.i file.i+1; cmp file.i
file.i+1 || die}" fails).
Comments?
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155