[OpenAFS-devel] openafs can crash in linux symlink code on kernels prior to 2.6.13? (RT #56542)

Christopher Allen Wing wingc@engin.umich.edu
Wed, 21 Mar 2007 11:57:20 -0400 (EDT)


[ following up ]

I can reproduce the bug now:

 	1. Create a bunch of symlinks somewhere in AFS
 	2. Run code which follows these symlinks in a tight loop
 	3. Simultaneously run 'fs flushvol .' inside the volume containing
 	   the symlinks.
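
For anyone who wants to try steps 1 and 2, here is a minimal userspace sketch.  The function names, the "target"/"linkN" file names and the path buffer size are my own illustrative choices, not part of any existing test suite, and I am assuming that a plain stat() on each link is enough to drive the kernel's symlink-following code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 1: create nlinks symlinks (link0, link1, ...) inside dir, all
 * pointing at a regular file named "target" in the same directory.
 * Returns 0 on success, -1 on the first failure. */
static int make_symlinks(const char *dir, int nlinks)
{
    char path[4096];
    FILE *f;
    int i;

    assert(dir != NULL && nlinks > 0);

    snprintf(path, sizeof(path), "%s/target", dir);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fclose(f);

    for (i = 0; i < nlinks; i++) {
        snprintf(path, sizeof(path), "%s/link%d", dir, i);
        unlink(path);                 /* ignore errors: allows re-runs */
        if (symlink("target", path) != 0)
            return -1;
    }
    return 0;
}

/* Step 2: resolve every symlink 'rounds' times.  stat() follows the
 * link, so each call goes through the kernel's symlink lookup path.
 * Returns the number of lookups that succeeded. */
static long follow_symlinks(const char *dir, int nlinks, long rounds)
{
    char path[4096];
    struct stat st;
    long ok = 0;
    long r;
    int i;

    for (r = 0; r < rounds; r++) {
        for (i = 0; i < nlinks; i++) {
            snprintf(path, sizeof(path), "%s/link%d", dir, i);
            if (stat(path, &st) == 0)
                ok++;
        }
    }
    return ok;
}
```

To attempt a reproduction, call make_symlinks() on a directory in AFS, then follow_symlinks() with a large round count while 'fs flushvol .' runs repeatedly in another shell in the same volume (step 3).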


The kernel panics in page_put_link().  I believe the crash is triggered 
by:

 	PFlushVolumeData()
 		->afs_TryToSmush()
 			->osi_VM_TryToSmush()
 				->invalidate_inode_pages()

and invalidate_inode_pages() wipes out the page cache entries which 
contain the cached symlink text.  The page_follow_link()/page_put_link() 
helpers in early 2.6 kernels (and also 2.4?) cannot handle this happening 
in the middle of a path lookup: page_put_link() looks the page up again, 
finds nothing, and hits BUG().
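
To make the suspected race concrete, here is a small userspace model of it.  This is not kernel code: the model_* types and functions are hypothetical stand-ins I made up, the "page cache" is a single slot, and the reference counting is simplified.  It only illustrates why a put_link that looks the page up again can fail after an invalidation, while a put_link that is handed the page as an opaque cookie does not depend on the cache at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page, the inode's mapping and
 * struct nameidata. */
struct model_page      { int refcount; const char *text; };
struct model_mapping   { struct model_page *slot0; };  /* one-entry cache */
struct model_nameidata { const char *link; };

/* follow_link: pin the cached page and expose the symlink text.
 * The returned pointer plays the role of the opaque cookie. */
static void *model_follow_link(struct model_mapping *m,
                               struct model_nameidata *nd)
{
    struct model_page *p = m->slot0;

    assert(p != NULL);
    p->refcount++;          /* reference held for the path walk */
    nd->link = p->text;
    return p;
}

/* Stand-in for the invalidation: the cache forgets the page and drops
 * its own reference, but the page survives while the walk pins it. */
static void model_invalidate(struct model_mapping *m)
{
    if (m->slot0 != NULL) {
        m->slot0->refcount--;
        m->slot0 = NULL;
    }
}

/* Old style: find the page again through the cache.  Returns -1 where
 * the real code would call BUG(). */
static int model_put_link_old(struct model_mapping *m,
                              struct model_nameidata *nd)
{
    struct model_page *p = m->slot0;

    if (p == NULL)
        return -1;
    p->refcount--;
    nd->link = NULL;
    return 0;
}

/* Cookie style: release exactly the page follow_link pinned. */
static int model_put_link_cookie(struct model_nameidata *nd, void *cookie)
{
    struct model_page *p = cookie;

    p->refcount--;
    nd->link = NULL;
    return 0;
}
```

In this model, calling model_follow_link(), then model_invalidate(), then model_put_link_old() returns -1, whereas model_put_link_cookie() with the saved cookie still releases the page cleanly.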

I'll work up a patch today.  Here's an autoconf test for the "new" Linux 
symlink API:


diff -uNr openafs-stable-1_4_x.orig/acinclude.m4 openafs-stable-1_4_x/acinclude.m4
--- openafs-stable-1_4_x.orig/acinclude.m4	2007-02-22 16:48:58.000000000 -0500
+++ openafs-stable-1_4_x/acinclude.m4	2007-03-20 22:28:15.000000000 -0400
@@ -606,6 +606,7 @@
  	  	 LINUX_IOP_I_CREATE_TAKES_NAMEIDATA
  	  	 LINUX_IOP_I_LOOKUP_TAKES_NAMEIDATA
  	  	 LINUX_IOP_I_PERMISSION_TAKES_NAMEIDATA
+	  	 LINUX_IOP_I_PUT_LINK_TAKES_COOKIE
  	  	 LINUX_DOP_D_REVALIDATE_TAKES_NAMEIDATA
  	  	 LINUX_AOP_WRITEBACK_CONTROL
  		 LINUX_FS_STRUCT_FOP_HAS_FLOCK
@@ -833,6 +834,9 @@
  		 if test "x$ac_cv_linux_func_i_permission_takes_nameidata" = "xyes" ; then
  		  AC_DEFINE(IOP_PERMISSION_TAKES_NAMEIDATA, 1, [define if your iops.permission takes a nameidata argument])
  		 fi
+		 if test "x$ac_cv_linux_func_i_put_link_takes_cookie" = "xyes" ; then
+		  AC_DEFINE(IOP_PUT_LINK_TAKES_COOKIE, 1, [define if your iops.put_link takes an opaque cookie])
+		 fi
  		 if test "x$ac_cv_linux_func_d_revalidate_takes_nameidata" = "xyes" ; then
  		  AC_DEFINE(DOP_REVALIDATE_TAKES_NAMEIDATA, 1, [define if your dops.d_revalidate takes a nameidata argument])
  		 fi
diff -uNr openafs-stable-1_4_x.orig/src/cf/linux-test4.m4 openafs-stable-1_4_x/src/cf/linux-test4.m4
--- openafs-stable-1_4_x.orig/src/cf/linux-test4.m4	2007-02-26 12:53:33.000000000 -0500
+++ openafs-stable-1_4_x/src/cf/linux-test4.m4	2007-03-20 22:20:06.000000000 -0400
@@ -644,6 +644,22 @@
    AC_MSG_RESULT($ac_cv_linux_func_i_permission_takes_nameidata)])


+AC_DEFUN([LINUX_IOP_I_PUT_LINK_TAKES_COOKIE], [
+  AC_MSG_CHECKING([whether inode_operations.put_link takes an opaque cookie])
+  AC_CACHE_VAL([ac_cv_linux_func_i_put_link_takes_cookie], [
+    AC_TRY_KBUILD(
+[#include <linux/fs.h>
+#include <linux/namei.h>],
+[struct inode _inode;
+struct dentry _dentry;
+struct nameidata _nameidata;
+void *cookie;
+(void)_inode.i_op->put_link(&_dentry, &_nameidata, cookie);],
+      ac_cv_linux_func_i_put_link_takes_cookie=yes,
+      ac_cv_linux_func_i_put_link_takes_cookie=no)])
+  AC_MSG_RESULT($ac_cv_linux_func_i_put_link_takes_cookie)])
+
+
  AC_DEFUN([LINUX_DOP_D_REVALIDATE_TAKES_NAMEIDATA], [
    AC_MSG_CHECKING([whether dentry_operations.d_revalidate takes a nameidata])
    AC_CACHE_VAL([ac_cv_linux_func_d_revalidate_takes_nameidata], [





On Fri, 16 Mar 2007, Christopher Allen Wing wrote:

> I opened up a bug ticket for this (RT #56542); I am sending it to 
> openafs-devel because I'd appreciate some more eyes on this to confirm that 
> my analysis is correct here.
>
> ----------------
> We recently saw a kernel crash on a machine running RHEL4, which hit the 
> following assert in (linux)/fs/namei.c:
>
> 	void page_put_link(struct dentry *dentry, struct nameidata *nd)
> 	{
> 	if (!IS_ERR(nd_get_link(nd))) {
> 		struct page *page;
> 		page = find_get_page(dentry->d_inode->i_mapping, 0);
> 		if (!page)
> 			BUG();