[OpenAFS-devel] openafs can crash in linux symlink code on kernels prior to 2.6.13? (RT #56542)
Christopher Allen Wing
wingc@engin.umich.edu
Wed, 21 Mar 2007 11:57:20 -0400 (EDT)
[ following up ]
I can reproduce the bug now:
1. Create a bunch of symlinks somewhere in AFS
2. Run code which follows these symlinks in a tight loop
3. Simultaneously run 'fs flushvol .' inside the volume containing
the symlinks.
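Steps 1 and 2 could be driven by something like the following. This is only a sketch: the helper names make_links() and follow_links(), the link names, and the choice of stat() to force symlink resolution are illustrative assumptions, not the code that was used to reproduce the crash.

```c
#define _XOPEN_SOURCE 700   /* expose symlink() and PATH_MAX under strict -std modes */
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `count` symlinks named link0..link<count-1> inside `dir`, all
 * pointing at `target`. Returns how many links were created. */
static int make_links(const char *dir, const char *target, int count)
{
    char path[PATH_MAX];
    int made = 0;

    for (int i = 0; i < count; i++) {
        snprintf(path, sizeof(path), "%s/link%d", dir, i);
        unlink(path);                 /* ignore failure: the link may not exist yet */
        if (symlink(target, path) == 0)
            made++;
    }
    return made;
}

/* Resolve every link `rounds` times. stat() follows symlinks, so each call
 * makes the kernel read the link text during path lookup.
 * Returns the number of successful resolutions. */
static long follow_links(const char *dir, int count, long rounds)
{
    char path[PATH_MAX];
    struct stat st;
    long ok = 0;

    for (long r = 0; r < rounds; r++) {
        for (int i = 0; i < count; i++) {
            snprintf(path, sizeof(path), "%s/link%d", dir, i);
            if (stat(path, &st) == 0)
                ok++;
        }
    }
    return ok;
}
```

Calling make_links(dir, ".", 64) once and then follow_links(dir, 64, rounds) with a large round count from main(), with dir somewhere in AFS, gives the tight loop of step 2. Step 3 ('fs flushvol .') is then run from a second shell while the loop is going.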
The kernel panics in page_put_link(). I believe the crash is triggered
by this call chain:
    PFlushVolumeData()
      -> afs_TryToSmush()
        -> osi_VM_TryToSmush()
          -> invalidate_inode_pages()
invalidate_inode_pages() wipes out the page cache entries that hold the
cached symlink text. The broken page_*symlink() API in early 2.6 kernels
(and possibly 2.4 as well?) then dies because it cannot handle the page
disappearing in the middle of a path lookup: page_put_link() looks the
page up again and hits BUG() when it is gone.
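The difference between the two API styles can be modelled in user space. The sketch below is not kernel code: struct page, struct mapping and every function name in it are hypothetical stand-ins, and it assumes, as described above, that invalidation can drop the cache entry while a lookup still holds a reference to the page. In the old style put_link has to find the page again; in the cookie style it is handed the page that follow_link already found.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a page and a one-slot page cache. */
struct page { const char *text; int refcount; };
struct mapping { struct page *slot0; };

/* Look up the cached page and take a reference (models find_get_page()). */
static struct page *lookup_page(struct mapping *m)
{
    if (m->slot0)
        m->slot0->refcount++;
    return m->slot0;
}

static void release_page(struct page *p) { p->refcount--; }

/* Drop the cache entry; the page object itself stays alive (models the
 * effect attributed to invalidate_inode_pages() in this thread). */
static void invalidate(struct mapping *m) { m->slot0 = NULL; }

/* Old style: follow_link keeps no handle on the page it pinned. */
static int old_follow_link(struct mapping *m)
{
    return lookup_page(m) ? 0 : -1;
}

/* Old style: put_link must look the page up a second time. */
static int old_put_link(struct mapping *m)
{
    struct page *p = lookup_page(m);
    if (!p)
        return -1;        /* the real kernel calls BUG() at this point */
    release_page(p);      /* drop the reference from this lookup */
    release_page(p);      /* drop the reference taken in follow_link */
    return 0;
}

/* Cookie style: follow_link returns the pinned page as an opaque cookie. */
static void *new_follow_link(struct mapping *m)
{
    return lookup_page(m);
}

/* Cookie style: put_link releases exactly what it was handed. */
static int new_put_link(void *cookie)
{
    if (!cookie)
        return -1;
    release_page((struct page *)cookie);
    return 0;
}
```

With an invalidate() call between follow and put, old_put_link() fails its second lookup, while new_put_link() still releases the page it was given and never touches the cache.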
I'll work up a patch today. Here's an autoconf test for the "new" Linux
symlink API:
diff -uNr openafs-stable-1_4_x.orig/acinclude.m4 openafs-stable-1_4_x/acinclude.m4
--- openafs-stable-1_4_x.orig/acinclude.m4 2007-02-22 16:48:58.000000000 -0500
+++ openafs-stable-1_4_x/acinclude.m4 2007-03-20 22:28:15.000000000 -0400
@@ -606,6 +606,7 @@
LINUX_IOP_I_CREATE_TAKES_NAMEIDATA
LINUX_IOP_I_LOOKUP_TAKES_NAMEIDATA
LINUX_IOP_I_PERMISSION_TAKES_NAMEIDATA
+ LINUX_IOP_I_PUT_LINK_TAKES_COOKIE
LINUX_DOP_D_REVALIDATE_TAKES_NAMEIDATA
LINUX_AOP_WRITEBACK_CONTROL
LINUX_FS_STRUCT_FOP_HAS_FLOCK
@@ -833,6 +834,9 @@
if test "x$ac_cv_linux_func_i_permission_takes_nameidata" = "xyes" ; then
AC_DEFINE(IOP_PERMISSION_TAKES_NAMEIDATA, 1, [define if your iops.permission takes a nameidata argument])
fi
+ if test "x$ac_cv_linux_func_i_put_link_takes_cookie" = "xyes" ; then
+ AC_DEFINE(IOP_PUT_LINK_TAKES_COOKIE, 1, [define if your iops.put_link takes an opaque cookie])
+ fi
if test "x$ac_cv_linux_func_d_revalidate_takes_nameidata" = "xyes" ; then
AC_DEFINE(DOP_REVALIDATE_TAKES_NAMEIDATA, 1, [define if your dops.d_revalidate takes a nameidata argument])
fi
diff -uNr openafs-stable-1_4_x.orig/src/cf/linux-test4.m4 openafs-stable-1_4_x/src/cf/linux-test4.m4
--- openafs-stable-1_4_x.orig/src/cf/linux-test4.m4 2007-02-26 12:53:33.000000000 -0500
+++ openafs-stable-1_4_x/src/cf/linux-test4.m4 2007-03-20 22:20:06.000000000 -0400
@@ -644,6 +644,22 @@
AC_MSG_RESULT($ac_cv_linux_func_i_permission_takes_nameidata)])
+AC_DEFUN([LINUX_IOP_I_PUT_LINK_TAKES_COOKIE], [
+ AC_MSG_CHECKING([whether inode_operations.put_link takes an opaque cookie])
+ AC_CACHE_VAL([ac_cv_linux_func_i_put_link_takes_cookie], [
+ AC_TRY_KBUILD(
+[#include <linux/fs.h>
+#include <linux/namei.h>],
+[struct inode _inode;
+struct dentry _dentry;
+struct nameidata _nameidata;
+void *cookie;
+(void)_inode.i_op->put_link(&_dentry, &_nameidata, cookie);],
+ ac_cv_linux_func_i_put_link_takes_cookie=yes,
+ ac_cv_linux_func_i_put_link_takes_cookie=no)])
+ AC_MSG_RESULT($ac_cv_linux_func_i_put_link_takes_cookie)])
+
+
AC_DEFUN([LINUX_DOP_D_REVALIDATE_TAKES_NAMEIDATA], [
AC_MSG_CHECKING([whether dentry_operations.d_revalidate takes a nameidata])
AC_CACHE_VAL([ac_cv_linux_func_d_revalidate_takes_nameidata], [
On Fri, 16 Mar 2007, Christopher Allen Wing wrote:
> I opened up a bug ticket for this (RT #56542); I am sending it to
> openafs-devel because I'd appreciate some more eyes on this to confirm that
> my analysis is correct here.
>
> ----------------
> We recently saw a kernel crash on a machine running RHEL4, which hit the
> following assert in (linux)/fs/namei.c:
>
> void page_put_link(struct dentry *dentry, struct nameidata *nd)
> {
>         if (!IS_ERR(nd_get_link(nd))) {
>                 struct page *page;
>                 page = find_get_page(dentry->d_inode->i_mapping, 0);
>                 if (!page)
>                         BUG();