[OpenAFS] OpenAFS client crashes on RHEL 5.10 and RHEL 6.5

Andrew Deason openafs-info@openafs.org
Fri, 7 Mar 2014 10:03:14 -0600

It has been discovered that the OpenAFS client interacts poorly with a
change to the Linux kernel introduced in Red Hat Enterprise Linux
versions 5.10 and 6.5, which can cause a kernel panic with certain AFS
access patterns. Sites may want to exercise caution when considering
upgrading RHEL systems that are running OpenAFS clients.

While the mechanism that is causing this problem is understood and a
solution is being developed, fixing this issue is not straightforward
and may take some time. In the meantime, without a fix in place, the
following workarounds may help avoid encountering the issue:

 - Avoid using multiple different mountpoints to access the same data,
   since this can confuse the Linux VFS in some situations. This may be
   difficult to guarantee (since /afs/cellname and /afs/.cellname
   usually exist), but it may be possible to reduce such mountpoint
   usage in some scenarios.
 - Avoid running the RHEL 5.10 kernel or the RHEL 6.5 kernel on machines
   with OpenAFS clients. However, these kernel updates contain stability
   and security fixes, so holding back the upgrade on such machines may
   not be desirable.

Note that the RHEL 7 Beta is not affected by this issue, but it is not
known if the final release of RHEL 7 will be. No other distributions of
Linux are known to be affected by this (except any that are derived from
RHEL), and the problematic change to the Linux kernel is not in vanilla
upstream Linux kernel releases.

If your site has a support contract with Red Hat, you may wish to
inquire about this issue through your support channel. For reference,
the issue was introduced in RHEL5 in kernel 2.6.18-367.el5 with this change:

 - [fs] vfs: stop d_splice_alias creating directory aliases (J. Bruce Fields) [785916]

and in RHEL6 in kernel 2.6.32-408.el6 with this change:

 - [fs] vfs: stop d_splice_alias creating directory aliases (J. Bruce Fields) [820446]
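
For sites that want to audit which clients are running a kernel at or
past the builds named above, a minimal sketch of such a check might look
like the following. The function name and lookup table are hypothetical,
not part of OpenAFS. It assumes kernel release strings of the usual RHEL
form (for example "2.6.32-408.el6.x86_64"), and it assumes every later
build in the same series still carries the change, which will stop being
true if a later kernel reverts or fixes it.

```python
import re

# First kernel builds carrying the d_splice_alias change, taken from the
# changelog entries quoted above: (base version, dist tag) -> build number.
FIRST_AFFECTED_BUILD = {
    ("2.6.18", "el5"): 367,
    ("2.6.32", "el6"): 408,
}

# Matches e.g. "2.6.32-408.el6.x86_64" or "2.6.18-367.1.1.el5".
# Only the leading build number after the dash is compared.
_RELEASE_RE = re.compile(r"^(\d+\.\d+\.\d+)-(\d+)[\d.]*\.(el\d+)")


def may_be_affected(release):
    """Return True if a kernel release string is at or past the first
    build known to contain the change (hypothetical helper).

    ASSUMPTION: later builds in the same series keep the change.
    Unrecognized release strings are reported as not affected.
    """
    m = _RELEASE_RE.match(release)
    if m is None:
        return False
    base, build, dist = m.group(1), int(m.group(2)), m.group(3)
    first = FIRST_AFFECTED_BUILD.get((base, dist))
    return first is not None and build >= first
```

On a client, the running kernel's release string can be passed in with
may_be_affected(platform.release()) after importing the standard
platform module. A True result only means the kernel is in the affected
range; whether a panic occurs still depends on the access patterns
described earlier.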

For more details and future updates on this issue, see this RT ticket:

Andrew Deason