[OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

Stephan Wiesand stephan.wiesand@desy.de
Mon, 5 Feb 2018 18:31:02 +0100


> On 04.Feb 2018, at 02:11, Jeffrey Altman <jaltman@auristor.com> wrote:
> 
> On 2/2/2018 6:04 PM, Kodiak Firesmith wrote:
>> I'm relatively new to handling OpenAFS.  Are these problems part of a
>> normal "kernel release; openafs update" cycle and perhaps I'm getting
>> snagged just by being too early of an adopter?  I wanted to raise the
>> alarm on this and see if anything else was needed from me as the
>> reporter of the issue, but perhaps that's an overreaction to what is
>> just part of a normal process I just haven't been tuned into in prior
>> RHEL release cycles?
> 
> 
> Kodiak,
> 
> On RHEL, DKMS is safe to use for kernel modules that restrict themselves
> to using the restricted set of kernel interfaces (the RHEL KABI) that
> Red Hat has designated will be supported across the lifespan of the RHEL
> major version number.  OpenAFS is not such a kernel module.  As a result
> it is vulnerable to breakage each and every time a new kernel is shipped.

Jeffrey,

the usual way to use DKMS is to either have it build a module for a newly
installed kernel or install a prebuilt module for that kernel. It may be
possible to abuse it for providing a module built for another kernel, but
I think that won't happen accidentally.

You may be confusing DKMS with RHEL's "KABI tracking kmods". Those should
be safe to use within a RHEL minor release (and the SL packaging has been
using them like this since EL6.4), but aren't across minor releases (and
that's why the SL packaging modifies the kmod handling to require a build
for the minor release in question).
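
To illustrate the "require a build for the minor release in question" idea,
here is a minimal sketch. It is not the actual SL packaging logic. It
assumes that the first number after the hyphen in a RHEL kernel release
(693 in 3.10.0-693.1.1.el7) stays the same for all kernels of one minor
release; the function names base_build and same_minor_release are made up
for this example.

```python
import re

# Assumption: RHEL kernel releases look like "3.10.0-693.1.1.el7.x86_64",
# and the first number after the hyphen (693 here) is shared by all
# kernels belonging to the same minor release.
_RELEASE_RE = re.compile(r"^(\d+\.\d+\.\d+)-(\d+)((?:\.\d+)*)\.el(\d+)")


def base_build(kernel_release):
    """Return (upstream version, base build number, EL major version)."""
    m = _RELEASE_RE.match(kernel_release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % kernel_release)
    return m.group(1), int(m.group(2)), int(m.group(4))


def same_minor_release(built_for, running):
    """True if a kmod built for one kernel targets the minor release of the other."""
    return base_build(built_for) == base_build(running)
```

With this, a kmod built for 3.10.0-693.el7 would be accepted for
3.10.0-693.1.1.el7 but rejected for 3.10.0-830.el7, which would need its
own build.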

> There are two types of failures that can occur:
> 
> 1. a change results in failure to build the OpenAFS kernel module
>    for the new kernel
> 
> 2. a change results in the OpenAFS kernel module building and
>    successfully loading but failing to operate correctly

The latter shouldn't happen within a minor release, but can across
minor releases.

> It is the second of these possibilities that has taken place with the
> release of the 3.10.0-830.el7 kernel shipped as part of the RHEL 7.5 beta.
> 
> Are you an early adopter of RHEL 7.5 beta?  Absolutely, it's a beta
> release and as such you should expect that there will be bugs and that
> third party kernel modules that do not adhere to the KABI functionality
> might have compatibility issues.

The -830 kernel can break 3rd-party modules using non-whitelisted ABIs,
whether or not they adhere to the "KABI functionality".

> There was a compatibility issue with RHEL 7.4 kernel
> (3.10.0-693.1.1.el7) as well that was only fixed in the OpenAFS 1.6
> release series this past week as part of 1.6.22.2:
> 
>  http://www.openafs.org/dl/openafs/1.6.22.2/RELNOTES-1.6.22.2

Yes, and this one was hard to fix. Thanks are due to Mark Vitale for
developing the fix and all those who reviewed and tested it.

> Jeffrey Altman
> AuriStor, Inc.
> 
> P.S. - Welcome to the community.

Seconded. In particular, the problem report regarding the EL7.5beta
kernel was absolutely appropriate.

-- 
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany