[OpenAFS] Re: RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

Kodiak Firesmith kfiresmith@gmail.com
Fri, 2 Feb 2018 16:20:59 -0500


Not much else to report today. I expanded my test base to a few more
RHEL 7.5 beta hosts and re-rolled the 1.6.22.1-1 SRPM again, and I am still
seeing the same result everywhere: every host fails to boot due to a kernel
panic when it tries to load the openafs DKMS kernel module.

My next move on Monday will be to try an actual kernel-specific kmod
instead of DKMS.  If that works I'll be kind of sad since we've had great
luck with DKMS until now.

 - Kodiak

On Thu, Feb 1, 2018 at 3:26 PM, Kodiak Firesmith <kfiresmith@gmail.com>
wrote:

> I just rebuilt off-the-shelf RPMs based off of
> http://www.openafs.org/dl/openafs/1.6.22.1/openafs-1.6.22.1-1.src.rpm
> thinking maybe we had some historical patch in our build area that might
> be causing the problem, but alas, even the off-the-shelf RPMs cause a full
> wedge and reboot when openafs-client.service starts up.
>
>  - Kodiak
>
> On Thu, Feb 1, 2018 at 1:23 PM, Kodiak Firesmith <kfiresmith@gmail.com>
> wrote:
>
>> Hello Rich!
>> It's a Dell Optiplex 7020 with an Intel i7-4790.
>>
>> Thanks!
>>  - Kodiak
>>
>> On Thu, Feb 1, 2018 at 1:20 PM, Rich Sudlow <rich@nd.edu> wrote:
>>
>>> On 01/31/2018 09:43 AM, Kodiak Firesmith wrote:
>>>
>>>> https://photos.app.goo.gl/WgPsSUCLK5ojxIuH3
>>>>
>>>
>>> Greetings
>>>
>>> What processor, etc. is in this machine?
>>>
>>> Rich
>>>
>>>
>>>
>>>>
>>>> On Wed, Jan 31, 2018 at 9:41 AM, Kodiak Firesmith <kfiresmith@gmail.com>
>>>> wrote:
>>>>
>>>>     Folks, re-sending this because the first try never hit the list -
>>>>     perhaps mail with attachments is silently dropped or held for manual
>>>>     moderation?  I'd originally attached an image of the stack trace.
>>>>     I'll host it and reply to this with a URL link in case that would
>>>>     also result in a drop or moderation.
>>>>
>>>>
>>>>
>>>>     Anyhow:
>>>>
>>>>     In testing the new RHEL 7.5 beta, we've discovered that hosts using
>>>>     AFS fail to boot after the upgrade, with OpenAFS 1.6.22.1 installed.
>>>>
>>>>     We are wondering if some of the non-guaranteed kernel ABIs that
>>>>     OpenAFS uses might have changed with the latest kernel provided in
>>>>     RHEL 7.
>>>>
>>>>     I've attached a picture of the trace.
>>>>
>>>>     Anyone else kicking the tires on the new RHEL yet?
>>>>
>>>>     Thanks!
>>>>
>>>>
>>>>
>>>
>>> --
>>> Rich Sudlow
>>> University of Notre Dame
>>> Center for Research Computing - Union Station
>>> 506 W. South St
>>> South Bend, In 46601
>>>
>>> (574) 631-7258 (office)
>>> (574) 807-1046 (cell)
>>>
>>
>>
>
