[OpenAFS] Re: RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

Kodiak Firesmith kfiresmith@gmail.com
Thu, 1 Feb 2018 11:21:59 -0500


--f403045c22c4c8afe4056428fe08
Content-Type: text/plain; charset="UTF-8"

Thanks for the replies!

We're using DKMS and expected the dynamic rebuild of the kmods to work as
it does for any other kernel upgrade, but that doesn't seem to be the case.
I need to dig deeper, especially now that there is evidence that the boot
failure is just our site.
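
One way to check whether a rebuilt module actually belongs to the running
kernel is to compare the first field of the module's vermagic string with
the kernel release. The sketch below is only an illustration: the function
names are made up for this example, and the module name is passed in as an
argument because it depends on the packaging (the syslog quoted below shows
libafs).

```shell
#!/bin/sh
# Sketch: does a module's vermagic match a given kernel release?
# The first whitespace-separated field of vermagic is the kernel
# release the module was built against.
vermagic_matches() {
    # $1: vermagic string, e.g. output of `modinfo -F vermagic <module>`
    # $2: kernel release, e.g. output of `uname -r`
    built_for=${1%% *}
    [ "$built_for" = "$2" ]
}

# Hypothetical helper: look up the module for the running kernel and
# report whether it was built against that kernel.
check_module() {
    mod=$1
    running=$(uname -r)
    magic=$(modinfo -F vermagic "$mod" 2>/dev/null) || {
        echo "$mod: modinfo found no such module for $running"
        return 2
    }
    if vermagic_matches "$magic" "$running"; then
        echo "$mod: built for running kernel $running"
    else
        echo "$mod: built for ${magic%% *}, running $running"
        return 1
    fi
}
```

Running `check_module libafs` (or whatever name the module is installed
under) after a kernel upgrade, together with `dkms status`, should show
whether DKMS really rebuilt the module for the new kernel.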

Thanks a bunch everyone.

 - Kodiak

On Thu, Feb 1, 2018 at 11:13 AM, Matt Vander Werf <mvanderw@nd.edu> wrote:

> I'm also seeing the same issue as Gary on some RHEL 7.5 beta boxes running
> OpenAFS 1.6.22.1. Can't run ls under any /afs/.../.../etc directory,
> including in my AFS home directory when logged in as myself.
>
> [mvanderw@<host> ~]$ ls
> ls: reading directory .: Not a directory
> [mvanderw@<host> ~]$ ls ~
> ls: reading directory /afs/crc.nd.edu/user/m/mvanderw: Not a directory
>
> [mvanderw@<host> ~]$ ls /afs/
> ls: reading directory /afs/: Not a directory
> [mvanderw@<host> ~]$ ls /afs/crc.nd.edu
> ls: reading directory /afs/crc.nd.edu: Not a directory
>
> But no kernel panics here either.
>
> @Kodiak: Is it possible you were running a kmod-openafs from an older
> kernel? I compiled a new kmod-openafs RPM on a RHEL 7.5 beta system and it
> works well.
>
> I compiled all the OpenAFS packages from the source RPM on the RHEL 7.5
> beta system itself and didn't run into any issues with the compile.
>
> Besides this, AFS seems to be running correctly with nothing in the logs
> indicating any problems (like Gary mentioned).
>
> Any idea what might be causing this? Some semantic changes like with the
> getcwd issue in RHEL 7.4?
>
> Thanks.
>
> --
> Matt Vander Werf
> HPC System Administrator
> University of Notre Dame
> Center for Research Computing - Union Station
> 506 W. South Street
> South Bend, IN 46601
> Phone: (574) 631-0692
>
> On Thu, Feb 1, 2018 at 10:58 AM, Gary Gatling <gsgatlin@ncsu.edu> wrote:
>
>> Ok. This gets weirder. Any directory under /afs says Not a directory. But
>> I can read files like
>>
>> /afs/eos.ncsu.edu/software/inventory/software_inventory
>>
>> just fine.
>>
>> On Thu, Feb 1, 2018 at 10:55 AM, Gary Gatling <gsgatlin@ncsu.edu> wrote:
>>
>>> I don't get a kernel panic but instead I get:
>>>
>>> [gsgatlin@localhost ~]$ ls /afs/
>>> ls: reading directory /afs/: Not a directory
>>> [gsgatlin@localhost ~]$
>>>
>>>
>>> which is pretty weird. I don't see anything in the syslog about problems
>>> with OpenAFS:
>>>
>>> Feb  1 10:44:24 localhost systemd: Starting OpenAFS Client Service...
>>> Feb  1 10:44:24 localhost kernel: libafs: loading out-of-tree module taints kernel.
>>> Feb  1 10:44:24 localhost kernel: libafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
>>> Feb  1 10:44:24 localhost kernel: Disabling lock debugging due to kernel taint
>>> Feb  1 10:44:24 localhost kernel: libafs: module verification failed: signature and/or required key missing - tainting kernel
>>> Feb  1 10:44:24 localhost kernel: Key type afs_pag registered
>>> Feb  1 10:44:24 localhost kernel: enabling dynamically allocated vcaches
>>> Feb  1 10:44:24 localhost kernel: Starting AFS cache scan...Memory cache: Allocating 1600 dcache entries...found 0 non-empty cache files (0%).
>>> Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
>>> Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
>>> Feb  1 10:44:24 localhost systemd: Started OpenAFS Client Service.
>>>
>>> I am using openafs-1.6.22
>>>
>>>
>>> with
>>>
>>> correct-m4-conditionals-in-curses.m4.patch
>>> linux-test-for-vfswrite-rather-than-vfsread.patch
>>> linux-use-kernelread-kernelwrite-when-vfs-varian.patch
>>>
>>> from the Arch Linux distro in my RPM packages.
>>>
>>> Anyone know what
>>>
>>> ls: reading directory /afs/: Not a directory
>>>
>>> means and is there some way around it?
>>>
>>> Also, is 1.6.22.2 coming out soon?
>>>
>>> Thanks so much,
>>>
>>> On Wed, Jan 31, 2018 at 9:43 AM, Kodiak Firesmith <kfiresmith@gmail.com> wrote:
>>>
>>>> https://photos.app.goo.gl/WgPsSUCLK5ojxIuH3
>>>>
>>>>
>>>> On Wed, Jan 31, 2018 at 9:41 AM, Kodiak Firesmith <kfiresmith@gmail.com> wrote:
>>>>
>>>>> Folks, re-sending this because the first try never hit the list.
>>>>> Perhaps mail with attachments is silently dropped or held for manual
>>>>> moderation?  I'd originally attached an image of the stack trace.  I'll
>>>>> host it and reply to this with a URL in case the attachment would again
>>>>> result in a drop or moderation.
>>>>>
>>>>>
>>>>>
>>>>> Anyhow:
>>>>>
>>>>> In testing the new RHEL 7.5 beta, we've discovered that hosts using
>>>>> AFS fail to boot after the upgrade, with OpenAFS 1.6.22.1 installed.
>>>>>
>>>>> We are wondering if some of the non-guaranteed kernel ABIs that
>>>>> OpenAFS uses might have changed with the latest kernel provided in RHEL 7.
>>>>>
>>>>> I've attached a picture of the trace.
>>>>>
>>>>> Anyone else kicking the tires on the new RHEL yet?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>
>>>
>>
>

--f403045c22c4c8afe4056428fe08--