[OpenAFS] Re: RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

Matt Vander Werf mvanderw@nd.edu
Thu, 1 Feb 2018 11:13:31 -0500


I'm also seeing the same issue as Gary on some RHEL 7.5 beta boxes running
OpenAFS 1.6.22.1. I can't run ls in any /afs/.../.../etc directory,
including my AFS home directory when logged in as myself.

[mvanderw@<host> ~]$ ls
ls: reading directory .: Not a directory
[mvanderw@<host> ~]$ ls ~
ls: reading directory /afs/crc.nd.edu/user/m/mvanderw: Not a directory

[mvanderw@<host> ~]$ ls /afs/
ls: reading directory /afs/: Not a directory
[mvanderw@<host> ~]$ ls /afs/crc.nd.edu
ls: reading directory /afs/crc.nd.edu: Not a directory

But no kernel panics here either.
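
For anyone who wants to check whether they are hitting the same thing, here
is a minimal sketch (not part of any OpenAFS tooling; the function name
probe_dir is made up for illustration). It compares what stat reports for a
path with what happens when the directory is actually listed. The symptom
above would presumably show up as a path that stat says is a directory but
whose listing fails with ENOTDIR.

```python
import errno
import os


def probe_dir(path):
    """Return (stat_says_dir, listing_error) for path.

    stat_says_dir is True when stat() reports a directory.
    listing_error is None when the directory listing succeeds, otherwise
    the symbolic errno name (for example "ENOTDIR").
    """
    stat_says_dir = os.path.isdir(path)
    try:
        os.listdir(path)
        listing_error = None
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one exists.
        listing_error = errno.errorcode.get(exc.errno, str(exc.errno))
    return stat_says_dir, listing_error
```

Calling probe_dir("/afs") on an affected client and on a working one should
make the difference obvious; a healthy directory gives (True, None).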

@Kodiak: Is it possible you were running a kmod-openafs from an older
kernel? I compiled a new kmod-openafs RPM on a RHEL 7.5 beta system and it
works well.
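
One rough way to check for a stale module is to compare the module's
vermagic string with the running kernel release. The sketch below assumes
the module is named libafs (taken from the log prefix in Gary's message; it
may differ on other builds) and that modinfo is on the PATH. The helper
names are hypothetical. Note that a mismatch is not necessarily an error on
RHEL, since kmod packages can be carried across kernels via weak-updates.

```python
import platform
import subprocess


def vermagic_matches(vermagic, running_release):
    """True if the first field of a vermagic string equals the kernel release."""
    fields = vermagic.split()
    return bool(fields) and fields[0] == running_release


def module_vermagic(module="libafs"):
    """Ask modinfo for the module's vermagic string (module name is an assumption)."""
    out = subprocess.check_output(["modinfo", "-F", "vermagic", module])
    return out.decode().strip()


def check_module(module="libafs"):
    """Compare the installed module's vermagic with the running kernel."""
    return vermagic_matches(module_vermagic(module), platform.release())
```

Running check_module() on the affected host returns False when the module
was built against a different kernel release than the one currently booted.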

I compiled all the OpenAFS packages from the source RPM on the RHEL 7.5
beta system itself and didn't run into any issues with the compile.

Apart from this, AFS seems to be running correctly, with nothing in the logs
indicating any problems (as Gary mentioned).

Any idea what might be causing this? Could it be a semantic change in the
kernel, like the one behind the getcwd issue in RHEL 7.4?

Thanks.

--
Matt Vander Werf
HPC System Administrator
University of Notre Dame
Center for Research Computing - Union Station
506 W. South Street
South Bend, IN 46601
Phone: (574) 631-0692

On Thu, Feb 1, 2018 at 10:58 AM, Gary Gatling <gsgatlin@ncsu.edu> wrote:

> Ok. This gets weirder. Any directory under /afs says Not a directory. But
> I can read files like
>
> /afs/eos.ncsu.edu/software/inventory/software_inventory
>
> just fine.
>
> On Thu, Feb 1, 2018 at 10:55 AM, Gary Gatling <gsgatlin@ncsu.edu> wrote:
>
>> I don't get a kernel panic but instead I get:
>>
>> [gsgatlin@localhost ~]$ ls /afs/
>> ls: reading directory /afs/: Not a directory
>> [gsgatlin@localhost ~]$
>>
>>
>> which is pretty weird. I don't see anything in the syslog about problems
>> with openafs
>>
>> Feb  1 10:44:24 localhost systemd: Starting OpenAFS Client Service...
>> Feb  1 10:44:24 localhost kernel: libafs: loading out-of-tree module taints kernel.
>> Feb  1 10:44:24 localhost kernel: libafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
>> Feb  1 10:44:24 localhost kernel: Disabling lock debugging due to kernel taint
>> Feb  1 10:44:24 localhost kernel: libafs: module verification failed: signature and/or required key missing - tainting kernel
>> Feb  1 10:44:24 localhost kernel: Key type afs_pag registered
>> Feb  1 10:44:24 localhost kernel: enabling dynamically allocated vcaches
>> Feb  1 10:44:24 localhost kernel: Starting AFS cache scan...Memory cache: Allocating 1600 dcache entries...found 0 non-empty cache files (0%).
>> Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
>> Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
>> Feb  1 10:44:24 localhost systemd: Started OpenAFS Client Service.
>>
>> I am using openafs-1.6.22
>>
>>
>> with
>>
>> correct-m4-conditionals-in-curses.m4.patch
>> linux-test-for-vfswrite-rather-than-vfsread.patch
>> linux-use-kernelread-kernelwrite-when-vfs-varian.patch
>>
>> from the arch linux distro in my rpm packages.
>>
>> Anyone know what
>>
>> ls: reading directory /afs/: Not a directory
>>
>> means and is there some way around it?
>>
>> Also, is 1.6.22.2 coming out soon?
>>
>> Thanks so much,
>>
>> On Wed, Jan 31, 2018 at 9:43 AM, Kodiak Firesmith <kfiresmith@gmail.com>
>> wrote:
>>
>>> https://photos.app.goo.gl/WgPsSUCLK5ojxIuH3
>>>
>>>
>>> On Wed, Jan 31, 2018 at 9:41 AM, Kodiak Firesmith <kfiresmith@gmail.com>
>>> wrote:
>>>
>>>> Folks, re-sending this because the first try never hit the list -
>>>> perhaps mail with attachments is silently dropped or held for manual
>>>> moderation?  I'd originally attached an image of the stack trace.  I'll
>>>> host it and reply to this with a URL, in case a link would also result
>>>> in a drop or moderation.
>>>>
>>>>
>>>>
>>>> Anyhow:
>>>>
>>>> In testing the new RHEL 7.5 beta, we've discovered that hosts using AFS
>>>> fail to boot after the upgrade, with OpenAFS 1.6.22.1 installed.
>>>>
>>>> We are wondering if some of the non-guaranteed kernel ABIs that OpenAFS
>>>> uses might have changed with the latest kernel provided in RHEL 7.
>>>>
>>>> I've attached a picture of the trace.
>>>>
>>>> Anyone else kicking the tires on the new RHEL yet?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>
>>
>
