[OpenAFS] Re: RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up

Gary Gatling gsgatlin@ncsu.edu
Thu, 1 Feb 2018 17:00:14 -0500



I tried testing a work-in-progress 1.6.22.2 on the RHEL 7.5 beta by doing:

git clone git://git.openafs.org/openafs.git
cd openafs
git checkout remotes/origin/openafs-stable-1_6_x
HEAD is now at d25c8e8... Make OpenAFS 1.6.22.2


But it shows the same problem with directories, so I guess further changes
will be needed before it works on the RHEL 7.5 kernel. I'm not a kernel
hacker, so I'll wait to see what you guys come up with. :)
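
In case anyone wants to check for the same symptom on their own test box,
here is a minimal sketch of how I'd probe it. It is my own throwaway helper,
not anything from OpenAFS; the names probe_dir and describe are made up for
illustration. The idea is to compare what stat() says about a path with what
happens when you actually try to list it, since "ls" reports "reading
directory" when the read itself fails rather than the open.

```python
import errno
import os
import stat


def probe_dir(path):
    """Return (stat_says_dir, listing_errno) for path.

    listing_errno is None when os.listdir() succeeds, otherwise the errno
    it failed with (errno.ENOTDIR corresponds to "Not a directory").
    """
    stat_says_dir = stat.S_ISDIR(os.stat(path).st_mode)
    try:
        os.listdir(path)
    except OSError as exc:
        return stat_says_dir, exc.errno
    return stat_says_dir, None


def describe(path):
    """Summarize the probe result in one line (hypothetical helper)."""
    is_dir, err = probe_dir(path)
    if is_dir and err == errno.ENOTDIR:
        return "%s: stat() says directory, but listing fails with ENOTDIR" % path
    if err is not None:
        return "%s: listing failed: %s" % (path, os.strerror(err))
    return "%s: listing works" % path


if __name__ == "__main__":
    # /afs is the mount point from this thread; adjust as needed.
    print(describe("/afs"))
```

If a path comes back as a directory according to stat() and the listing
still fails with ENOTDIR, that would match what I am seeing here.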

Thanks,

On Thu, Feb 1, 2018 at 11:11 AM, Stephan Wiesand <stephan.wiesand@desy.de>
wrote:

> Comparing the 1.6.22.2 module builds from the SL packaging, where the kABI
> hashes of the used symbols are stored as a requirement, it seems none of
> those hashes changed between -693 and -830.
>
> There are two differences in the configure results:
>
> -ac_cv_linux_header_sched_signal_h=no
> +ac_cv_linux_header_sched_signal_h=yes
>
> -ac_cv_linux_struct_file_operations_has_iterate=no
> +ac_cv_linux_struct_file_operations_has_iterate=yes
>
> And there's quite a bit of churn in include/linux/fs.h (and some in key.h).
>
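
A comparison like the one above can be repeated against any two builds with
a small script. This is only a sketch under my own assumptions: the helper
names cache_vars and diff_cache are invented, and it assumes you have the
configure output (for example config.log) from both builds as text. It just
collects the ac_cv_* assignments and reports the ones whose values differ.
The second difference concerns the file_operations member used for reading
directories, so it seems worth a closer look, but that is a guess on my part.

```python
import re

# Matches cache-variable assignments such as "ac_cv_foo=yes".
_CACHE_LINE = re.compile(r"^(ac_cv_\w+)=(.*)$")


def cache_vars(text):
    """Collect ac_cv_* assignments from configure output text."""
    found = {}
    for line in text.splitlines():
        match = _CACHE_LINE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


def diff_cache(old_text, new_text):
    """Map each variable whose value differs to an (old, new) pair."""
    old, new = cache_vars(old_text), cache_vars(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    # Values taken from the message quoted above.
    older = (
        "ac_cv_linux_header_sched_signal_h=no\n"
        "ac_cv_linux_struct_file_operations_has_iterate=no\n"
    )
    newer = (
        "ac_cv_linux_header_sched_signal_h=yes\n"
        "ac_cv_linux_struct_file_operations_has_iterate=yes\n"
    )
    for name, (before, after) in diff_cache(older, newer).items():
        print("%s: %s -> %s" % (name, before, after))
```
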
> > On 1. Feb 2018, at 16:58, Gary Gatling <gsgatlin@ncsu.edu> wrote:
> >
> > Ok. This gets weirder. Any directory under /afs says Not a directory.
> > But I can read files like
> >
> > /afs/eos.ncsu.edu/software/inventory/software_inventory
> >
> > just fine.
> >
> > On Thu, Feb 1, 2018 at 10:55 AM, Gary Gatling <gsgatlin@ncsu.edu> wrote:
> > I don't get a kernel panic but instead I get:
> >
> > [gsgatlin@localhost ~]$ ls /afs/
> > ls: reading directory /afs/: Not a directory
> > [gsgatlin@localhost ~]$
> >
> >
> > which is pretty weird. I don't see anything in the syslog about problems
> > with OpenAFS:
> >
> > Feb  1 10:44:24 localhost systemd: Starting OpenAFS Client Service...
> > Feb  1 10:44:24 localhost kernel: libafs: loading out-of-tree module taints kernel.
> > Feb  1 10:44:24 localhost kernel: libafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
> > Feb  1 10:44:24 localhost kernel: Disabling lock debugging due to kernel taint
> > Feb  1 10:44:24 localhost kernel: libafs: module verification failed: signature and/or required key missing - tainting kernel
> > Feb  1 10:44:24 localhost kernel: Key type afs_pag registered
> > Feb  1 10:44:24 localhost kernel: enabling dynamically allocated vcaches
> > Feb  1 10:44:24 localhost kernel: Starting AFS cache scan...Memory cache: Allocating 1600 dcache entries...found 0 non-empty cache files (0%).
> > Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
> > Feb  1 10:44:24 localhost afsd: afsd: All AFS daemons started.
> > Feb  1 10:44:24 localhost systemd: Started OpenAFS Client Service.
> >
> > I am using openafs-1.6.22
> >
> >
> > with
> >
> > correct-m4-conditionals-in-curses.m4.patch
> > linux-test-for-vfswrite-rather-than-vfsread.patch
> > linux-use-kernelread-kernelwrite-when-vfs-varian.patch
> >
> > from the Arch Linux distro in my RPM packages.
> >
> > Anyone know what
> >
> > ls: reading directory /afs/: Not a directory
> >
> > means and is there some way around it?
> >
> > Also, is 1.6.22.2 coming out soon?
> >
> > Thanks so much,
> >
> > On Wed, Jan 31, 2018 at 9:43 AM, Kodiak Firesmith <kfiresmith@gmail.com>
> > wrote:
> > https://photos.app.goo.gl/WgPsSUCLK5ojxIuH3
> >
> >
> > On Wed, Jan 31, 2018 at 9:41 AM, Kodiak Firesmith <kfiresmith@gmail.com>
> > wrote:
> > Folks, re-sending this because the first try never hit the list -
> > perhaps mail with attachments is silently dropped or held for manual
> > moderation?  I'd originally attached an image of the stack trace.  I'll
> > host it and reply to this with a URL in case that would also result
> > in a drop or moderation.
> >
> >
> >
> > Anyhow:
> >
> > In testing the new RHEL 7.5 beta, we've discovered that hosts using AFS
> > fail to boot after the upgrade, with OpenAFS 1.6.22.1 installed.
> >
> > We are wondering if some of the non-guaranteed kernel ABIs that OpenAFS
> > uses might have changed with the latest kernel provided in RHEL 7.
> >
> > I've attached a picture of the trace.
> >
> > Anyone else kicking the tires on the new RHEL yet?
> >
> > Thanks!
> >
> >
> >
> >
>
> --
> Stephan Wiesand
> DESY -DV-
> Platanenallee 6
> 15738 Zeuthen, Germany
>
>
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
