[OpenAFS] getcwd() error for RHEL 7.4 kernel

Matt Vander Werf mvanderw@nd.edu
Thu, 19 Oct 2017 09:18:56 -0400



Hi Ben,

What do you mean by an OpenAFS config.log? Where would this file be located:
on the client or on the AFS file server? Or does something need to be done
to generate it?

Thanks.

--
Matt Vander Werf
HPC System Administrator
University of Notre Dame
Center for Research Computing - Union Station
506 W. South Street
South Bend, IN 46601

On Wed, Oct 18, 2017 at 7:21 PM, Benjamin Kaduk <kaduk@mit.edu> wrote:

> On Tue, Oct 17, 2017 at 11:55:27AM -0400, Jacob Bonek wrote:
> > Hello,
> >
> > We're having some strange issues with OpenAFS lately.
> >
> > It started after installing the base RHEL 7.4 kernel,
> > 3.10.0-693.el7.x86_64 back in August, with the latest version of OpenAFS
> > client at the time, 1.6.21. We've tried using the now latest version,
> > 1.6.21.1, and still have the same issues. This happens with all the
> > subsequent RHEL 7.4 kernels as well, including the latest kernel,
> > 3.10.0-693.2.2.el7.x86_64.
> >
> [...]
> >
> > This is a major issue that has caused us to have to stay at the latest
> > pre-RHEL 7.4 kernel for a long time now while this issue has existed.
> > This may be related to previous issues with getcwd() but something in
> > the RHEL 7.4 kernel seems to have made it much worse. Simply rebooting a
> > system does not fix it, nor does clearing the AFS cache.
> >
> > Has anyone else experienced this issue with RHEL 7.4? Is there anything
> > that we can do to narrow down what is causing this?
>
> I think we've seen another report or two, but it's always been hard to
> reproduce.  That said, with the specifics you've offered about the kernel
> version that introduced the issue, we've got a couple folks trying to
> reproduce in a controlled environment.
>
> In the meantime, could you post an (openafs) config.log from one of the
> affected systems?  It's pretty long, so maybe as an attachment for
> mail to openafs-bugs@openafs.org is best.
>
> Thanks,
>
> Ben
> OpenAFS Guardian
>
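
For anyone trying to narrow down where getcwd() fails, a small check like the
following may help. It is only a sketch and not something posted in this
thread: the helper name check_getcwd is hypothetical, and it assumes a Linux
client with Python 3 available. It changes into a directory, calls getcwd(),
and reports the errno name if the call fails.

```python
import errno
import os


def check_getcwd(path):
    """chdir into `path`, call getcwd(), and report whether it succeeded.

    Returns (True, cwd) on success or (False, errno_name) on failure.
    The original working directory is restored afterwards.
    """
    # Keep a file descriptor for the starting directory so we can return
    # to it with fchdir(), which does not depend on getcwd() working.
    saved = os.open(".", os.O_RDONLY)
    try:
        os.chdir(path)
        try:
            return True, os.getcwd()
        except OSError as exc:
            # Map the numeric errno to its symbolic name, e.g. "ENOENT".
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.fchdir(saved)
        os.close(saved)
```

Running it against a directory under /afs and against a local directory on
the same machine would show whether the failure is specific to AFS paths.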
