[OpenAFS-devel] linux45: smoke test failed

Joe Gorse jhgorse@gmail.com
Fri, 17 Jun 2016 17:39:03 -0400



Stephan

FWIW, I am able to reproduce the "cwd" message with the "git log" command on
Fedora 23 (kernel 4.5.6-200.fc23.x86_64). "git log" reports:

fatal: Unable to read current working directory: No such file or directory

Though it should read:

fatal: Not a git repository (or any of the parent directories): .git


However, I am not having any trouble with the git checkout. It consistently
works on Fedora 23, even "git checkout openafs-stable-1_6_18". Perhaps try
4.5.6 on Fedora?

I have, however, seen more of this issue on Debian 8 with kernel 4.6.0:
three of three tests failed to check out the openafs tree on this system. I
will test some other kernels on this system later and note anything
interesting.
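
For anyone repeating these runs, here is a minimal sketch of how the two
outcomes could be told apart automatically. The function name
classify_git_log_error is hypothetical, and the patterns are taken only from
the two messages quoted above (plus the lowercase "not a git repository"
wording that newer git releases print), so treat it as an illustration rather
than a tested harness:

```shell
#!/bin/sh
# Hypothetical helper: classify the stderr text that "git log" prints when it
# is run in a directory that is not a git repository.
#   cwd-bug  -> getcwd() failed, i.e. the problem discussed in this thread
#   expected -> git correctly reports that the cwd is not a repository
#   other    -> anything else
classify_git_log_error() {
    case "$1" in
        *"Unable to read current working directory"*) echo "cwd-bug" ;;
        *"Not a git repository"*|*"not a git repository"*) echo "expected" ;;
        *) echo "other" ;;
    esac
}

# Example use, assuming the shell sits in the non-repo directory:
#   msg=$(git log 2>&1 >/dev/null)
#   classify_git_log_error "$msg"
```

Counting the "cwd-bug" results over a number of runs would give a rough
failure rate per kernel.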

Cheers,
Joe

On Fri, Jun 17, 2016 at 11:30 AM, Stephan Wiesand <stephan.wiesand@desy.de>
wrote:

>
> On Jun 17, 2016, at 04:45 , Benjamin Kaduk wrote:
>
> > On Thu, 16 Jun 2016, Stephan Wiesand wrote:
> >
> >> I smoke tested what was planned to be OpenAFS 1.6.18.1, as discussed in
> >> yesterday's release team meeting, on a Fedora 23 x86_64 VM with kernel
> >> 4.5.6-200 today. The result was disappointing:
> >>
> >> git clone git://gerrit.openafs.org/openafs.git
> >
> > Is the pwd the root of a volume?
>
> No, everything happens at least one level below.
>
> >> cd openafs
> >> git log
> >> # scrolled through a few dozen changes, took a couple of seconds
> >> git checkout openafs-stable-1_6_18
> >>
> >> At this point I got the following error:
> >>
> >> fatal: Unable to read current working directory: No such file or directory
> >>
> >> A "cd; cd -" cures this for a while, and there's no apparent data
> >> corruption. I'm still worried. The problem isn't 100% reproducible, but it
> >> doesn't take too many tries checking out random tags or branches.
> >>
> >> This was plain 1.6.18 + gerrit 12300 12301 12302 12274.
> >>
> >> Cache is on ext4, no separate partition, default size as set by our RPM
> >> (I think 100MB, but I don't have access to the VM right now to check).
> >>
> >> The small cache size may contribute to the problem. But I found no
> >> errors logged anywhere, and this shouldn't happen no matter how small the
> >> cache is.
> >
> > Please check if the cmdebug output is empty (I expect it is, but it is
> > good to check).
>
> It is empty.
>
> >> NB we have a user report of exactly this problem happening frequently
> >> while just editing files in a local git repo in AFS space. The data is a
> >> bit sketchy, but it's probably Ubuntu 14.04 with its current default kernel
> >> and the openafs packages from Anders' ppa. I'll try to get us more data.
> >>
> >>
> >> Any thoughts? For the time being I'm considering this a showstopper for
> >> 1.6.18.1, and it looks like we're not quite there yet regarding Linux
> >> 4.5, let alone 4.6 or the 4.7 due in a few weeks :-(
> >
> > Can you run the same test on a 4.4 kernel for comparison?
>
> I tried under the last F22 kernel, 4.4.6-200.fc22. And ok, it's not 4.5
> specific, though it seems to happen more frequently with 4.5.2 than with
> 4.4.6.
>
> By chance I found a pretty reliable reproducer:
>
>         cd /vol/ume/root
>         mkdir g; cd g
>         git clone git://gerrit.openafs.org/openafs.git; sleep 180; git log
>
> Note indeed no "cd openafs". Of course this should complain about the cwd
> not being a git repo. But most of the time it will complain about the cwd
> issue instead.
>
> I'm planning to verify that plain 1.6.18 behaves the same on 4.4.6, and if
> it does I'll proceed with the 1.6.18.1 release.
>
> I couldn't reproduce this with any EL clients, but those have larger
> caches (it's indeed 100 MB on that Fedora VM), so there's more to test.
> Help welcome...
>
>
> _______________________________________________
> OpenAFS-devel mailing list
> OpenAFS-devel@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-devel
>



-- 
Joe Gorse

C: 440-552-0730
LI: Joe Gorse <http://www.linkedin.com/pub/joe-gorse/7/12/397>
