[OpenAFS] Re: volume offline due to too low uniquifier (and salvage cannot fix it)
Derrick Brashear
shadow@gmail.com
Tue, 16 Apr 2013 13:34:18 -0400
The problem he's having (apparently I replied without replying all) is that
his uniquifiers are wrapping, and the volume package currently deals with
that poorly: a uniquifier ~1300 short of maxuint, plus 2000, plus 1, wraps
around to a number "less than the max uniquifier".
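The arithmetic above can be sketched in a few lines of C. This is only an illustration, assuming the uniquifier is an unsigned 32-bit counter; the function names are hypothetical and are not taken from the OpenAFS volume package.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not OpenAFS code: advance a 32-bit uniquifier
 * by `bump` plus 1. Unsigned arithmetic is modulo 2^32, so the result
 * silently wraps once it passes UINT32_MAX. */
uint32_t bump_uniquifier(uint32_t current, uint32_t bump)
{
    return (uint32_t)(current + bump + 1u);
}

/* Returns 1 if the new value would fail a "less than the max
 * uniquifier" style check against the largest value in use. */
int looks_too_low(uint32_t new_value, uint32_t max_in_use)
{
    return new_value < max_in_use;
}

/* Self-check using the rough numbers from the message. */
void wrap_demo(void)
{
    uint32_t current = UINT32_MAX - 1300u;   /* ~1300 from maxuint */
    uint32_t next = bump_uniquifier(current, 2000u);

    assert(next == 700u);                    /* wrapped to a small number */
    assert(looks_too_low(next, current));    /* so the check trips */
}
```

With these assumed inputs the counter lands at 700, the same order of magnitude as the uniquifier = 638 shown in the quoted volinfo output.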
We need to decide whether OpenAFS should
1) compact the uniquifier space via the salvager (re-uniquifying all
outstanding vnodes save 1.1, presumably),
or
2) simply allow the uniquifier to wrap, removing the check for "less than
the max" but ensuring we skip 0 and 1. There will be no direct collisions,
as no vnode can exist twice.
Either way, there is a slight chance that a (vnode, unique) tuple which
previously existed may exist again.
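Option 2 could look roughly like the following. It is a minimal sketch under the same assumption of an unsigned 32-bit counter; next_uniquifier is a hypothetical name, and this is not how the volume package is actually written.

```c
#include <stdint.h>

/* Hypothetical sketch of option 2, not the OpenAFS implementation:
 * hand out the next uniquifier, letting the counter wrap but never
 * returning 0 or 1. */
uint32_t next_uniquifier(uint32_t current)
{
    uint32_t next = (uint32_t)(current + 1u);  /* wraps to 0 after UINT32_MAX */

    if (next < 2u)
        next = 2u;                             /* skip 0 and 1 */
    return next;
}
```

Nothing here prevents a (vnode, unique) pair from recurring after a wrap; it only removes the ordering check and keeps 0 and 1 out of circulation.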
On Tue, Apr 16, 2013 at 1:09 PM, Andrew Deason <adeason@sinenomine.net> wrote:
> On Tue, 16 Apr 2013 10:57:13 +0000
> Jakub Moscicki <Jakub.Moscicki@cern.ch> wrote:
>
> > The salvage subsequently says that it fixed it but in reality nothing
> > changes. Here is the salvage output:
>
> "nothing changes" as in, the volinfo output doesn't change at all? Or
> it's just that the problem doesn't go away?
>
> > ======
> > >>>Tue Apr 16 11:34:06 2013: /usr/afs/bin/volinfo -part /vicepac -vol 1934053454 -fixheader
>
> I wouldn't run with -fixheader unless you have a reason for it. You
> don't want volinfo rewriting metadata while other procs are running (it
> doesn't check out volumes with fssync or anything). Not that it did
> anything here, but just saying.
>
> [...]
> > type = 0 (read/write), uniquifier = 638, needsCallback = 0, destroyMe = 0
> [...]
> > >>>Tue Apr 16 11:34:07 2013: /usr/afs/bin/salvager /vicepac 1934053454 -showlog -orphans remove
>
> What is the largest uniquifier in the volume? If you run volinfo with
> -vnode, it will dump the vnode indices. The uniquifier is the third
> number you see, for example:
>
> Small vnodes(files, symbolic links)
> 0 Vnode 2.607.1 cloned: 0, length: 21 linkCount: 1 parent: 1
>
> That's index file offset 0, vnode 2, uniq 607, DV 1.
>
>
> And if I recall correctly, you're running with modifications to the
> small vnode magic, but it looks like you're not running with
> BITMAP_LATER? (Those touch code near where that fileserver error
> occurs.)
>
> --
> Andrew Deason
> adeason@sinenomine.net
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
>
--
Derrick