[OpenAFS] Re: volume offline due to too low uniquifier (and salvage cannot fix it)

Derrick Brashear shadow@gmail.com
Tue, 16 Apr 2013 14:27:12 -0400


On Tue, Apr 16, 2013 at 2:07 PM, Andrew Deason <adeason@sinenomine.net> wrote:

> On Tue, 16 Apr 2013 13:34:18 -0400
> Derrick Brashear <shadow@gmail.com> wrote:
>
> > The problem he's having (apparently I replied without replying to
> > all) is that he's wrapping uniquifiers, and the volume package
> > currently deals with that poorly: a counter ~1300 below the maximum
> > uint32, plus 2000 plus 1, wraps around to a number "less than the
> > max uniquifier"
>
> Okay; yeah, makes sense.
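
To make the arithmetic concrete: the ~1300 and 2000 figures are the ones
from the report above; everything else in this standalone sketch is
illustrative.

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        /* uniquifier counter sitting ~1300 below the 32-bit maximum,
         * as in the volume that went offline */
        uint32_t maxUniq = UINT32_MAX - 1300;

        /* the allocation path bumps the counter by 2000, then by 1 */
        uint32_t next = maxUniq + 2000u + 1u;  /* wraps modulo 2^32 */

        printf("maxUniq = %" PRIu32 "\n", maxUniq);  /* 4294965995 */
        printf("next    = %" PRIu32 "\n", next);     /* 700 */
        return 0;
    }
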
>
> > We need to decide whether OpenAFS should
> > 1) compact the uniquifier space via the salvager (re-uniquifying all
> > outstanding vnodes save 1.1, presumably).
> > or
> > 2) simply allow the uniquifier to wrap, removing the check for "less
> > than the max" but ensuring we skip 0 and 1. There will be no direct
> > collisions, as no vnode can exist twice.
> >
> > Either way, there is a slight chance that a (vnode, unique) tuple
> > which previously existed may exist again.
>
> Yes, but that is inevitable unless we keep track of uniqs per-vnode or
> something. If we do option (1), I feel like that makes the possible
> collisions more infrequent in a way, since the event triggering the
> collisions is a salvage, which has 'undefined' new contents for caching
> purposes anyway. In option (2) you can have a collision by just removing
> a file and creating one. Maybe those aren't _so_ different, but that's
> my impression.
>

It's pretty easy to avoid the condition you mention in option 2, but it
does mean additional "consumption" of the uniq space: on a remove, make
sure the next uniq we'd allocate is not close to our current value,
potentially by using a large increment if we are close. But I'm not sure
it's worth that.
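
Something like the following sketch; the function name and the size of
the gap are made up for illustration, not taken from the actual volume
package code:

    #include <stdint.h>

    #define UNIQ_GAP 2000u  /* hypothetical safety distance */

    /* "Large increment" idea: when allocating after a remove, if the
     * counter has wrapped back near the uniq we just freed, burn a
     * chunk of the uniq space to jump past it. */
    static uint32_t
    next_uniq(uint32_t counter, uint32_t freed_uniq)
    {
        uint32_t next = counter + 1;

        /* forward distance from the freed uniq, modulo 2^32, so the
         * comparison survives wraparound */
        if (next - freed_uniq < UNIQ_GAP)
            next = freed_uniq + UNIQ_GAP;

        /* skip the reserved values 0 and 1, per option 2 above */
        if (next == 0 || next == 1)
            next = 2;

        return next;
    }
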

>
> I feel like the fileserver could also maybe not increment the uniq
> counter so much, if we issue a lot of creates/mkdirs with no other
> interleaving operations. That is, if we create 3 files in a row, it's
> fine if they were given fids 1.2.9, 1.4.9, and 1.6.9, right? We would
> guarantee that we wouldn't collide on the whole fid (at least, no more
> so than now), just on the uniq, which is okay, right? That might help
> avoid this in some scenarios.
>

The same uniq on different vnodes should not be a problem.
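
A fid only collides when the whole (volume, vnode, uniquifier) triple
matches; roughly (field names are illustrative, not the exact AFSFid
layout):

    #include <stdint.h>

    struct fid { uint32_t volume, vnode, uniq; };

    static int
    fid_equal(struct fid a, struct fid b)
    {
        return a.volume == b.volume && a.vnode == b.vnode
            && a.uniq == b.uniq;
    }

    /* 1.2.9 and 1.4.9 share uniq 9 but differ in vnode, so
     * fid_equal() returns 0 and caches never confuse them. */
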


>
> And for kuba's sake, I guess the immediate workaround to get the volume
> online could be to just remove that check for the uniq. I would use that
> to just get the data online long enough to copy the data to another
> volume, effectively re-uniq-ifying them. I think I'd be uneasy with just
> not having the check in general, but I'd need to think about it...
>
>
I think there should be "some" check, but it's not apparent to me that
the current one is "right" so much as "simple".


-- 
Derrick
