[OpenAFS-devel] Re: The ubik transaction ID rollover problem

Andrew Deason adeason@sinenomine.net
Fri, 3 Sep 2010 13:37:43 -0500


On Mon, 30 Aug 2010 13:12:41 -0500
Andrew Deason <adeason@sinenomine.net> wrote:

> So, based on all of this, I propose to do three things:
> 
>  1. Set writeTidCounter = tidCounter on beginning write transactions
>     instead of incrementing writeTidCounter by 2.

After much discussion on Jabber, this is now gerrit 2647, if people
want to look.
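
For anyone who wants the difference spelled out, here is a rough toy
model of it. This is not the ubik code and not what is in the gerrit
change: the struct, the toy_* function names, and the assumption that
every new transaction advances tidCounter by 2 are all made up for
illustration.

```c
#include <stdint.h>

/* Toy stand-in for the per-database counters; NOT the real ubik struct. */
struct toy_dbase {
    int32_t tidCounter;       /* counter of the last transaction id handed out */
    int32_t writeTidCounter;  /* counter recorded for the latest write transaction */
};

/* Assumption for this sketch: every new transaction advances tidCounter by 2. */
static int32_t toy_next_tid(struct toy_dbase *db)
{
    db->tidCounter += 2;
    return db->tidCounter;
}

/* A read transaction only consumes a transaction id. */
static int32_t toy_begin_read(struct toy_dbase *db)
{
    return toy_next_tid(db);
}

/* Old behaviour: writeTidCounter is incremented by 2 on its own. */
static int32_t toy_begin_write_old(struct toy_dbase *db)
{
    int32_t tid = toy_next_tid(db);
    db->writeTidCounter += 2;
    return tid;
}

/* Proposed behaviour: writeTidCounter is set from tidCounter. */
static int32_t toy_begin_write_new(struct toy_dbase *db)
{
    int32_t tid = toy_next_tid(db);
    db->writeTidCounter = db->tidCounter;
    return tid;
}
```

In this model, two reads followed by one write leave the old variant
with writeTidCounter at 2 while the write itself got counter 6; the new
variant leaves both at 6.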

>  2. Change urecovery_CheckTid to strictly check if the given
>     transaction id is _equal_ to the current transaction id, not just
>     if it is equal or happened before the current transaction.

While this may be worthwhile to do everywhere except SVOTE_Beacon, I
think it needs more consideration first.

The check _might_ be lenient in order to handle the case where a beacon
is sent and a write transaction is then created: the beacon could carry
an older (smaller) transaction id, and a strict equality check would
abort the transaction. As far as I can tell, that would only need to be
a special case for SVOTE_Beacon processing, but it'll take more thought.
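
To make the beacon case concrete, here is a minimal sketch of the two
kinds of comparison. It is only an illustration, not urecovery_CheckTid
itself: toy_tid and both function names are hypothetical, and the epoch
and counter values in the note below are arbitrary.

```c
#include <stdint.h>

/* Toy transaction id: an epoch plus a counter within that epoch. */
struct toy_tid {
    int32_t epoch;
    int32_t counter;
};

/* Lenient check: accept if 'given' equals or happened before 'current'. */
static int toy_tid_equal_or_before(const struct toy_tid *given,
                                   const struct toy_tid *current)
{
    if (given->epoch != current->epoch)
        return given->epoch < current->epoch;
    return given->counter <= current->counter;
}

/* Strict check: accept only if 'given' is exactly 'current'. */
static int toy_tid_equal(const struct toy_tid *given,
                         const struct toy_tid *current)
{
    return given->epoch == current->epoch
        && given->counter == current->counter;
}
```

If a beacon was sent carrying {100, 4} and a write transaction {100, 6}
has started by the time it is processed, the lenient check accepts the
beacon and the strict check rejects it, which in the scenario above
would mean aborting a perfectly good transaction.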

>  3. To accommodate ubik sites without the change in "1.", change
>     SVOTE_Beacon to only check the epoch of the given transaction id,
>     and ignore the tid counter. Otherwise, we will abort valid
>     transactions due to other sites giving us a bogus transaction id
>     in VOTE_Beacon messages. This relaxes the supposed safety check
>     somewhat, but if other sites are sending bogus trans id counters,
>     we can't really check them.

After more discussion and thought, this may not be a good idea. The
problem is with the sender of the VOTE_Beacon messages, so we should
just fix the sending side and leave the SVOTE_Beacon trans id check as
it is.
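
As a rough illustration of what an epoch-only check would give up, here
is a toy comparison. Again, this is not the SVOTE_Beacon code: the
names are hypothetical, and treating "only check the epoch" as an
ordering test on the epoch is my assumption for the sketch.

```c
#include <stdint.h>

/* Toy transaction id: an epoch plus a counter within that epoch. */
struct toy_tid {
    int32_t epoch;
    int32_t counter;
};

/* Full check: epoch and counter must not be newer than 'current'. */
static int toy_beacon_check_full(const struct toy_tid *given,
                                 const struct toy_tid *current)
{
    if (given->epoch != current->epoch)
        return given->epoch < current->epoch;
    return given->counter <= current->counter;
}

/* Epoch-only check (assumed form): the counter is ignored entirely. */
static int toy_beacon_check_epoch_only(const struct toy_tid *given,
                                       const struct toy_tid *current)
{
    return given->epoch <= current->epoch;
}
```

With a current transaction of {100, 6}, a beacon carrying a bogus
counter such as {100, 999999} fails the full check but passes the
epoch-only one, so the counter no longer provides any protection.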

-- 
Andrew Deason
adeason@sinenomine.net