[OpenAFS-devel] Re: The ubik transaction ID rollover problem
Andrew Deason
adeason@sinenomine.net
Fri, 3 Sep 2010 13:37:43 -0500
On Mon, 30 Aug 2010 13:12:41 -0500
Andrew Deason <adeason@sinenomine.net> wrote:
> So, based on all of this, I propose to do three things:
>
> 1. Set writeTidCounter = tidCounter on beginning write transactions
> instead of incrementing writeTidCounter by 2.
After much discussion on Jabber, this is now gerrit 2647, if people
want to look.
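A minimal sketch of the difference item 1 makes, not the actual patch.
The struct and function names here (toy_dbase, toy_begin_old,
toy_begin_new) are hypothetical stand-ins, and the sketch assumes that
tidCounter is bumped by 2 for every transaction, read or write:

```c
#include <stdint.h>

/* Simplified stand-in for the per-database counters (hypothetical;
 * not the real ubik database structure). */
struct toy_dbase {
    int32_t tidCounter;      /* assumed: bumped for every transaction */
    int32_t writeTidCounter; /* meant to identify the latest write trans */
};

enum toy_trans_type { TOY_READ, TOY_WRITE };

/* Behaviour before the change: the write counter is incremented on its
 * own, so it drifts away from tidCounter once read transactions occur. */
static int32_t toy_begin_old(struct toy_dbase *db, enum toy_trans_type type)
{
    db->tidCounter += 2;
    if (type == TOY_WRITE)
        db->writeTidCounter += 2;
    return db->tidCounter; /* counter assigned to the new transaction */
}

/* Proposed behaviour: the write counter is set to the counter that the
 * new write transaction actually received. */
static int32_t toy_begin_new(struct toy_dbase *db, enum toy_trans_type type)
{
    db->tidCounter += 2;
    if (type == TOY_WRITE)
        db->writeTidCounter = db->tidCounter;
    return db->tidCounter;
}
```

With one read followed by one write, the old variant leaves
writeTidCounter behind the id the write transaction was given, while
the new variant keeps the two equal.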
> 2. Change urecovery_CheckTid to strictly check if the given
> transaction id is _equal_ to the current transaction id, not just
> if it is equal or happened before the current transaction.
While this may be worth doing everywhere except SVOTE_Beacon, I think
it needs more thought.
The check _might_ be lenient in order to avoid the case where a beacon
is sent and then a write transaction is created: the beacon could then
carry an older (smaller) transaction id, which would cause the
transaction to abort. As far as I can tell, that would only need to be
a special case for SVOTE_Beacon processing, but it will take more
thought.
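To make the two kinds of check concrete, here is a rough sketch. It is
not the real urecovery_CheckTid; the names (toy_tid, toy_tid_cmp,
toy_check_lenient, toy_check_strict) are hypothetical, and it assumes
"happened before" means ordering by epoch first and counter second,
which the real function may not do exactly:

```c
#include <stdint.h>

/* Hypothetical stand-in for a ubik transaction id. */
struct toy_tid {
    int32_t epoch;
    int32_t counter;
};

/* Order two ids by epoch, then by counter (an assumption of this
 * sketch). Returns -1, 0 or 1. */
static int toy_tid_cmp(const struct toy_tid *a, const struct toy_tid *b)
{
    if (a->epoch != b->epoch)
        return a->epoch < b->epoch ? -1 : 1;
    if (a->counter != b->counter)
        return a->counter < b->counter ? -1 : 1;
    return 0;
}

/* Lenient check: accept an id that equals or precedes the current one. */
static int toy_check_lenient(const struct toy_tid *given,
                             const struct toy_tid *current)
{
    return toy_tid_cmp(given, current) <= 0;
}

/* Strict check from item 2: accept only an exact match. */
static int toy_check_strict(const struct toy_tid *given,
                            const struct toy_tid *current)
{
    return toy_tid_cmp(given, current) == 0;
}
```

In the beacon scenario, a beacon carrying counter 10 that arrives after
a write transaction with counter 12 has started passes the lenient
check but fails the strict one, which is why SVOTE_Beacon might need
the special case.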
> 3. To accommodate ubik sites without the change in "1.", change
> SVOTE_Beacon to only check the epoch of the given transaction id,
> and ignore the tid counter. Otherwise, we will abort valid
> transactions due to other sites giving us a bogus transaction id
> in VOTE_Beacon messages. This relaxes the supposed safety check
> somewhat, but if other sites are sending bogus trans id counters,
> we can't really check them.
After discussing and thinking about this, I think it may not be a good
idea. The problem is with the sender of the VOTE_Beacon messages, so we
should just fix the VOTE_Beacon side, and not tighten the VOTE_Beacon
trans id check.
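For reference, the epoch-only check described in item 3 could be
sketched as below. This is only an illustration of the idea being set
aside; demo_tid and demo_beacon_epoch_ok are hypothetical names, and
treating "check the epoch" as an equality test is an assumption:

```c
#include <stdint.h>

/* Hypothetical stand-in for a ubik transaction id. */
struct demo_tid {
    int32_t epoch;
    int32_t counter;
};

/* Epoch-only beacon check: the counter is ignored entirely, so a bogus
 * counter from another site cannot cause an abort, but it is also never
 * validated. Equality on the epoch is an assumption of this sketch. */
static int demo_beacon_epoch_ok(const struct demo_tid *given,
                                const struct demo_tid *current)
{
    return given->epoch == current->epoch;
}
```

The sketch shows the trade-off: any counter value is accepted as long
as the epoch matches.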
--
Andrew Deason
adeason@sinenomine.net