[AFS3-std] first draft: ubik update proposal
Jeffrey Hutzelman
jhutz@cmu.edu
Tue, 15 Feb 2011 13:48:20 -0500
--On Tuesday, February 15, 2011 01:07:50 AM -0500 Derrick Brashear
<shadow@gmail.com> wrote:
>> I'm not clear on how snapshotting interacts with GetFile/SendFile and
>> active operations. I think in practice the mechanism you need is one
>> that allows you to "freeze" the target's databases so that active
>> transactions read from the frozen copy, while sendfile prepares a "new"
>> copy; note that there can be no write transactions, since writes happen
>> only on the sync site and these calls are made only by the sync site and
>> never to itself. Having done a snapshot and sent some new files, it must
>> be possible to either commit the new files or discard them; recovery
>> should only do the commit operation if it is still sync site.
>
> the original intent of getfilediff was for some future use, not at this
> time.
>
> sendfilediff is an optimization. just because you're recovering
> doesn't mean the extant quorum can't continue taking writes. so i take
> writes and when sendfile to you finishes, i stop taking writes, send
> *only* a diff, and then commit and resume taking writes, not unlike a
> volume release.
First, properly, "recovering" is something that only the sync site does.
Other sites don't "recover"; they simply do what they're told. Still, your
point is taken -- the sync site can send the bulk of the database while
still handling write transactions, and then do an incremental update of
some sort at the end.
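
To make the "bulk send, then incremental update" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the class, the method names, and the in-memory dict standing in for the database are illustration only and are not existing or proposed ubik RPCs. It only shows the ordering: copy while writes continue, then pause writes, ship the diff, and resume.

```python
# Hypothetical sketch only: none of these names are ubik RPCs.
class SyncSite:
    def __init__(self, data):
        self.db = dict(data)
        self.log = []                 # writes applied since the bulk copy began
        self.accepting_writes = True

    def write(self, key, value):
        if not self.accepting_writes:
            raise RuntimeError("writes are paused for the final diff")
        self.db[key] = value
        self.log.append((key, value))

    def begin_bulk_send(self):
        """Start logging and return a point-in-time copy to ship in bulk."""
        self.log = []
        return dict(self.db)

    def finish_with_diff(self):
        """Stop taking writes and return only what changed since the copy."""
        self.accepting_writes = False
        return list(self.log)

    def resume(self):
        """Called after the remote side has committed the diff."""
        self.log = []
        self.accepting_writes = True


def apply_diff(replica, diff):
    """Replay the logged writes, in order, onto the bulk copy."""
    for key, value in diff:
        replica[key] = value
```

In this sketch the replica holds the bulk copy until apply_diff() has run, at which point it matches the sync site's database as of the moment writes were paused.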
However, I think you will discover you need an operation which throws away
changes since the snapshot, because as soon as you allow not only for
multiple files but also for the sync site to keep taking updates during
sendfile, there is the possibility that the sync site will stop being sync
site, and need to abort any sends it has in progress. Previously this was
not an issue, because even though the SendFile took time to run, it was an
atomic operation with respect to anything that might modify the database on
either side.
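
A rough sketch of the snapshot/commit/discard shape described above, under stated assumptions: Target, push(), and still_sync_site are hypothetical names, and dicts stand in for database files. The point is only that a multi-file send needs an explicit "throw it away" path once it is no longer atomic.

```python
# Hypothetical sketch only: not the ubik interface.
class Target:
    def __init__(self, files):
        self.live = dict(files)   # frozen copy that active readers see
        self.staged = None        # "new" copy being prepared by sends

    def snapshot(self):
        self.staged = dict(self.live)

    def send_file(self, name, contents):
        if self.staged is None:
            raise RuntimeError("no snapshot in progress")
        self.staged[name] = contents

    def read(self, name):
        # Reads never see staged data until commit.
        return self.live[name]

    def commit(self):
        if self.staged is None:
            raise RuntimeError("nothing to commit")
        self.live, self.staged = self.staged, None

    def discard(self):
        self.staged = None


def push(target, new_files, still_sync_site):
    """Send files to a target; commit only if we are still sync site."""
    target.snapshot()
    for name, contents in new_files.items():
        if not still_sync_site():
            target.discard()      # abort the send in progress
            return False
        target.send_file(name, contents)
    if not still_sync_site():
        target.discard()
        return False
    target.commit()
    return True
```

If the sender loses sync-site status at any point, the target is left exactly as it was at the snapshot, which is the property the old single atomic SendFile gave for free.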
-- Jeff