[OpenAFS] Re: Questions about 'vldb_check -fix'

Mon, 28 Oct 2013 11:25:37 -0500

On Mon, 28 Oct 2013 10:32:44 -0400
Kendrick Hernandez <kendrick.hernandez@umbc.edu> wrote:

> In order to minimize downtime, I'd like to run vldb_check on an
> offline copy of the vldb, but I'm wondering if the vlserver needs to
> be completely shutdown prior to making the copy, or if read-only mode
> would be good enough?

This "read-only mode" (basically, "not having quorum") is about as safe
as just shutting down the server processes, yes. The sync site will not
go into "read-only mode" until about a minute after the other two
servers are shut down, but I don't think you need to actually wait for
that. If the other two servers are down, the sync site won't be able to
commit any data, so the database should be unchanging and therefore
"safe" for copying.

However, neither this, nor shutting down the servers, is completely
guaranteed to be safe. If the sync site is in the middle of committing
data to the database when you do this, you will get a vldb.DB0 file that
has some data partially written to it. There are some ways to deal with
that, but I'm not sure if there's a completely robust solution for what
you're talking about:

If you SIGSTOP the process, copy both vldb.DB0 and vldb.DBSYS1 out, and
then SIGCONT the process, that will give you enough information to
reconstruct a valid database. The DBSYS1 file is a journal log, though,
and I don't think we have any tooling to manually just replay the log
into the DB0 file. (Maybe we should; that would solve this pretty
easily, I think.)
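
For illustration, the SIGSTOP / copy / SIGCONT sequence could be
sketched in Python roughly like this. The function name, the way the
pid is obtained, and the directory arguments are all assumptions for
the sketch, not part of any OpenAFS tooling; only the two file names
come from the description above.

```python
import os
import shutil
import signal

# File names as given above; everything else here is hypothetical.
DB_FILES = ("vldb.DB0", "vldb.DBSYS1")


def snapshot_paused(pid, db_dir, dest_dir, names=DB_FILES):
    """Pause process `pid`, copy the db and its log, then resume it."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    # Ask the kernel to stop the process. This sketch does not wait to
    # confirm that the process has actually reached the stopped state.
    os.kill(pid, signal.SIGSTOP)
    try:
        for name in names:
            dst = os.path.join(dest_dir, name)
            shutil.copy2(os.path.join(db_dir, name), dst)
            copied.append(dst)
    finally:
        # Always resume the process, even if a copy failed.
        os.kill(pid, signal.SIGCONT)
    return copied
```

You would call it with the vlserver's pid and whatever directory holds
the vldb files on your dbserver. The try/finally is there so a failed
copy cannot leave the server stopped.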

Without replaying the log, though, you can just check whether the DBSYS1
log is "empty". If it is, the corresponding DB0 file is definitely fine,
and you can just use that. If it's not, you can just throw away the
copied files and try again. But if the other two sites are shut down, we
may be in the middle of a write transaction that will not complete until
we time out, so it can be difficult to detect a false positive. The
DBSYS1 file is "empty" I think if it's 64 bytes full of 0s; but there
may be other ways it can appear "empty".
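
A minimal sketch of that check in Python, assuming the "64 bytes of
0s" description is right (I'm not certain of it, and this does not
try to recognize any other form of "empty"). The function name is made
up for the example.

```python
EMPTY_LOG_SIZE = 64  # assumption taken from the description above


def log_looks_empty(path, size=EMPTY_LOG_SIZE):
    """Return True only if the file is exactly `size` zero bytes."""
    with open(path, "rb") as f:
        # Read one extra byte so a longer file is not mistaken for empty.
        data = f.read(size + 1)
    return data == b"\x00" * size
```

A False result here only means "not provably empty by this one rule",
so the conservative reaction is to discard the copy and try again.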

Another way of getting a robust vldb.DB0 copy is using the proposed
ubik_cp tool <http://gerrit.openafs.org/#change,9700>. I don't think
that will work, though, if two of the sites are shut down and we're in
the middle of a write transaction. I could be wrong about that, though;
I haven't tried it.

Also, all of this possible corruption is probably pretty rare. If you
don't care so much about 100% guarantees and such, you can probably just
copy the database twice, waiting a few seconds between each copy, and if
they are the same, it's really likely that they're fine. That is
especially true if two of the sites are shutdown, since we should have
at most 1 in-progress write.
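
The copy-twice approach could be sketched like this in Python. The
function names, the SHA-256 comparison, and the 5-second default wait
are choices made for the example, not anything OpenAFS prescribes.

```python
import hashlib
import shutil
import time


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_twice_and_compare(src, dst_a, dst_b, wait_seconds=5.0):
    """Copy `src` twice with a pause in between; True if copies match."""
    shutil.copyfile(src, dst_a)
    time.sleep(wait_seconds)
    shutil.copyfile(src, dst_b)
    return file_digest(dst_a) == file_digest(dst_b)
```

Matching copies only make corruption unlikely; they are not a
guarantee, for the reasons given above.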

> Then I'd make a copy of vldb.db0, run 'vldb_check -fix' on the copy,
> hand-propagate that out to the remaining sites, and then bring down
> the vlserver on the lowest ip site just before moving the fixed copy
> into place. I could then bring up the vlserver on the lowest ip site,
> and the remaining sites.

If 'vldb_check -fix' bumped the database version number, you would be
able to just install the new db on one site and let the sites
synchronize themselves. I don't think it does that (yet), so yeah, you
should hand-propagate the files. That's faster, anyway.

-- 
Andrew Deason
adeason@sinenomine.net