[OpenAFS] Re: [OpenAFS-devel] 1.6 and post-1.6 OpenAFS branch
management and schedule
Jeffrey Hutzelman
jhutz@cmu.edu
Fri, 18 Jun 2010 04:14:32 -0400
--On Thursday, June 17, 2010 01:45:14 PM -0500 "Christopher D. Clausen"
<cclausen@acm.org> wrote:
> I have heard that, but I have never experienced any problems myself in
> many years of running that way. In general the way I see it is that if
> the power goes out, my server stays up for a little longer due to its UPS
> but the network dies immediately so the AFS processes are not doing
> anything when the power finally dies and the server goes down a few
> minutes later. (This is of course assuming no actual server crashes and
> luckily I haven't had any of those.)
We're a bit more aggressive over here. If the power goes out, my servers
stay up for a little longer due to the UPS. So does the machine room
network. And the rest of the machine room. And the clients. And _their_
network. See, a few years ago a dean decided that it was unacceptable that
a power outage had killed one of his desktop machines (the hardware, that
is). So, we raised the rates a bit, bought UPSes for every machine in
every office, and after the first couple of years started a rotating
replacement schedule. It's really _very_ nice, but it does mean you can't
count on the clients to die before the servers.
Really, I consider enable-fast-restart to be extremely dangerous.
It should have gone away long ago.
I realize some people believe that speed is more important than not losing
data, but I don't agree, and I don't think it's an appropriate position for
a filesystem to take. Not losing your data is pretty much the defining
difference between filesystems you can use and filesystems from which you
should run away screaming as fast as you can. I do not want people to run
away screaming from OpenAFS, at any speed.
Bear in mind that enable-fast-restart doesn't mean "start the fileserver
now and worry about checking the damaged volumes later". It means "start
the fileserver now and ignore the damaged volumes until someone complains,
by which time it may be months later and too late to recover the lost data
from backups". It may also mean worse.
Also bear in mind that we're talking about a change after DAFS is good
enough to be on by default, at which point restarts will _already_ be fast,
even if you salvage everything that needs it up front, because not every
volume will have been online at the time of the crash.
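For contrast, a rough sketch of why a demand-attach restart is already fast;
again the flags and helpers are made up, and this is nothing like the real
dafileserver logic:

    # Under demand-attach, idle volumes are detached cleanly, so only the
    # volumes that were actually online when the server died can possibly
    # need salvaging. Restart cost is proportional to that subset, not to
    # the total number of volumes on the server.

    class Volume:
        def __init__(self, vol_id, was_online_at_crash):
            self.vol_id = vol_id
            self.needs_salvage = was_online_at_crash

    def restart(volumes):
        for vol in volumes:
            if vol.needs_salvage:
                print("salvaging", vol.vol_id)   # typically a small subset
                vol.needs_salvage = False
        print("fileserver up; every attached volume is consistent")

    restart([Volume("user.alice", False),
             Volume("user.bob", True),      # the only one needing work
             Volume("proj.src", False)])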
> I guess I don't understand the particulars of what could happen, but if
> one is really worried about sending corrupt data, wouldn't the best thing
> to do be check the data as it is being sent and return errors then and
> log that something is wrong, not require an ENTIRE VOLUME to be salvaged,
> leaving all of the files inaccessible for a potentially long period of
> time? I assume that such a thing is not possible to do?
That's right; it's not possible to do. We're not talking about verifying
the (nonexistent) checksums we (don't) keep on data. We're talking about
verifying that the filesystem structure is self-consistent, so we don't
have things like two unrelated directory entries pointing at the same
vnode, or two vnodes pointing at the same underlying file, or whole volumes
whose contents are unreachable because some directory entry is missing.
And, we're talking about discovering cases where data has already been lost
or destroyed, in time to maybe do something about it.
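As a very loose illustration of the kind of cross-check involved (the data
structures here are invented and bear no resemblance to the salvager's real
on-disk formats): walk every directory entry and every vnode, and flag
anything referenced more than once, or not referenced at all:

    # Loose illustration of a structural consistency check, not salvager code.
    from collections import Counter

    def check_volume(dir_entries, vnodes):
        """dir_entries: list of (dir_vnode, name, target_vnode) tuples
           vnodes: dict mapping vnode number -> underlying file id"""
        problems = []

        # Two unrelated directory entries pointing at the same vnode.
        targets = Counter(t for _, _, t in dir_entries)
        problems += ["vnode %d has %d directory entries" % (v, n)
                     for v, n in targets.items() if n > 1]

        # Two vnodes pointing at the same underlying file.
        files = Counter(vnodes.values())
        problems += ["file %r shared by multiple vnodes" % f
                     for f, n in files.items() if n > 1]

        # Vnodes no directory entry reaches: unreachable (orphaned) data.
        problems += ["vnode %d is unreachable" % v
                     for v in vnodes if v not in targets]

        return problems

    print(check_volume([(1, "a", 2), (1, "b", 2)],
                       {2: "file-17", 3: "file-17", 4: "file-20"}))

None of that requires per-byte checksums; it only requires that the
structural metadata agree with itself.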
People often complain that the salvager destroys their data, or that fsck
destroys their data. This is almost never true. What these programs do is
discover that your data has already been destroyed, and repair the tear in
the space-time continuum so that it is safe to keep using and changing
what's left.