[OpenAFS] Re: [OpenAFS-devel] 1.6 and post-1.6 OpenAFS branch management and schedule

Russ Allbery rra@stanford.edu
Thu, 17 Jun 2010 11:59:29 -0700

"Christopher D. Clausen" <cclausen@acm.org> writes:
> Russ Allbery <rra@stanford.edu> wrote:

>> Chris, to check, are you currently using --enable-fast-restart or
>> --enable-bitmap-later?

> Yes, both of them.


> I have heard that, but I have never experienced any problems myself in
> many years of running that way.

Yes, that's how Russian roulette generally works.  :)

> It's fine to not have it enabled by default, but I can't see why one
> would remove the functionality from the source tree.

Well, I want to remove the functionality from the source tree because the
code is difficult to maintain in conjunction with other features that
aren't dangerous and which we can comfortably recommend people use.  I
don't see either of these options being a strategic direction that OpenAFS
should take going forward, so insofar as they interfere with maintaining
or adding new features that *are* that strategic direction, they make it
harder to maintain the software.

> I guess I don't understand the particulars of what could happen, but if
> one is really worried about sending corrupt data, wouldn't the best
> thing to do be check the data as it is being sent and return errors then
> and log that something is wrong,

Yes, that's what demand-attach does by salvaging the volume.

> not require an ENTIRE VOLUME to be salvaged, leaving all of the files
> inaccessible for a potentially long period of time?

Unless you salvage the volume, you have absolutely no idea whether the
data is corrupt or not.  Checking the consistency of the data is exactly
what salvager does.  You can't do that on the fly in the file server; the
file server would then be just as slow as the salvager.  :)

> I mean I occasionally see NTFS errors in the event log on Windows
> servers. Windows doesn't take the disk offline and run a chkdsk for me
> to prevent potential errors, it allows me to try and access other data
> and if it works there are no problems and denies access to specific
> files or directories if there is corruption.

I'm quite sure that, after an unclean crash, your Windows server doesn't
remount the file system without doing a consistency check.  No operating
system treats its file systems that way.

> Will DAFS be enabled by default in 1.6?

No, or at least I don't believe so.

> Ok, so http://docs.openafs.org/Reference/8/bos_create.html is the only
> documentation on openafs.org on demand attach?

> Ah, I see a http://docs.openafs.org/Reference/8/salvageserver.html as well.

Right, you were asking about migration, which is in bos create.  There's
other documentation, although as Jeff mentioned we really need Admin Guide
documentation as well.

> Perhaps a generic dafs man page is in order for us non-developer types
> to be up to speed on what DAFS is, what the benefits are, and how to use
> it correctly?

I don't believe a man page is an appropriate format to maintain the
higher-level documentation.  It should be in the Admin Guide.

Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>