[OpenAFS-devel] Re: 1.6 and post-1.6 OpenAFS branch management and schedule

Andrew Deason adeason@sinenomine.net
Thu, 17 Jun 2010 11:11:14 -0500


On Wed, 16 Jun 2010 18:38:37 -0400
Tom Keiser <tkeiser@sinenomine.net> wrote:

> > Yes, but I think the initial hit of the VGC scan[0] currently makes
> > *any* salvage immediately after a crash potentially a large problem,
> 
> No.  If you have a damaged volume in *any* environment, be that 1.4,
> 1.4+fast-restart, or DAFS, you will need to perform a VGC scan
> [whether it is via _VVGC_scan_partition() or GetVolumeSummary() is
> immaterial].  The key difference is whether or not this cost is
> amortized up-front, or expended on every salvage.
[...]
> No matter what scenario you look at, the VGC scan costs are there; the
> differentiators are whether scan costs are constant or linear with
> respect to the number of salvages, and whether anything can be served
> while the salvages/VGC scans are occurring.  All of the wins are on the
> side of DAFS.

Yes, I am not saying that the VGC or DAFS gives you anything worse here,
or incurs an additional penalty. My point is just that even with DAFS,
recovering from an unclean shutdown can still take a rather long time. Even
though salvaging can take seconds, scanning the VG hierarchy can take
minutes. Some places simply can *not* wait that long for anything
(useful) to be served; it is unacceptable, full stop. Data consistency
and reliability do not trump speed 100% of the time.
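
To put rough shape on the two points above, here is a minimal sketch of
the cost model being argued about. The function names and the timings
in the usage note are illustrative assumptions only, not measurements
and not anything taken from the OpenAFS code:

```python
# Illustrative cost model only. Names and numbers are hypothetical;
# nothing here is taken from the OpenAFS source.

def total_cost_scan_per_salvage(scan_s, salvage_s, n_salvages):
    """Every salvage repeats the VG scan: scan cost is linear in salvages."""
    return n_salvages * (scan_s + salvage_s)


def total_cost_scan_once(scan_s, salvage_s, n_salvages):
    """One scan is paid up front and reused (amortized) across salvages.

    Assumes n_salvages >= 1, i.e. at least one volume needs salvaging.
    """
    return scan_s + n_salvages * salvage_s


def time_to_first_salvaged_volume(scan_s, salvage_s):
    """In either model, the first damaged volume waits for one full scan."""
    return scan_s + salvage_s
```

With an assumed 300 second scan and 5 second salvages, ten salvages
cost 3050 seconds when the scan is repeated and 350 seconds when it is
done once, but in both cases the first damaged volume is not back for
305 seconds. The amortization helps the total; it does not shorten that
initial wait.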

> That's fundamentally unsafe--you could end up serving garbage, which
> is why I have been, and remain, adamantly opposed to FAST_RESTART.
> VCheckInUse should *always* return 1 for programType==fileServer.
> OTOH, I'd be fine with a fileserver command line switch that makes
> VCanScheduleSalvage() always return 0 and/or VRequestSalvage_r()
> become a no-op...

I'm not really seeing how that would help anyone...?

-- 
Andrew Deason
adeason@sinenomine.net