[OpenAFS-devel] Re: Breaking callbacks on unlink

Andrew Deason adeason@sinenomine.net
Wed, 25 Jan 2012 17:45:49 -0600


On Wed, 25 Jan 2012 11:05:42 -0800
Russ Allbery <rra@stanford.edu> wrote:

> I think this is the core of the disagreement.  In the absence of
> resources to fix the problem, I think adding a runtime option
> generally just makes the problem worse, because now it behaves
> differently for different sites and the total complexity of the source
> base and bug evaluation has increased, resulting in even less
> resources and making it even less likely that the problem will ever
> actually be fixed.

This is where I think it depends on the option, and why I keep thinking
that talking about this issue at such a general scale may not be useful.
I'm just trying to say that _some_ options are useful, not that we
should be adding them willy-nilly.

Adding any option incurs some amount of additional work, sure. So does
adding platform support, or any feature. I believe all of these can be
viewed as balancing cost and benefit; the cost of the development and
upkeep vs the benefit to the users. You can refuse to add any options
and lessen the workload, which reduces the cost of maintenance but also
reduces the benefit to users who want the different functionality. I
think determining which of those trumps the other depends on the
individual option in question.

To think of a couple of examples of rarely-used options: md5inum and
hardmount. I have never had the impression that these significantly
increase maintenance effort, but they are very important options for
more than one site. I'm not sure if I can see how those would ever be
eliminated with one-size-fits-all functionality.

As for this last part (I'm re-quoting):

> resulting in even less resources and making it even less likely that
> the problem will ever actually be fixed.

This assumes that the problem will actually be fixed, or that it will be
fixed in some reasonable amount of time. For some people, the 'proper'
solution to a particular problem may take too long for their needs. Their
choices are then to use something besides AFS (by far the more popular
choice), or to maintain a site-local patch indefinitely (often not an
option at all). The latter does happen, but it is far worse for
supportability.

With a normal discussion, I'm already way past the point where I would
agree to disagree and leave it at that. But, I can't really do that,
since these opinions affect what we're allowed to change in the code.
What I'm trying to avoid is the situation where someone comes to me with
a problem where the only solution satisfying them and existing sites is
new runtime configuration... and I have to say "sorry, that would
require a new option; config options are frozen" regardless of what the
functionality is. We are not quite at that point, but it seems...
close-ish. I'm not sure if you're trying to say this, but what is coming
across is "never add any more options for any reason".


The rest of this email may be taking a turn to the less-productive, with
regard to the issue at hand. I think it should not be a surprise that I
am not a fan of the non-conservative approach of some recent changes.

> For example, look at the idledead problems that have delayed 1.6.1 and
> that have caused serious production outages for some sites, such as
> mine.

You mean the idledead code that existed since before 1.4.11, which afaik
was running fine for you?

> which is the right thing to do; making them a configuration option
> would have just meant leaving a landmine in the code and making it
> even harder to reason about the logic and structure.

With this particular issue, again, there are two irreconcilable desired
behaviors:

 - when accessing a legacy/misbehaving fileserver, yield an error after
   N seconds of no progress
 
 - when accessing a legacy/misbehaving fileserver, hang forever in the
   face of no progress

I believe/assume what is being considered "right" is the latter option.
But to tell me that it is the universal "right" option is arrogant and
inconsiderate of the differences between sites' circumstances.

And note that yes, I am aware of the cache consistency problems with the
former approach. And, although sometimes it seems like this idea is
unfathomable to some people in the community, some people _do_ exist
who do not place cache consistency at their highest priority.

-- 
Andrew Deason
adeason@sinenomine.net