[OpenAFS-devel] Re: Breaking callbacks on unlink

Russ Allbery rra@stanford.edu
Thu, 26 Jan 2012 12:42:25 -0800


Andrew Deason <adeason@sinenomine.net> writes:
> Russ Allbery <rra@stanford.edu> wrote:

>> That's an unfortunate interpretation of delays in implementing
>> configuration changes, and I'm sorry you got that impression.  A better
>> conclusion to draw is that it takes quite a bit of time to implement
>> file server configuration changes in a large environment with a zero
>> scheduled downtime requirement.

I should also mention that we haven't been making the problem clear
enough (and I have now gone off and started trying to remedy that).

> We may be talking about different things. I mean things that afaik are
> not planned to be implemented or even explored that were kind of
> mentioned in passing. (lock quotas per-vnode, or per-volume, or per-host
> but for i/o instead of net; not needing to lock the host for such a long
> period of time on a new rxconn; possibly some others) I don't mean the
> stuff you're implementing; I'm used to up to multi-month delays on that
> kind of thing.

Oh!  Yes, that was something different than what you were talking about.

The basic problem that I have with structural changes to the server and
client code to fix things like this is that I don't know the server code
well enough to have an opinion.  I don't participate in those discussions
not because I'm not interested, but because I defer to the technical
judgement of people like you, Jeff, Simon, Derrick, and jhutz on what the
correct implementation is and what would be effective.

The only general offering I have in evaluating those sorts of solutions is
that they need really solid testing, and at this point the itch in the
back of my brain says that the code is very fragile here.  I'm not sure
what the solution is to that fragility, but in other programming projects
I'd be trying to add tons and tons of tests and then start refactoring to
try to rebuild layering and cleaner internal organization.  But this is
just an intuitive feel.

Part of the problem that I think we all have is that we don't know when
the solution is working and how to tell that it's not broken something
else.  And to some extent this is just a hard problem: it's threaded
software that's part of a distributed system, which is notoriously the
worst possible case for effective debugging and analysis.

> Whoa whoa, hey, no, I'm not saying that. I do not think any of this is
> representative of you being out of step with the community; you are very
> much in step with a large part of it. The viewpoint I'm trying to
> represent I think just comes from a different part of the community; one
> of the roles I/SNA try to serve on the lists is to provide a voice for
> some of the sites that are unable to participate in discussions like
> this themselves.

Okay, yes, that makes sense.  I'm not really trying to be dramatic here;
I'm just frustrated.  And not at you, particularly, but at the general
community situation, the list of things that I know AFS needs to do in
order to stay strategic, the difficulty in finding a funding model for
that work that is actually effective and achieves its goals, and the
amount of time that's taken away from strategic development to try to get
OpenAFS 1.6 stable.  And Elders issues that haven't been resolved for 18
months, and... you know.  It's a long list.

> So if such a vehement opposition to runtime options may lessen a little
> if stability is improved... that's great and something to work towards,
> and something I can completely understand. Before the emails today it
> didn't really occur to me that that might have been contributing to such
> an opinion.

Oh, yes, absolutely.  I mean, it's not the only issue; the issues of how
to change a protocol in a backward-compatible manner are also significant,
as well as what we want out of the standardization process, and code
complexity issues, and so forth.  But, in general, if we're managing the
complexity we already have, I'm more comfortable with adding more.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>