[OpenAFS] The Importance of Filing Bug Reports

Rodney M. Dyer rmdyer@uncc.edu
Thu, 06 Mar 2008 22:28:52 -0500


At 08:44 PM 3/6/2008, Jeffrey Altman wrote:
>I cannot stress enough the importance of filing bug reports.  The OpenAFS 
>Gatekeepers and the rest of the OpenAFS community that contributes their 
>time and energy to debugging the clients and servers and writing patches 
>cannot fix problems that we do not know exist.

<the rest snipped for brevity>

My comment:

I am a big advocate of AFS as we have been using it since 1991.  Since AFS 
was open sourced there have been big changes made to the Windows client to 
improve robustness and add additional features.  We've definitely been 
through some ups and downs with client problems, but we've stuck with 
it.  The only reason we have been able to do that is the positive 
response we've received when we needed help.  I feel that since I don't 
directly contribute to the code, I can at least contribute some of our 
environment, and time, for testing and debugging.  Many IT shops that use 
AFS need to allocate at least some time for their full time employees to 
either debug, or directly contribute to OpenAFS.  The right thing to do is 
"pay" for some of the work that others are performing for you, but if you 
can't do that then at least devote some of your time to help debug when 
there are problems.  If not for others, then selfishly.

Over the past few years, whenever I have encountered a bug, I've certainly 
had immediate knee-jerk reactions and complained, but I've learned that 
this kind of behavior doesn't get my problem solved any quicker.  I have 
learned to just sit down, take a closer look at the conditions under which 
the bug is manifesting itself, then come up with some kind of strategy, 
script, or process that can show a repeatable occurrence of the 
bug.  Non-repeatable problems are the hardest to solve, but you can still 
sometimes automate enough to get some valuable debugging information.  When 
I've come up with logs or scripts that can show the bug in its simplest 
terms, then I submit that information.  That's my part; that's what I do to 
contribute.  And if my part helps to fix the bug, then I've not only helped 
myself, but OpenAFS as well.
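
As a rough illustration of the kind of script described above, here is a 
minimal Python sketch that runs one command repeatedly and logs every 
failing attempt.  The function name run_repro, its parameters, and the log 
format are hypothetical choices for this example, not anything taken from 
OpenAFS; substitute whatever operation triggers the bug in your environment.

```python
import subprocess
import time


def run_repro(command, attempts, log_path, delay=0.0):
    """Run `command` repeatedly and append details of each failure to log_path.

    `command` is an argument list such as ["cmd", "/c", "dir", "somepath"].
    Returns the number of attempts that exited with a non-zero status.
    """
    failures = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for attempt in range(1, attempts + 1):
            result = subprocess.run(command, capture_output=True, text=True)
            if result.returncode != 0:
                failures += 1
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                log.write(f"[{stamp}] attempt {attempt}: exit {result.returncode}\n")
                log.write(f"stdout: {result.stdout}\n")
                log.write(f"stderr: {result.stderr}\n")
            # Optional pause between attempts, useful for timing-sensitive bugs
            time.sleep(delay)
    return failures
```

Even for a problem that only shows up now and then, a loop like this can 
leave behind a timestamped record of each failure to attach to the report.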

Debugging is an art, and it does require a bit of experience to do 
well.  When I find a problem I tend to want to minimize the environmental 
conditions until I've got nothing left but the bug.  I think Sherlock 
Holmes said "How often have I said to you that when you have eliminated the 
impossible, whatever remains, however improbable, must be the truth?"  So 
when there's a bug in the Windows OpenAFS client I just keep removing the 
impossible items that might be the cause.  This process of removal will 
isolate the bug.  Then I create the logs that Jeffrey refers to with the 
"fs minidumps" and the "fs trace" dumps.  I gather all my data together and 
send it to him.  Sometimes he needs me to do even more footwork.  If the 
problem is annoying enough in our environment, I'll clear all my other 
projects off the table and allocate all my time to help with the debugging 
process.  This is the nasty part.  I do have "other things" I need to be 
working on.  However, most of those "other things" depend in large part on 
our core network service working.  If AFS doesn't 
work, then I'm toast anyway.
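
The process of removal described above can be sketched in a few lines of 
Python.  This is only a simple greedy illustration under stated 
assumptions: minimize and still_fails are hypothetical names, and 
still_fails stands for whatever check you use (a script, a manual test) to 
tell whether the bug still reproduces with a given set of conditions.

```python
def minimize(conditions, still_fails):
    """Greedily drop conditions one at a time while the bug still reproduces.

    `conditions` is a list of environmental factors (settings, scripts,
    services); `still_fails(subset)` must return True when the bug shows
    up with only `subset` present.  Returns the reduced list.
    """
    remaining = list(conditions)
    changed = True
    while changed:
        changed = False
        for item in list(remaining):
            trial = [c for c in remaining if c != item]
            # Keep the removal only if the bug survives without this item
            if still_fails(trial):
                remaining = trial
                changed = True
    return remaining
```

For example, if the bug needs only conditions "b" and "d" out of "a" 
through "e", the function returns ["b", "d"].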

The Windows environment can be hideously complex to debug and that can take 
the wind out of you sometimes.  However, the OpenAFS client is somewhat 
simplistic in the service it provides.  I tend to work at the command line 
because that is where the problems show up for me, mostly in scripting.  As 
Jeffrey mentions, there are tools from Microsoft (SysInternals) that can 
help determine where the problems are.

I can't help but make positive remarks about Jeffrey Altman's commitment to 
OpenAFS.  If you are having trouble with OpenAFS, then please either 
contribute to OpenAFS.org, or get a maintenance contract with his company 
Secure Endpoints.  We did, and we have not regretted it.  We've even 
allocated a special machine at our site that allows Jeffrey to perform 
on-site debugging via a remote desktop connection when special debugging 
circumstances require it.  Jeffrey has always come through for us.

It is important that we all work together to get the Windows OpenAFS client 
stable.  That will free up Jeffrey and the other developers to add all 
those nice features that we've requested over the years.  I long to see 
OpenAFS be adopted more in the IT world, but that isn't going to happen if 
it continues to require Jeffrey to guess where the problems are.

I cannot say it loudly enough...SEND HIM BUG REPORTS!  Then provide time to work 
through your problems with him.  He's really a pleasant guy to work with, 
even if a bit brazen sometimes.  ;-)

Rodney