[OpenAFS] OpenAFS for Windows 1.5.72, Windows 7, VPN session killing

Jeffrey Altman jaltman@secure-endpoints.com
Sat, 13 Mar 2010 12:36:13 -0500

On 3/13/2010 11:43 AM, Jeff Blaine wrote:
>> So why is the problem the fault of OpenAFS or KFW?
> Jeffrey, it was just a *thought* that maybe KfW or
> OpenAFS under Windows 7 was doing something weird/wrong.

> Is that really such a stretch, as someone who doesn't
> know the source for these products inside and out, to
> ping the list *to see if maybe* I've hit something that
> nobody else has hit yet on such a new platform, or
> maybe that someone *has* hit and has a solution for?

There is nothing wrong with posting to the list.
The distinction is that you didn't ask if anyone else was
experiencing a problem when running the combination and
you didn't post indicating that the problem might be with
Cisco's or Microsoft's code.  The tone of your e-mails is
that the problem must be with KFW, NetIdMgr, or OpenAFS.

> I ran into a new problem with the tools.  I queried
> the list and provided some info.
> Thanks for the detailed reply, but you seem to read an
> accusatory tone of some sort into everything I type -
> like I've offended you by posting to the community with
> a problem I've hit, and haven't been able to figure out
> yet.
> I can't really grasp why a report/question about
> Cisco VPN + Windows 7 and the tools these lists
> revolve around is so offensive/annoying to you.

It's the assumption that something must be wrong
with KFW, OpenAFS or NetIdMgr and not with the Cisco software.

The Cisco VPN release notes show that they have an extremely
long list of open and unresolved compatibility issues with
other products, including several of Microsoft's.


> I'm sorry I don't know immediately and exactly where to
> look for the cause of problems like you do.  I wish I
> knew everything about everything, but I don't, and you
> don't.

I don't expect that you would.  What I would appreciate since
you are requesting free assistance from the software authors
and the user community is a bit of respect and consideration.

If you really think you have found a bug, then the appropriate
place to send a detailed report of the problem and how to
reproduce it would be:

 kfw-bugs@mit.edu for MIT KFW issues

 netidmgr@secure-endpoints.com for NetIdMgr issues

 openafs-bugs@openafs.org for OpenAFS issues

Take the time to collect as much information as possible
including important details such as software versions that
you are running so that those of us who might take late
Friday night or Saturday morning to help you are not wasting
our time unnecessarily.

> I posted to kerberos@mit.edu with the initial screenshot
> and query.

Be aware that e-mails with screenshots do not arrive on
the list.  They are filtered.  Text only on kerberos@mit.edu.

> I followed it up with something I thought
> might be useful in order to get some help from someone,
> running Network Identity Manager with logging on.

Quite reasonable.  The log tells you exactly what operations
are being performed so you can attempt to replicate them
manually with the command line tools.

> I then tried to narrow things down a bit and ran just
> afscreds alone and got the same result.  Because I don't
> have your back+front knowledge of exactly how everything
> is pieced together, I thought that maybe this was just an
> OpenAFS problem and the original KfW problem was because
> of the OpenAFS plugin.  I posted to openafs-info.

You might have put two and two together and realized that
both NetIdMgr and AFSCreds serve the same function and both
use Kerberos v5.  So the fact that both NetIdMgr and AFSCreds
experience the problem means that either they both have the
same bug or the interop problem with Cisco's software resides
somewhere lower in the stack that is common to both products.

> What a pain I am.

In a sense you are because when you ask for assistance you
do so without providing even the most basic information necessary
for anyone to be able to attempt to reproduce the problem or
research whether the problem is known to exist in the software
that you are using.

For example, I still have no idea which Cisco VPN product
you are using.  Are you using Win7 64-bit or 32-bit?  Which
KFW distribution are you using?  Is it one of my private builds
(that are supposed to be for support customers only but that
I don't protect the downloads of particularly well) or one of
the official builds from MIT that have not had a bug fix applied
in three years?  Which version of OpenAFS?
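
A minimal sketch of how those details could be gathered into a
plain-text report before posting.  The function name and the three
arguments are hypothetical; the product names and versions have to be
filled in by hand, since nothing here queries the Cisco, KFW, or
OpenAFS installers.  Note that platform.machine() reports the hardware
type (for example AMD64), which is a hint about 64-bit versus 32-bit
and not a definitive answer.

```python
import platform


def build_report(vpn_product, kfw_build, openafs_version):
    """Return a text block with the basics a bug report should carry.

    All three arguments are typed in by the reporter; only the OS
    details are detected automatically.
    """
    lines = [
        "OS: " + platform.platform(),
        "Machine: " + platform.machine(),
        "Cisco VPN product: " + vpn_product,
        "KFW distribution: " + kfw_build,
        "OpenAFS version: " + openafs_version,
    ]
    return "\n".join(lines)
```

The returned text can be pasted at the top of the message so that
readers do not have to ask for it.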

>>> I finally got around to installing OpenAFS + KfW yesterday.
>>> After installing OpenAFS + KfW, it continues to work fine until
>>> I tickle OpenAFS, at which point the VPN session drops.
>> You have an interop problem that you cannot explain but
>> how do you expect anyone to be able to help you when you
>> describe your problem in such absolute terms such as "tickle"?
>> So far you have stated:
>> 1. the problem is KFW
>> 2. the problem is NetIdMgr
> I never separated the two.  I stated that, to my apparently
> stupid eyes, my VPN connection was dying every time I ran
> KfW and tried to get credentials.  Because, uh, that's what
> I saw, with the original KfW and its Network ID Manager, and
> then as a test with the v2.0 Network ID Manager.  I tried it
> a few times, could repeat it, had never experienced it before
> under XP+OpenAFS+VPN+KfW, and queried the kerberos list to see if
> "anyone has run into this?"
> Apparently that's the same as saying "KfW is the problem."

From the contents of the messages that arrived on the list
that is what I saw.

>> 3. the problem is OpenAFS because there are two loopback adapters
>> 4. the problem is the OpenAFS authentication tool, afscreds
> I never stated either of those things.
> What I said was,
>     "This appears to be an OpenAFS problem (?), as I can
>      replicate it without Network ID Manager running."
> NOTE: "appears to be" and "(?)" -- these items mean, "I
>       really don't know."
> and
>     "I have to assume the 2 loopback adapters (VPN and AFS)
>      are stomping on each other, but don't know how to fix
>      that if it's the case."
> NOTE: "I assume" "if that's the case" -- these items mean,
>       "I really don't know."

I clearly read a bit too fast and overreacted a bit.
For that I apologize.

>> 5. the problem is OpenAFS when it is tickled
>> Keep in mind that the Microsoft Loopback Adapter is active from the
>> moment that the machine boots and that the OpenAFS Service is also
>> active from boot time.  If the VPN software, which is started later,
>> works for some period of time and then drops, it is most likely not
>> due to the installation of those packages.
>> In the NetIdMgr v2 log that you sent to kerberos@mit.edu, you said
>> that the VPN disconnect occurs at a particular time.  In the log at
>> that time the MSLSA credential cache is being accessed in an attempt
>> to import a TGT which is not present on your machine because you are
>> using a non-Domain logon.
>> You then later on said that the problem wasn't NetIdMgr but was instead
>> OpenAFS because the problem occurs when you start the AFS Authentication
>> Tool (afscreds).  As I pointed out on the kerberos@mit.edu mailing list,
>> afscreds is a Kerberos v5 credential manager and it also attempts to
>> import a TGT from the MSLSA: credential cache.  Both tools do so in
>> an attempt to obtain AFS tokens for the user without prompting the
>> user to enter a principal and password.
>> What I bet is that you will find that if OpenAFS is uninstalled and the
>> loopback adapter is uninstalled and NetIdMgr is not running that the
>> problem can be reproduced by accessing the MSLSA credential cache using
>> the KFW command line tools:
>>    klist -c MSLSA:
>>    kdestroy -c MSLSA:
>>    ms2mit
>>    mit2ms
> Uninstalled OpenAFS + loopback adapter, Network ID Manager not
> running.
> None of these commands (issued in the order above) bring the
> VPN session down.
> kinit jblaine@RCF.OUR.ORG does, for whatever that's worth.

It's worth a hell of a lot.  Now you have narrowed down a minimal
reproducible test case.  The next question is "what is your ccache?"
Is it the MSLSA or is it something like "API:jblaine@RCF.OUR.ORG"?

If it is an API: ccache, does the problem occur if you use a FILE:
ccache instead?

If it doesn't, then the problem might have something to do with the
RPC communication with the API: credential cache service.  If it does,
we can rule out any of the credential cache implementations and focus
on the network traffic that is performed by the krb5_32.dll library
as part of obtaining a TGT.
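
As a rough sketch of the FILE: versus API: comparison, the snippet
below runs kinit with the KRB5CCNAME environment variable pointing at a
chosen ccache and then maps the observation onto the two conclusions
described above.  The helper names and the example FILE: path are
hypothetical, and it assumes kinit is on the PATH; kinit will prompt
for the password itself.

```python
import os
import subprocess


def ccache_env(ccache_name):
    """Return a copy of the environment with KRB5CCNAME set to ccache_name."""
    env = dict(os.environ)
    env["KRB5CCNAME"] = ccache_name
    return env


def run_kinit(principal, ccache_name):
    """Run kinit for principal against the given ccache; return its exit code."""
    result = subprocess.run(["kinit", principal], env=ccache_env(ccache_name))
    return result.returncode


def next_step(drops_with_file):
    """Map the FILE: ccache observation onto the reasoning in this message."""
    if drops_with_file:
        return "rule out the ccache implementations; look at krb5_32.dll network traffic"
    return "suspect the RPC communication with the API: credential cache service"


# Example use (hypothetical path, run by hand while watching the VPN):
#   run_kinit("jblaine@RCF.OUR.ORG", r"FILE:C:\temp\krb5cc_test")
```

Whether the VPN session dropped still has to be observed by hand; the
exit code of kinit says nothing about the state of the tunnel.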

Unfortunately, the only way to debug the krb5_32.dll library is to
use a source code debugger.  Attach a debugger to kinit.exe, set the
command line to "jblaine@RCF.OUR.ORG" and step into the library and
execute one function at a time until the connection drops.  Then
repeat the process by going one level further with each repetition
until the Win32 call that is triggering the event is identified.

Another source of useful information would be to attach Wireshark
to the VPN connection and capture the traffic that is sent on the
connection up until the connection drops.  Cisco has experienced
problems in the past with packet fragmentation of UDP packets.  This
could be a new instance of the problem.
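
To make the fragmentation point concrete, here is a small sketch of the
IPv4 arithmetic: a UDP datagram is fragmented when its payload plus the
8-byte UDP header no longer fits in the MTU minus the 20-byte IPv4
header (assuming no IP options).  The MTU values in the usage note are
assumptions for illustration only; the actual tunnel MTU and the size
of the Kerberos replies in this case are not known from the thread.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def fragment_count(udp_payload, mtu):
    """Return how many IPv4 fragments a UDP datagram of this size needs."""
    ip_payload = udp_payload + UDP_HEADER
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)
```

With a 1500-byte MTU, fragment_count(1472, 1500) is 1 and
fragment_count(1473, 1500) is 2.  If a tunnel lowered the MTU to a
hypothetical 1300 bytes, a 1400-byte payload would already be split in
two, which is the kind of packet worth looking for in the capture.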

I am fairly sure though that you can rule out any issues with OpenAFS
and NetIdMgr.

Jeffrey Altman
