[OpenAFS] OpenAFS / Explorer hang when disabling/enabling NIC

Mickey Lane mlane@sinenomine.net
Thu, 8 Jul 2010 08:52:11 -0500



This is a known problem with Windows 7, any version.

From http://openafs.org/windows.html

"Windows 7 and Server 2008 R2 Specific Issues

 *   There is a bug in Windows that will prevent access to \\AFS after an
     IP address has been removed or assigned after boot.  When the bug is
     triggered, all attempts to connect to \\AFS will result in a "Bad
     Network Name" error.  Please reproduce this issue locally and submit
     bug reports to Microsoft."

Something you can try:

In Network and Sharing Center,
Change adapter settings,
AFS loopback properties,
Check "Link-layer Topology.." boxes (2)
Save & reboot
Repeat your experiment and when AFS fails, wait 3-5 minutes to see if it
recovers.

On two different systems, this has worked when adding a Microsoft VPN. It
has not worked in other situations.
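
For anyone repeating the experiment, the 3-5 minute wait can be timed with a
small script rather than by watching the clock. The following is only a rough
Python sketch, not part of OpenAFS: the names wait_for_recovery and
afs_reachable, the 10-second polling interval, and whatever path you probe
are all assumptions made for illustration. Note also that if the machine is
in the hung state rather than the "Bad Network Name" state, the probe call
itself may block.

```python
import os
import time


def afs_reachable(path):
    """Return True if `path` (e.g. a mapped AFS drive) is currently a directory.

    `path` is supplied by the caller; nothing here knows about AFS itself.
    """
    try:
        return os.path.isdir(path)
    except OSError:
        return False


def wait_for_recovery(probe, timeout_s=300.0, interval_s=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call probe() repeatedly until it returns True or timeout_s elapses.

    Returns the number of seconds waited before the first success, or None
    if the timeout was reached. `sleep` and `clock` are injectable so the
    loop can be exercised without really waiting.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

A run after AFS fails might look like
wait_for_recovery(lambda: afs_reachable("Z:\\")), where Z: stands in for
whichever drive letter you have mapped to AFS. A result of None means it did
not recover within five minutes.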

Mickey.

From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org] On Behalf Of Matt Renzelmann
Sent: Thursday, July 08, 2010 9:32 AM
To: openafs-info@openafs.org
Subject: [OpenAFS] OpenAFS / Explorer hang when disabling/enabling NIC

Hello,

I've observed the following issue with OpenAFS.  Platform is Windows 7 x64
"Ultimate" with all the latest Windows Update patches.  The behavior occurs
with the last three stable releases of OpenAFS recommended for Windows:
1.5.75, 1.5.74, and 1.5.73.  I'm using Network Identity Manager 2.0.0.304
(per Help -> About), which is the latest.

Details of the behavior:
- If I disable and then reenable the main network adapter--the one that AFS
  is ultimately using to access my AFS data--I observe that Windows Explorer
  gets "stuck."  It appears to be stuck in some kind of busy live-lock state.
- I suspect that if I lose my Internet connection on the same adapter for
  any reason, I get a similar symptom, but I've not confirmed this.
- Attempting to terminate the explorer process once it's in this state
  fails.  It will not terminate.  Neither Task Manager nor Process Explorer,
  even with administrative elevation, is sufficient.
- All applications that use Explorer functionality, e.g. file open/save
  windows, will hang as soon as they invoke said functionality.
- Rebooting resolves the problem, though I often have some difficulty
  rebooting cleanly in this scenario.

More background:
- I'm using the DEBUG version of AFS currently in an effort to resolve this.
  I've had the problem with 1.5.74/73 using the standard "release" version.
- I have Process Explorer set up with symbols for AFS and Windows enabled so
  I can see full stack traces with all function names.  Let me know if you
  want anything.
- The tail of the afsd_init.log when the problem occurs:
7/8/2010 6:54:15 AM: Mountpoint[0] = openafs.org#openafs.org:root.cell.
7/8/2010 6:54:15 AM: Mountpoint[1] = .openafs.org%openafs.org:root.cell.
7/8/2010 6:54:15 AM: Mountpoint[2] = .root%openafs.org:root.afs.
7/8/2010 6:54:15 AM: Mountpoint[3] = cs.wisc.edu#cs.wisc.edu:root.cell.
7/8/2010 7:35:15 AM: smb_LanAdapterChange
7/8/2010 7:35:15 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:15 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:35 AM: smb_LanAdapterChange
7/8/2010 7:35:35 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:35 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:35 AM: smb_LanAdapterChange
7/8/2010 7:35:38 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:38 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:58 AM: smb_LanAdapterChange
7/8/2010 7:35:58 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:58 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:36:03 AM: smb_LanAdapterChange
7/8/2010 7:36:03 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:36:03 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...

- The log clearly shows me disabling/enabling the main network adapter.
  Note that I disabled it once, then re-enabled it once a few seconds later.
- Let me know if you'd like more of the log--I've saved a copy.
- Example of the Explorer process after I've attempted to terminate it:
http://www.renzelmann.com/temp/explorer.png

It hangs with these threads running indefinitely.  Note that they are doing
something as they are consuming CPU, but they will not terminate.  Explorer
normally contains many additional threads--these have exited cleanly in this
screenshot.

- System configuration includes:
  * A wireless adapter.  The wireless adapter is enabled but not in use or
    connected.
  * A wired adapter.  The wired adapter is used for network/Internet.
  * Several VMware Workstation 7 Virtual NICs.
  * A virtual Hamachi VPN NIC.  The VPN adapter is in use, but I doubt it is
    the cause, as I had this issue before I installed Hamachi.
  * The OpenAFS Loopback adapter.
- I can reproduce the problem easily by disabling the wired adapter,
  reenabling it, and then attempting to access a mapped AFS drive in Windows
  Explorer.
- I never have any problems if I leave the OpenAFS service disabled and have
  no drives mapped, so I am certain that an important part of the problem is
  something OpenAFS is doing--perhaps it's conflicting with something else?
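
In case it helps anyone compare logs, the afsd_init.log tail quoted above can
be summarized with a short script. This is a rough Python sketch, not an
OpenAFS tool: the line format is inferred only from the excerpt in this
message, and the function name summarize is made up for illustration. It
collects the timestamps of smb_LanAdapterChange events and counts NCBLISTEN
failures per (lana, NRC code) pair.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed format, based solely on the excerpt: "M/D/YYYY H:MM:SS AM: message"
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M): (.*)$")
LISTEN_RE = re.compile(r"NCBLISTEN lana=(\d+) failed with (NRC_\w+)")


def summarize(lines):
    """Return (adapter_change_times, failure_counts) for an iterable of log lines.

    adapter_change_times is a list of datetime objects, one per
    smb_LanAdapterChange line; failure_counts is a Counter keyed by
    (lana number, NRC code).
    """
    changes = []
    failures = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip anything that does not look like a log entry
        stamp = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
        message = m.group(2)
        if message.startswith("smb_LanAdapterChange"):
            changes.append(stamp)
            continue
        failed = LISTEN_RE.search(message)
        if failed:
            failures[(int(failed.group(1)), failed.group(2))] += 1
    return changes, failures
```

Passing the lines of a saved log copy, for example summarize(open(path)) with
path pointing at your own copy, gives the adapter-change times, so the gaps
between successive events can be read off directly.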

Does anyone have any recommendations on how to proceed to get OpenAFS
working reliably with this setup?  Do you need any additional information?

Thanks and regards,
Matt
