[OpenAFS] OpenAFS / Explorer hang when disabling/enabling NIC
Mickey Lane
mlane@sinenomine.net
Thu, 8 Jul 2010 08:52:11 -0500
This is a known problem with Windows 7, any version
From http://openafs.org/windows.html
"Windows 7 and Server 2008 R2 Specific Issues
* There is a bug in Windows that will prevent access to \\AFS after an IP address has been removed or assigned after boot. When the bug is triggered, all attempts to connect to \\AFS will result in a "Bad Network Name" error. Please reproduce this issue locally and submit bug reports to Microsoft."
Something you can try:
In Network and Sharing Center,
Change adapter settings,
AFS loopback properties,
Check "Link-layer Topology.." boxes (2)
Save & reboot
Repeat your experiment and when AFS fails, wait 3-5 minutes to see if it recovers.
On two different systems, this has worked when adding a Microsoft VPN. It has not worked in other situations.
Mickey.
From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org] On Behalf Of Matt Renzelmann
Sent: Thursday, July 08, 2010 9:32 AM
To: openafs-info@openafs.org
Subject: [OpenAFS] OpenAFS / Explorer hang when disabling/enabling NIC
Hello,
I've observed the following issue with OpenAFS. Platform is Windows 7 x64 "Ultimate" with all the latest Windows Update patches. The behavior occurs with the last three stable releases of OpenAFS recommended for Windows: 1.5.75, 1.5.74, and 1.5.73. I am using Network Identity Manager 2.0.0.304 (per Help -> About), which is the latest.
Details of the behavior:
- If I disable and then reenable the main network adapter--the one that AFS is ultimately using to access my AFS data--I observe that Windows Explorer gets "stuck." It appears to be stuck in some kind of busy live-lock state.
- I suspect that if I lose my Internet connection on the same adapter for any reason, I get a similar symptom, but I've not confirmed this.
- Attempting to terminate the Explorer process once it's in this state fails. It will not terminate. Task Manager and Process Explorer, even with administrative escalation, are not sufficient.
- All applications that use Explorer functionality, e.g. file open/save windows, will hang as soon as they invoke said functionality.
- Rebooting resolves the problem, though I often have some difficulty rebooting cleanly in this scenario.
More background:
- I'm using the DEBUG version of AFS currently in an effort to resolve this. I've had the problem with 1.5.74/73 using the standard "release" version.
- I have Process Explorer set up with symbols for AFS and Windows enabled so I can see full stack traces with all function names. Let me know if you want anything.
- The tail of the afsd_init.log when the problem occurs:
7/8/2010 6:54:15 AM: Mountpoint[0] = openafs.org#openafs.org:root.cell.
7/8/2010 6:54:15 AM: Mountpoint[1] = .openafs.org%openafs.org:root.cell.
7/8/2010 6:54:15 AM: Mountpoint[2] = .root%openafs.org:root.afs.
7/8/2010 6:54:15 AM: Mountpoint[3] = cs.wisc.edu#cs.wisc.edu:root.cell.
7/8/2010 7:35:15 AM: smb_LanAdapterChange
7/8/2010 7:35:15 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:15 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:35 AM: smb_LanAdapterChange
7/8/2010 7:35:35 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:35 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:35 AM: smb_LanAdapterChange
7/8/2010 7:35:38 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:38 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:35:58 AM: smb_LanAdapterChange
7/8/2010 7:35:58 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:35:58 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
7/8/2010 7:36:03 AM: smb_LanAdapterChange
7/8/2010 7:36:03 AM: NCBLISTEN lana=4 failed with NRC_BRIDGE, retrying ...
7/8/2010 7:36:03 AM: NCBLISTEN lana=4 failed with NRC_NOWILD, retrying ...
- The log clearly shows me disabling/enabling the main network adapter. Note that I disabled it once, then re-enabled it once a few seconds later.
- Let me know if you'd like more of the log--I've saved a copy.
- Example of the Explorer process after I've attempted to terminate it:
http://www.renzelmann.com/temp/explorer.png
It hangs with these threads running indefinitely. Note that they are doing something as they are consuming CPU, but they will not terminate. Explorer normally contains many additional threads--these have exited cleanly in this screenshot.
- System configuration includes:
* A wireless adapter. The wireless adapter is enabled but not in use or connected.
* A wired adapter. The wired adapter is used for network/Internet.
* Several VMware Workstation 7 Virtual NICs.
* A virtual Hamachi VPN NIC. The VPN adapter is in use, but I doubt it is the cause as I've had this issue before I installed Hamachi.
* The OpenAFS Loopback adapter.
- I can reproduce the problem easily by disabling the wired adapter, reenabling it, and then attempting to access a mapped AFS drive in Windows Explorer.
- I never have any problems if I leave the OpenAFS service disabled and have no drives mapped, so I am certain that an important part of the problem is something OpenAFS is doing--perhaps it's conflicting with something else?
Does anyone have any recommendations on how to proceed to get OpenAFS working reliably with this setup? Do you need any additional information?
Thanks and regards,
Matt
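
For anyone comparing their own afsd_init.log against the tail quoted above, here is a minimal Python sketch (not part of the original report) that groups NCBLISTEN failures under the smb_LanAdapterChange event that precedes them. It assumes the "M/D/YYYY H:MM:SS AM: message" line layout seen in the quoted tail; the function name summarize and the regular expressions are illustrative choices, not anything provided by OpenAFS.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the quoted log tail:
#   7/8/2010 7:35:15 AM: smb_LanAdapterChange
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M): (.*)$"
)
NCB_RE = re.compile(r"NCBLISTEN lana=(\d+) failed with (\w+)")


def summarize(lines):
    """Group NCBLISTEN failures under the preceding smb_LanAdapterChange.

    Returns a list of dicts: {"time": datetime, "failures": [(lana, code)]}.
    Lines that do not match the assumed layout are skipped.
    """
    events = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
        message = match.group(2)
        if message.startswith("smb_LanAdapterChange"):
            events.append({"time": stamp, "failures": []})
            continue
        ncb = NCB_RE.search(message)
        if ncb and events:
            events[-1]["failures"].append((int(ncb.group(1)), ncb.group(2)))
    return events


# Hypothetical usage; the path below is a placeholder, not a known location:
#   with open("afsd_init.log") as handle:
#       for event in summarize(handle):
#           print(event["time"], event["failures"])
```

Counting the adapter-change events this way makes it easier to check whether a single disable/enable cycle produces one notification or several, which may be useful detail for a bug report.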