[OpenAFS] BIG Problem Setup AFS Multihomed Server

Sven Oehme oehmes@de.ibm.com
Wed, 29 Jan 2003 17:11:40 +0100


Hi,

yes, I tried that already. It worked, but after the reboot it stopped
working, and I have no clue why.

Is this an AIX bug or an OpenAFS bug?

Under AIX 5.1 with Transarc AFS 3.6 PTF 5 I have the same setup and it
works, so my assumption is that this is a code defect in OpenAFS.
Is anybody able to look at it, or at least tell me which part of the code
I should check?

Thanks, Sven






"Broughton, Travis V" <tvb@intel.com>
01/29/2003 03:59 PM
 
        To:     Sven Oehme/Germany/IBM@IBMDE
        cc: 
        Subject:        RE: [OpenAFS] BIG Problem Setup AFS Multihomed Server

 

Have you tried swapping the order of the interfaces on the NICs?  That is, 
put 192.x on en3 and 126.x on en4?  I seem to recall an issue a while back 
where AIX really wanted to report the first interface it found - we found 
that swapping interfaces helped resolve an issue.  My brain's fuzzy on 
whether this was exactly the same situation as yours, but it may be worth 
trying.
 
-tvb
-----Original Message-----
From: Sven Oehme [mailto:oehmes@de.ibm.com] 
Sent: Wednesday, January 29, 2003 3:00 AM
To: openafs-info@openafs.org
Subject: [OpenAFS] BIG Problem Setup AFS Multihomed Server


Hi,

I have been working for a week now on a problem setting up a multihomed
file and database server, and it is driving me crazy.

My scenario is the following:




     |-----| 126.X |---------| 192.X |-----| 192.X |------------|  9.X  |-----|
     |     | ----- |CellServ1| ----- |     | ----- |AFS Client 1| ----- |     |
     |  B  |       |---------|       |  A  |       |------------|       |  P  |
     |  A  |                         |  F  |                            |  L  |  9.X         |-------------|
     |  C  | 126.X |---------| 192.X |  S  | 192.X |------------|  9.X  |  A  | ------------ |Samba clients|
     |  K  | ----- |CellServ1| ----- |     | ----- |AFS Client 1| ----- |  N  |              |-------------|
     |  U  |       |---------|       |  L  |       |------------|       |  T  |
     |  P  |                         |  A  |                            |     |
     |     |                         |  N  |                            |  L  |
     |  L  | 126.X |---------| 192.X |     |                            |  A  |
     |  A  | ----- |FileServ1| ----- |     |                            |  N  |
     |  N  |       |---------|       |     |                            |     |
     |     |                         |     |                            |     |
     |126.X|                         |192.X|                            | 9.X |
     |-----|                         |-----|                            |-----|



To explain:

I would like to set up the cell servers and the file server connected to
both the 126.X and the 192.X LAN (but use only the 192.X range for AFS).
The clients read the data from the file server and share it via Samba
with the plant LAN.

The CellServDB looks like this on all clients and servers, in
/usr/afs/etc/ and /usr/vice/etc:

>testme.org                #test Cell 
192.168.180.101        #Cellserv1.afslan.com 
192.168.180.102        #Cellserv2.afslan.com 
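
For anyone who wants to compare CellServDB files across hosts, here is a
minimal sketch in Python of reading the format shown above: a '>' line
names the cell, and each following line gives a database server address
with its host name after the '#'. The helper name parse_cellservdb is made
up for illustration and is not part of any AFS tool.

```python
def parse_cellservdb(text):
    """Return {cell_name: [(ip, hostname), ...]} from CellServDB-style text."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Everything after '#' is the cell description or the host name.
        body, _, comment = line.partition("#")
        body, comment = body.strip(), comment.strip()
        if body.startswith(">"):
            # A '>' line starts a new cell entry.
            current = body[1:].strip()
            cells[current] = []
        elif current is not None and body:
            # Any other non-empty line is a database server of that cell.
            cells[current].append((body, comment))
    return cells


# The entries quoted in this message.
SAMPLE = """\
>testme.org                #test Cell
192.168.180.101        #Cellserv1.afslan.com
192.168.180.102        #Cellserv2.afslan.com
"""

for ip, host in parse_cellservdb(SAMPLE)["testme.org"]:
    print(ip, host)
```

Running the same function over the copies in /usr/afs/etc/ and
/usr/vice/etc on each machine and comparing the results is one way to
confirm that they really are identical.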

My /etc/hosts file is empty (only 127.0.0.1 for localhost), but forward
and reverse lookups work and are correct for both the 126.x and the 192.x
network through DNS.

So 'host cellserv1' returns 192.168.180.101, 'host cellserv1.afslan.com'
also points to 192.168.180.101, and the reverse lookup works too:
'host 192.168.180.101' points to cellserv1.afslan.com.

So name resolution is not the problem (and I also tried putting everything
in /etc/hosts, with the same result).

The interface configuration of cellserv1, for example, is:

en0 empty (not used and down)
en1 empty (not used and down)
en2 has IP 126.201.100.241, subnet mask 255.255.0.0 and gateway 126.201.100.9
en3 has IP 192.168.180.11, subnet mask 255.255.0.0 and no gateway
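
As a quick cross-check of the numbers quoted so far, the following Python
sketch works out which network each interface is on and whether the
interface address itself appears among the CellServDB entries. The
addresses are copied from this message; the function name describe is
hypothetical. Note that the message gives 192.168.180.11 for en3 but
192.168.180.101 in the CellServDB and DNS output. It does not say whether
that difference is a typo, and the sketch only makes the comparison
explicit.

```python
import ipaddress

# Addresses as quoted in this message.
INTERFACES = {
    "en2": ("126.201.100.241", "255.255.0.0"),
    "en3": ("192.168.180.11", "255.255.0.0"),
}
CELLSERVDB_ADDRS = ["192.168.180.101", "192.168.180.102"]


def describe(interfaces, db_addrs):
    """For each interface return (name, network, listed_in_db, db_addrs_on_net)."""
    db = [ipaddress.ip_address(a) for a in db_addrs]
    rows = []
    for name, (addr, mask) in sorted(interfaces.items()):
        iface = ipaddress.ip_interface(f"{addr}/{mask}")
        # CellServDB addresses that fall inside this interface's network.
        on_net = [str(a) for a in db if a in iface.network]
        # Is the interface's own address one of the CellServDB entries?
        rows.append((name, str(iface.network), iface.ip in db, on_net))
    return rows


for row in describe(INTERFACES, CELLSERVDB_ADDRS):
    print(row)
```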

So when I now try to start just this one host, I get a lot of ubik
errors.

The logs show that 192.168.180.11 is not the primary IP address.
buserver, fileserver, salvager ... die every 10-15 seconds and restart.

When I bos shutdown the server, remove the 192.X IP, change the entry in
the CellServDB to 126.X and bos startup the server, everything works,
but with the 126.X IP.
Then I tried to use /usr/afs/local/NetRestrict and /usr/afs/local/NetInfo
with the 192.X address configured again: same result.

cat /usr/afs/local/NetRestrict reports 126.201.100.241 and
cat /usr/afs/local/NetInfo reports 192.168.180.11.
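
To make the expectation behind these two files explicit, here is a small
Python model of the selection rule as the OpenAFS documentation describes
it for the File Server: start from the NetInfo list if that file exists,
otherwise from the addresses configured on the interfaces, then drop
everything listed in NetRestrict. The function name addresses_to_register
is made up for illustration. This is only a model of the documented rule;
it makes no claim about what the AIX build discussed here actually does,
or about how the database server processes choose their primary address.

```python
def addresses_to_register(interface_addrs, netinfo=None, netrestrict=None):
    """Model of the documented rule: NetInfo (if present) replaces the
    interface list as the base, then NetRestrict entries are removed.
    Exact string matches only; wildcard entries are not handled."""
    base = list(netinfo) if netinfo else list(interface_addrs)
    excluded = set(netrestrict or [])
    return [addr for addr in base if addr not in excluded]


# Values quoted in this message.
configured = ["126.201.100.241", "192.168.180.11"]
print(addresses_to_register(
    configured,
    netinfo=["192.168.180.11"],
    netrestrict=["126.201.100.241"],
))
```

Under that rule, the two files as quoted would leave only the 192.X
address, which is the outcome the setup above is aiming for.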

No change: it still reports every time that 192.X is not the primary
address.

When I remove the 126.X IP, everything works again.

It also works when I set up /etc/resolv.conf so that 'host cellserv1'
reports the 126.X address, change the CellServDB to the 126.X IPs and
start the server. It then starts correctly, but registers only the 126.X
IP in the VLDB.

So is it possible that this is a bug in the OpenAFS AIX code? I installed
a Linux box with the same setup and everything works fine.
I also reinstalled the whole AIX box, so I don't think the installation
itself is the problem.
It also looks like /usr/afs/local/NetRestrict is not used for IP
exclusions.

I am running AIX 4.3.3 ML10.

Any hints or help would be great.

Sven 
