[OpenAFS] BIG Problem Setup AFS Multihomed Server

Hartmut Reuter reuter@rzg.mpg.de
Tue, 04 Feb 2003 09:15:16 +0100


Have a look into rx/rx_getaddr.c. I submitted a patch for AIX 5.1 some 
time ago which probably also should be used for AIX 4.3. The main thing 
was that you have to keep the pointer to the next interface's data 
before calling ioctl() the next time because ioctl overwrites the fields 
needed in the calculation of the offset for the next entry.
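
To make the point above concrete, here is a minimal sketch of the pattern. It is not the code from rx/rx_getaddr.c and not the actual patch. The record layout, NAME_LEN, rec_size(), fake_query(), put_record() and walk_records() are all hypothetical stand-ins, chosen so the example runs without a real socket: fake_query() plays the role of the per-interface ioctl() and deliberately clobbers the length byte that the offset calculation depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 16  /* hypothetical fixed-size name field, like ifr_name */

/* Hypothetical record layout: NAME_LEN bytes of name, then an address area
 * whose first byte holds the area's own length (in the spirit of sa_len). */
static size_t rec_size(const unsigned char *rec)
{
    return NAME_LEN + (size_t)rec[NAME_LEN];
}

/* Stand-in for the per-interface ioctl(): it reuses the address area for
 * its result and so destroys the length byte stored there. */
static void fake_query(unsigned char *rec)
{
    rec[NAME_LEN] = 0xFF;
}

/* Helper for building test data: append one record, return bytes written. */
static size_t put_record(unsigned char *dst, const char *name,
                         unsigned char addr_len)
{
    assert(addr_len >= 1);
    memset(dst, 0, NAME_LEN + (size_t)addr_len);
    strncpy((char *)dst, name, NAME_LEN - 1);
    dst[NAME_LEN] = addr_len;
    return NAME_LEN + (size_t)addr_len;
}

/* Walk all records in buf, copying each name into names[]; returns count. */
static int walk_records(unsigned char *buf, size_t total,
                        char names[][NAME_LEN], int max)
{
    unsigned char *cur = buf;
    unsigned char *end = buf + total;
    int n = 0;

    assert(buf != NULL);
    while (n < max && (size_t)(end - cur) > NAME_LEN) {
        size_t size = rec_size(cur);
        if (size > (size_t)(end - cur))
            break;                          /* truncated record */
        unsigned char *next = cur + size;   /* saved BEFORE the query */

        fake_query(cur);                    /* clobbers the length byte */
        memcpy(names[n++], cur, NAME_LEN);

        /* Calling rec_size(cur) at this point would return a wrong size,
         * so we advance with the pointer saved above. */
        cur = next;
    }
    return n;
}
```

If the loop computed the next pointer only after fake_query(), the clobbered length byte would send it past the end of the buffer after the first record, and later interfaces would never be seen. In real code the stride would presumably come from the address length inside each struct ifreq, and the overwriting call would be the ioctl() itself; treat that mapping as an assumption to verify against the source.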

I am not sure, however, whether this is your problem, but you can at 
least check it.

Hartmut

Sven Oehme wrote:
> 
> Is there nobody who can help with this problem?
> 
> I only need somebody to tell me which module I should look into ...
> 
> Sven
> 
> E-Mail : oehmes@de.ibm.com
> 
> 
> 
> 
> *Sven Oehme/Germany/IBM@IBMDE*
> Sent by: openafs-info-admin@openafs.org
> 
> 01/29/2003 05:11 PM
> 
>        
>         To:        "Broughton, Travis V" <tvb@intel.com>, 
> openafs-info@openafs.org
>         cc:        
>         Subject:        RE: [OpenAFS] BIG Problem Setup AFS Multihomed 
> Server
> 
>        
> 
> 
> 
> 
> Hi,
> 
> Yes, I tried that already. It worked, but after the reboot it stopped 
> working, and I have no clue why.
> 
> Is this an AIX bug or an OpenAFS bug?
> 
> Under AIX 5.1 with Transarc AFS 3.6 PTF 5 I have the same setup and it 
> is working, so my assumption is that this is a code defect in OpenAFS. 
> Is anybody able to look at it, or at least tell me which part of the 
> code I should check?
> 
> Thanks. Sven
> 
> 
> 
> *"Broughton, Travis V" <tvb@intel.com>*
> 
> 01/29/2003 03:59 PM
> 
>        
>        To:        Sven Oehme/Germany/IBM@IBMDE
>        cc:        
>        Subject:        RE: [OpenAFS] BIG Problem Setup AFS Multihomed 
> Server
> 
>      
> 
> 
> 
> 
> Have you tried swapping the order of the interfaces on the NICs?  That 
> is, put 192.x on en3 and 126.x on en4?  I seem to recall an issue a 
> while back where AIX really wanted to report the first interface it 
> found - we found that swapping interfaces helped resolve an issue.  My 
> brain's fuzzy on whether this was exactly the same situation as yours, 
> but it may be worth trying.
>  
> -tvb
> -----Original Message-----
> From: Sven Oehme [mailto:oehmes@de.ibm.com]
> Sent: Wednesday, January 29, 2003 3:00 AM
> To: openafs-info@openafs.org
> Subject: [OpenAFS] BIG Problem Setup AFS Multihomed Server
> 
> 
> Hi,
> 
> I have been working for a week now on a problem setting up a 
> multihomed file and database server, and it is driving me crazy.
> 
> My scenario is the following:
> 
> 
> 
> 
>    |-----| 126.X |---------| 192.X |-----| 192.X |------------|  9.X  |-----|
>    |     | ----- |CellServ1| ----- |     | ----- |AFS Client 1| ----- |     |
>    |  B  |       |---------|       |  A  |       |------------|       |  P  |
>    |  A  |                         |  F  |                            |  L  |  9.X          |-------------|
>    |  C  | 126.X |---------| 192.X |  S  | 192.X |------------|  9.X  |  A  | --------------|Samba clients|
>    |  K  | ----- |CellServ1| ----- |     | ----- |AFS Client 1| ----- |  N  |               |-------------|
>    |  U  |       |---------|       |  L  |       |------------|       |  T  |
>    |  P  |                         |  A  |                            |     |
>    |     |                         |  N  |                            |  L  |
>    |  L  | 126.X |---------| 192.X |     |                            |  A  |
>    |  A  | ----- |FileServ1| ----- |     |                            |  N  |
>    |  N  |       |---------|       |     |                            |     |
>    |     |                         |     |                            |     |
>    |126.X|                         |192.X|                            | 9.X |
>    |-----|                         |-----|                            |-----|
> 
> 
> 
> To explain:
> 
> I would like to set up the CellServers and the FileServer so that they 
> are connected to both the 126.X and the 192.X LAN (but use only the 
> 192.X range for AFS). The clients read the data from the FileServer 
> and share it via Samba to the plant LAN.
> 
> The CellServDB looks like this for all clients and servers in 
> /usr/afs/etc/ and /usr/vice/etc:
> 
> >testme.org                #test Cell
> 192.168.180.101        #Cellserv1.afslan.com
> 192.168.180.102        #Cellserv2.afslan.com
> 
> My /etc/hosts file is empty (only 127.0.0.1 with localhost), but 
> forward and reverse lookups work and are correct for the 126.x and 
> 192.x networks through DNS.
> 
> So 'host cellserv1' returns 192.168.180.101, and 'host 
> cellserv1.afslan.com' also points to 192.168.180.101. The reverse 
> lookup works too: 'host 192.168.180.101' points to 
> cellserv1.afslan.com.
> 
> So name resolution is not the problem (I also tried putting everything 
> in /etc/hosts, with the same result).
> 
> The interface configuration of cellserv1, for example, is:
> 
> en0 empty (not used and down)
> en1 empty (not used and down)
> en2 has ip 126.201.100.241 subnetmask 255.255.0.0 and gateway 126.201.100.9
> en3 has ip 192.168.180.11 subnetmask 255.255.0.0 and no gateway
> 
> When I now try to start only this one host, I get a lot of ubik 
> errors.
> 
> The logs show that 192.168.180.11 is not the primary IP address. 
> buserver, fileserver, salvager ... die every 10-15 seconds and 
> restart.
> 
> When I bos shutdown the server, remove the 192.X IP, change the entry 
> in the CellServDB to 126.X, and bos startup the server, everything 
> works, but with the 126.X IP.
> Then I tried to use /usr/afs/local/NetRestrict and 
> /usr/afs/local/NetInfo with the 192.X address configured: same result.
> 
> cat /usr/afs/local/NetRestrict shows 126.201.100.241 and cat 
> /usr/afs/local/NetInfo shows 192.168.180.11.
> 
> No change; it still reports every time that 192.X is not the primary 
> address.
> 
> When I remove the 126.X IP, everything works again.
> 
> It also works when I set up /etc/resolv.conf so that 'host cellserv1' 
> reports the 126.X address, change the CellServDB to the 126.X IPs, and 
> start the server. It then starts correctly, but registers only the 
> 126.X IP in the VLDB.
> 
> So is it possible that this is a bug in the OpenAFS AIX code? I 
> installed a Linux box with the same setup and everything works fine. 
> I also reinstalled the whole AIX box, so I think there is no problem 
> with the installation itself. 
> It also looks like /usr/afs/local/NetRestrict is not used for IP 
> exclusions.
> 
> I am running AIX 4.3.3 ML10.
> 
> Some hints or help would be great.
> 
> Sven


-- 
-----------------------------------------------------------------
Hartmut Reuter                           e-mail reuter@rzg.mpg.de
					   phone +49-89-3299-1328
RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------