[OpenAFS] Multihomed issues

Russ Allbery rra@stanford.edu
Mon, 17 Jan 2011 12:40:55 -0800

Jaap Winius <jwinius@umrk.nl> writes:

> After messing around with a couple of multihomed hosts for a while, I
> get the impression that AFS, or at least the Debian squeeze install
> procedure for it, doesn't like to work this way. It's possible to
> prevent the file server from listening to certain interfaces (addresses)
> by using NetInfo and NetRestrict, but I wish I knew how to do something
> similar for the ptserver and especially for the vlserver.

The file server is what tells the VLDB that it has those addresses, so I
think the same solution should work there.  The trick is that you have to
create the NetInfo and NetRestrict files before the first time you start
the file server.  It should then never register the unwanted addresses.
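
As a minimal sketch of that step, the two files could be written like this
before the first start.  Everything specific here is an assumption rather
than something from this thread: the addresses are placeholders, the helper
name is made up, and the target directory depends on the install (Transarc
paths use /usr/afs/local; the Debian packages, as far as I know, use
/var/lib/openafs/local).  Both files take one dotted-decimal address per
line.

```shell
# Sketch only: write_net_files is a hypothetical helper, and the addresses
# below are documentation placeholders, not values from this thread.
write_net_files() {
    dir=$1       # the server's "local" configuration directory
    public=$2    # address the server should register
    private=$3   # address the server must not register
    printf '%s\n' "$public"  > "$dir/NetInfo"
    printf '%s\n' "$private" > "$dir/NetRestrict"
}

# Write into a scratch directory first so the result can be inspected
# before anything is copied into the real server directory.
scratch=$(mktemp -d)
write_net_files "$scratch" 192.0.2.10 10.0.0.5
cat "$scratch/NetInfo" "$scratch/NetRestrict"
```

Writing to a scratch directory keeps the example harmless; on a real server
the files would need to be in place before the file server is started for
the first time.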

> When setting up a new database server (which is supposed to replicate
> via the Internet), it seems that, by default, AFS scans all of the
> interfaces on a new host to end up using the IP addresses associated
> with them.

The same NetInfo and NetRestrict files that work for file servers also
work for the vlserver and ptserver, although in those cases only Ubik
should care; everything else uses either AFSDB/SRV DNS information or the
addresses in CellServDB.

> In my case, this includes several private addresses that I don't want
> any of the database servers to use. However, even if immediately
> afterwards I remove these addresses from a new server's CellServDB and
> restart it, it's too late: they're already in the VLDB and AFS is
> already trying to send a new RO copy of root.cell to the new
> server... using that server's private range IP address.

I don't think you meant CellServDB here, or if you did, then something
else is going on that I don't understand.  CellServDB shouldn't ever
contain the IP addresses of file servers (unless your VLDB servers are
also file servers).

> Now I've got:

> # vos volinfo root.cell -noresolv
> root.cell                         536870915 RW          6 K  On-line
>     oost.dapadam.nl /vicepa
>     RWrite  536870915 ROnly  536870916 Backup          0
>     MaxQuota       5000 K
>     Creation    Sun Jan  9 15:32:11 2011
>     Copy        Sun Jan  9 15:32:11 2011
>     Backup      Never
>     Last Access Fri Jan 14 18:17:02 2011
>     Last Update Fri Jan 14 18:16:57 2011
>     0 accesses in the past day (i.e., vnode references)

>     RWrite: 536870915     ROnly: 536870916     RClone: 536870916
>     number of sites -> 3
>        server partition /vicepa RW Site  -- New release
>        server partition /vicepa RO Site  -- New release
>        server partition /vicepa RO Site  -- Old release

> It seems to be stuck like this. Can it be fixed?

> Before discovering this, I had already used "vos delentry" to remove the
> private IP addresses from the VLDB, so "vos listaddrs" currently shows
> only the correct (public) IP addresses.

vos setaddr, you mean?  vos delentry is what will fix the above; you need
to delentry the replication site on the affected server, vos zap the RO
off that server if necessary, and then restart the file server with proper
NetInfo and NetRestrict files so that it registers the correct IP
addresses with the VLDB.  Then everything should work properly.
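
The order of operations above can be sketched as a dry run that only prints
the commands rather than running them.  The server name is a placeholder
and plan_cleanup is a made-up helper.  The sketch also assumes that vos
remsite is the way to drop a single RO site from the volume's VLDB entry;
that choice is an assumption and should be checked against the vos
documentation before use.  Only the RO volume ID comes from the output
quoted earlier.

```shell
# Dry run: prints the commands instead of executing them.
# Assumptions: newserver.example.org is a placeholder host, and vos remsite
# is used to remove one RO site from the VLDB entry.
plan_cleanup() {
    server=$1; part=$2; volume=$3; ro_id=$4
    echo "vos remsite -server $server -partition $part -id $volume"
    echo "vos zap -server $server -partition $part -id $ro_id"
    echo "bos restart -server $server -instance fs"
    echo "vos listaddrs"
}

plan=$(plan_cleanup newserver.example.org /vicepa root.cell 536870916)
printf '%s\n' "$plan"
```

The final vos listaddrs is there only to confirm which addresses the
restarted file server registered.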

Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>