[OpenAFS] Multi-homed server and NAT-ed client issues

Ciprian Dorin Craciun ciprian.craciun@gmail.com
Wed, 17 Jul 2013 21:28:09 +0300


    Problem solved!  Thanks to both posters for pointing me in the
right direction: adding the `-rxbind` option to the following
services:  `kaserver`, `ptserver`, `vlserver`, `fileserver`, and
`volserver`.  This was done simply by editing the `BosConfig` file in
the `/etc/openafs` folder and adding that token to the lines starting
with `parm`.
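
    For those who would rather script that edit, below is a minimal
Python sketch of it.  It assumes that each server's command line sits
on its own `parm` line; the function name `add_rxbind` and the paths in
the example are purely illustrative and are not taken from my actual
file.  Other `parm` lines (such as the salvager's) are left alone.

```python
# Servers named in the message above; any other parm line is untouched.
SERVERS = ("kaserver", "ptserver", "vlserver", "fileserver", "volserver")


def add_rxbind(bosconfig_text, servers=SERVERS):
    """Return the text with -rxbind appended to the matching parm lines."""
    out = []
    for line in bosconfig_text.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and fields[0] == "parm"
                # compare only the program name, not its directory
                and fields[1].rsplit("/", 1)[-1] in servers
                # do not add the option twice
                and "-rxbind" not in fields[2:]):
            line = line.rstrip() + " -rxbind"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical fragment, for illustration only.
example = """bnode fs fs 1
parm /usr/lib/openafs/fileserver
parm /usr/lib/openafs/volserver
parm /usr/lib/openafs/salvager
end
"""
print(add_rxbind(example), end="")
```

    Running it on the fragment above should add `-rxbind` to the
`fileserver` and `volserver` lines only, and running it a second time
on its own output should change nothing.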

    (I must confess I feel quite dumb for not finding this option
myself...  I must say I did add it to the `bosserver` invocation, but
it didn't seem to work.  I should have added it to each individual
service.)

    However, a note for the documentation maintainers: it seems that
the `-rxbind` option is missing from the manuals of the following
services (at least in the HTML version on http://docs.openafs.org ):
`kaserver` and `volserver`.


    As for the words that follow below: they were written while I was
reading the replies, so I'll leave them there for those who one day
will have similar issues with multi-homed servers.


On Wed, Jul 17, 2013 at 6:28 PM, Jeffrey Hutzelman <jhutz@cmu.edu> wrote:
> On Wed, 2013-07-17 at 17:43 +0300, Ciprian Dorin Craciun wrote:
>>     Hello all!  I've encountered quite a blocking issue in my OpenAFS
>> setup...  I hope someone is able to help me... :)
>>
>>
>>     The setup is as follows:
>>     * multi-homed server with, say S-IP-1 (i.e. x.x.x.5) and S-IP-2
>> (i.e. x.x.x.7), multiple IP addresses, all from the public range;
>
> Things get much easier if you just use the actual names and addresses,
> instead of making up placeholders.

    Indeed it could seem that I've obscured the situation by providing
placeholders for the actual IPs.  (The reason revolves mainly around
the fact that all these emails are public knowledge.)

    However the values I've chosen as placeholders have been carefully selected:
    * both are addresses for the same interface (configured with `ip
addr add ...`, thus not with alias interfaces like Debian once had);
    * both are addresses from the same network (thus are routed identically);
    * the second IP (the one OpenAFS should use) is marked as
`secondary` by `ip addr show`;

    Below is the output of `ip -4 addr show ethX` (blanking only the
interface name and the network address):
~~~~
ethX: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    inet x.x.x.5/27 brd x.x.x.31 scope global ethX
    inet x.x.x.7/27 brd x.x.x.31 scope global secondary ethX
~~~~

    And the output of `ip -4 route show`:
~~~~
x.x.x.0/27 dev ethX  proto kernel  scope link  src x.x.x.5
127.0.0.0/8 via 127.0.0.1 dev lo
default via x.x.x.1 dev ethX
~~~~

    The full output of both `ip addr` and `ip route` includes a few
more bridges and interfaces.  However none share the same IP range
with the addresses above, there are no other default routes except the
one above, and moreover the OpenAFS clients aren't on any of the
"extra" networks (i.e. the packets to them should go through the
default route above).


> Frequently, doing that sort of thing
> hides critical information that may point to the source of the problem.

    I hope that the details above are sufficient to depict the overall context.


> For example, in this case, Linux's choice of source IP address on an
> outgoing UDP packet sent from an unbound socket (or one bound to
> INADDR_ANY) will depend on the interface it chooses, which will depend
> on the route taken, which depends on the server's actual addresses and
> the network topology, particularly with respect to the client (or in
> this case, to the public address of the NAT the client is behind).

    As said, the reply packet leaves the server with the source set as
the first IP (x.x.x.5).  And thus the behaviour is consistent with a
socket bound to `INADDR_ANY` and towards a peer that takes the default
route.


> You also haven't said what version of OpenAFS you're using, so I'll
> assume it's some relatively recent 1.6.x.

    Indeed, my fault.  Being in a hurry to leave for home, I forgot to
mention that I have Linux on both the client and server, and the
OpenAFS version is 1.6.2.


>>     * the second IP, S-IP-2 (i.e. x.x.x.7), is the one listed in
>> `NetInfo` and DNS record (and correctly listed when queried via `vos
>> listaddrs`);
>>     * the first IP, S-IP-1 (i.e. x.x.x.5), is listed in
>> `NetRestricted` (and doesn't appear in `vos listaddrs`);
>
> So, the machine the fileserver runs on is multi-homed, but you're only
> interested in actually using one of those interfaces to provide AFS
> service?

    Exactly: the server is multi-homed and I want it to use the
secondary IP address.  (In fact all OpenAFS services run on exactly
the same server.)


> In that case, you use the -rxbind option, which tells the
> servers to bind to a specific address instead of INADDR_ANY.  That
> option needs to be passed to each server process for which you want that
> behavior.

    Indeed this solves the issue.  Maybe this could be written in a
FAQ entry or in a page dedicated to multi-homed servers (there is one
only for multi-homed clients).
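
    The difference between a socket left on `INADDR_ANY` and one bound
to a specific address can be seen with a small Python sketch.  It is
not OpenAFS code, the function names are made up for illustration, and
it only approximates what (as I understand it) `-rxbind` does inside
the servers; port 7001 is simply the client port discussed in this
thread.

```python
import socket


def kernel_source_ip(peer_ip, peer_port=7001):
    """Source IP the kernel picks for an unbound UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # connect() on a UDP socket sends nothing; it only fixes the
        # peer, which makes the kernel choose a local address from its
        # routing table.
        s.connect((peer_ip, peer_port))
        return s.getsockname()[0]


def bound_source_ip(local_ip, peer_ip, peer_port=7001):
    """Source IP when the socket is first bound to local_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # Port 0 lets the kernel pick any free port (demo only).
        s.bind((local_ip, 0))
        s.connect((peer_ip, peer_port))
        return s.getsockname()[0]
```

    On the server described above, `kernel_source_ip()` for a client
reached through the default route would be expected to report the
primary address (x.x.x.5), while `bound_source_ip("x.x.x.7", ...)`
keeps the secondary one regardless of the route.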


> Besides -rxbind, there are a couple of other options, depending on which
> components you control.  For example, if the NAT is your home router and
> you only have one or two AFS clients behind it, you can assign those
> clients static addresses on your inside network, and then configure your
> router to remap the client-side addresses on both inbound and outbound
> traffic, mapping each inside host's port 7001 to a different outside
> port.  For example, my router (running OpenWRT) installs the following
> rules:
>
> ### Static ports for AFS
> for i in `seq 50 249` ; do
>   iptables -t nat -A prerouting_wan  -p udp --dport $((7000+$i)) \
>     -j DNAT --to 192.168.202.${i}:7001
>   iptables -t nat -A postrouting_rule -o $WAN -p udp -s 192.168.202.$i \
>     --sport 7001 -j MASQUERADE --to-ports $((7000+$i))
> done
> iptables -A forwarding_wan -p udp --dport 7001 -j ACCEPT
>
> (in OpenWRT's default configuration, the 'forwarding_wan' and
> 'prerouting_wan' chains get called from the FORWARD and nat PREROUTING
> chains, respectively, for traffic originating from the internet.  The
> 'postrouting_rule' chain gets called from the nat POSTROUTING chain for
> all traffic).
>
> So, when 192.168.202.142 sends traffic to a fileserver from port 7001,
> it comes from the router's port 7142.  And inbound traffic to that port
> gets sent back to 192.168.202.142 port 7001, regardless of where on the
> Internet it came from or whether the router knows about the connection.
> As you can see, I do this for a range of 200 addresses, which are the
> ones my DHCP server hands out -- anyone who visits my house gets working
> AFS, without keepalives, and even when talking to a multihomed server.


    Interesting solution, I wouldn't have thought of that.  :)
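
    To convince myself of the port arithmetic in the quoted rules, here
is a minimal Python sketch of the mapping.  The constants are copied
from those rules; the function names are my own invention and nothing
here touches `iptables` itself.

```python
INSIDE_NET = "192.168.202."  # inside prefix used in the quoted rules
AFS_CLIENT_PORT = 7001       # client-side AFS port in the quoted rules
BASE_PORT = 7000             # outside port = BASE_PORT + last octet
FIRST_HOST, LAST_HOST = 50, 249  # the `seq 50 249` range


def outside_port(inside_ip):
    """Router port that the rules assign to an inside client."""
    if not inside_ip.startswith(INSIDE_NET):
        raise ValueError("not on the inside network: " + inside_ip)
    host = int(inside_ip[len(INSIDE_NET):])
    if not FIRST_HOST <= host <= LAST_HOST:
        raise ValueError("no static mapping for host %d" % host)
    return BASE_PORT + host


def inside_target(port):
    """Inverse mapping: (inside IP, port) an inbound packet is sent to."""
    host = port - BASE_PORT
    if not FIRST_HOST <= host <= LAST_HOST:
        raise ValueError("no static mapping for port %d" % port)
    return INSIDE_NET + str(host), AFS_CLIENT_PORT


print(outside_port("192.168.202.142"))  # 7142, as in the quoted example
print(inside_target(7142))              # ('192.168.202.142', 7001)
```

    Since the mapping is a fixed one-to-one function of the last octet,
the router needs no connection state to deliver inbound packets, which
matches the explanation quoted above.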

    And only now I've seen that there is a single inbound port
involved on the client side, according to:

      http://wiki.openafs.org/AdminFAQ/#ports

    However, as you mention, it can be scripted only if you have
"shell" access to your router, and unfortunately mine doesn't run an
"open" router distribution.  (It could, but DD-WRT has issues with the
hardware and is unstable, thus I've reverted to the Linksys
firmware...)

    On the other hand, if one can't touch the router directly but only
through crippled UIs, I think all of that can be achieved by one of
the following:
    * using the "DMZ" (a potential security hole for the client, as
it'll have to manage its own firewall);
    * a simple "Single Port Forwarding" of port 7001 for exactly one
client;
    * multiple "Single Port Forwarding" rules for a few clients as you
describe above, while also forcing the clients themselves (probably
via `iptables -j SNAT`) to use the new range of ports when sending;


>>     I must say I've tried to `iptables -j SNAT ...` outgoing packets
>> to the right S-IP-2, however this doesn't work because SNAT also
>> changes the source port.  I've also tried to `-j NETMAP` these
>> packets, but it doesn't work because NETMAP in the `OUTPUT` or
>> `POSTROUTING` chains actually touches the destination...  Thus if
>> someone knows of an `iptables`...
>
> Well, you can give SNAT a specific port to use.

    Doesn't work...  If you give it a port range all it does is use
that port range to remap the "original" port, thus it will always
change it...


> Or, you can play games
> with routing tables to give AFS traffic a routing table that doesn't
> include the second interface.  But that's functionally equivalent to
> using -rxbind but a lot more work.

    Yes, my colleague suggested this solution involving policy routing,
but it is too complicated.


    Thanks all for the support,
    Ciprian.