[OpenAFS] Re: Server disk operations speed

Andrew Deason adeason@sinenomine.net
Thu, 18 Apr 2013 18:19:55 -0500


On Thu, 18 Apr 2013 19:58:46 +0300 (EEST)
"Jukka Tuominen" <jukka.tuominen@finndesign.fi> wrote:

> Due to confidential content, Andrew and I have been troubleshooting
> the slow system off-line for a while. To be fair, Andrew was the brain
> and I was the typist. By using Wireshark mostly, it became evident
> that the server was spending lots of time speaking to itself. That is,
> because the system is in a DMZ, it has both a private and a public IP.
> During file transfers, the private IP sent messages to the public one
> and vice versa, both being the very same machine.

To provide a little more detail... the vldb had separate server entries
for the public IP and the private IP, presumably due to a 'vos
changeaddr' mucking up the db at some point in the past. The fileserver
is behind a NAT, so it only submits the private IP to the vldb.  The
actual relevant vldb entries were pointed at the public IP, so any
communication to the local fileserver had to leave the local machine and
traverse the NAT, incurring a delay of a few milliseconds on every round
trip. And once you assume a few milliseconds of latency, the numbers that
Jukka provided look much closer to what you'd expect.
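
To illustrate why a few milliseconds matter, here is a rough sketch of
the usual bound for a windowed transfer: at most one window of data can
be in flight per round trip, so throughput is capped at window / RTT.
The window size and the round-trip times below are hypothetical
placeholders, not values measured in this thread.

```python
def max_throughput_bytes_per_sec(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is in flight per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Hypothetical window: 32 packets of 1400 bytes each (an assumption).
WINDOW_BYTES = 32 * 1400

# Hypothetical round-trip times: loopback-like, LAN-like, via the NAT.
for rtt_ms in (0.1, 1.0, 3.0):
    rate = max_throughput_bytes_per_sec(WINDOW_BYTES, rtt_ms / 1000.0)
    print(f"RTT {rtt_ms} ms -> at most {rate / 1e6:.1f} MB/s")
```

The point is only the shape of the relationship: tripling the round-trip
time cuts the ceiling to a third, regardless of how fast the disks are.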

> Next, I'll try to figure out how to make the server and DMZ play
> together nicely, and sustain the higher speed in WAN, too. If anybody
> has a clear vision of how to do that, I'd appreciate it. Anyhow, I
> will report back if/when I find the solution.

To be clear, I didn't think you were going to get _that_ much faster
rates from the WAN, if the latency is at least as high as it was when
the local machine was pointed at the public IP. I thought what you were
after at this point was getting the 'fast' speeds inside the LAN, while
also making the fileserver work for WAN clients.

So, I'm not sure if/how anyone else on the list may deal with this; I
myself don't really deal with fileservers behind NATs. The issue is that
the fileserver advertises the same IPs to all clients. So, you can
advertise either just the public IP, just the private IP, or both the
public and private IP, to everyone. (These are controlled by NetInfo and
NetRestrict.)
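
As a rough illustration of the "public IP only" choice, a server-side
NetInfo file might contain the single line below. The address is a
documentation placeholder, not one from this thread. The 'f' prefix is
taken from the NetInfo man page, where it forces registration of an
address that is not configured on a local interface (the NAT case), so
check it against the man page for your OpenAFS version. The file
traditionally lives at /usr/afs/local/NetInfo, but packaged installs
may put it elsewhere.

```
f 192.0.2.10
```

NetRestrict works the other way around: it lists addresses that should
not be registered, for instance the private one.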

If you advertise just the private IP, obviously clients on the WAN
cannot reach the fileserver. If you advertise both the public and
private IPs, then WAN clients may try the private IP first, fail, and
then try the public IP. This can cause unpleasant delays, and could also
cause WAN clients to contact the wrong machine when trying to reach the
fileserver.

Advertising only the public IP is the "correct" way, but as we have
seen, it can make things slower. One way I've suggested to improve that
is an iptables DNAT rule on LAN machines that rewrites the destination
of outgoing requests from the 'public' fileserver IP to the private one.
I'm not sure if anybody else is doing anything like that, or if they've
found any other way around this.
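
A minimal sketch of what such a DNAT rule could look like, written as a
small Python helper that only builds and prints the command and does not
run it. The helper name build_dnat_rule and both addresses are
hypothetical. The rule assumes it is installed on each LAN client, which
is why it goes in the OUTPUT chain of the nat table (locally generated
packets).

```python
import ipaddress
from typing import List


def build_dnat_rule(public_ip: str, private_ip: str) -> List[str]:
    """Build an iptables command that redirects traffic for public_ip to private_ip.

    Hypothetical helper for illustration; it validates both addresses and
    returns the argument list without executing anything.
    """
    public = ipaddress.ip_address(public_ip)
    private = ipaddress.ip_address(private_ip)
    return [
        "iptables", "-t", "nat", "-A", "OUTPUT",
        "-d", str(public),
        "-j", "DNAT", "--to-destination", str(private),
    ]


# Placeholder addresses: 192.0.2.10 as the 'public' fileserver IP,
# 10.0.0.10 as the private one.
rule = build_dnat_rule("192.0.2.10", "10.0.0.10")
print(" ".join(rule))
```

If the rewrite were done on a LAN router instead of on each client, the
PREROUTING chain would be the likely place for it. Either way, treat
this as untested and try it on a single client first.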

-- 
Andrew Deason
adeason@sinenomine.net