[OpenAFS] Request for testing: NATs and 1.6.6pre*

Jukka Tuominen jukka.tuominen@finndesign.fi
Fri, 20 Dec 2013 07:29:38 +0200 (EET)


Hi Andrew,

I'm running the latest stable fileserver released via

ppa.launchpad.net/openafs/stable/ubuntu

which is currently 1.6.5.1 (OS: Ubuntu 10.04 LTS)

I use a client image that works behind NAT, both inside the LAN and over
the WAN. Note, however, that it also uses the latest OpenAFS packages
provided by the above-mentioned PPA.

Even though things work nicely usability-wise (just boot and log in
graphically), I still think two-way data transfer behind the scenes could
be a bit smoother. Applications like Firefox constantly write something
to the home directory, which happens to be on the server. This sometimes
freezes the application momentarily, even though the amount of data
transferred is still modest.

If you think this is the kind of configuration you're interested in, and you
can provide a patch file that works on top of this, I could try to test it
during the weekend.


br, jukka

BTW, if anybody's interested in knowing what OpenAFS is used for here,
please see www.liitin.org


> Hi all,
>
> 1.6.6pre1 and 1.6.6pre2 contain an extra feature in the OpenAFS
> fileserver that could possibly help with communicating with clients
> behind NATs (Network Address Translation). It's not completely certain
> how much this feature helps, though, so it will be removed from the
> 1.6.6 release unless we get some more information about it.
>
> If you are running a fileserver that you believe may have some trouble
> talking to clients behind NATs, testing this feature would be very
> helpful. This is most relevant for any site that may have fileservers
> that are talking to NAT'ed clients, where the clients are old enough to
> not have the client-side NAT improvements (pre-1.6); this is most common
> at sites that have users accessing AFS from home that don't know much
> about AFS. You can test this new feature by just running a fileserver
> with 1.6.6pre* and see if anything improves; there is no additional
> configuration or anything to do.
>
> But how do you know if this is a problem for you at all? Usually the
> most user-visible symptom is that access to AFS hangs while a client is
> trying to write to AFS, but a lot of different things can cause that.
>
> To know if that is being caused _specifically_ because of problems
> reaching clients behind NATs, you can check the fileserver's FileLog. In
> there, if you see a lot of log messages talking about errors trying to
> contact specific IPs and port numbers, you may be suffering from this.
>
> In particular, it's somewhat likely to be related to NATs if you see a
> lot of such error messages logged referring to non-7001 ports. And it's
> especially likely if you see a lot of connection errors for non-7001
> ports that are obviously incrementing over time. (For example, you see
> an error for port 8005, then 8006, then 8007, etc, all from the same
> IP.)
>
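
The port check described above can be roughly sketched in Python. This is
only an illustration: it does not assume any particular FileLog message
wording and simply looks for "a.b.c.d:port" pairs anywhere in a line. The
function names and the min_run threshold are made up for this example.

```python
import re
from collections import defaultdict

# Loose pattern for an "a.b.c.d:port" pair anywhere in a log line.
# Assumption: the real FileLog wording is not modelled here at all.
ADDR_PORT = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")

AFS_CLIENT_PORT = 7001  # the port an un-NATed AFS client is normally seen on


def ports_by_ip(lines):
    """Collect the non-7001 ports seen per IP, in order of first appearance."""
    seen = defaultdict(list)
    for line in lines:
        for ip, port in ADDR_PORT.findall(line):
            port = int(port)
            if port != AFS_CLIENT_PORT and port not in seen[ip]:
                seen[ip].append(port)
    return dict(seen)


def looks_incrementing(ports, min_run=3):
    """True if ports contains min_run consecutive strictly increasing values."""
    run = 1
    for prev, cur in zip(ports, ports[1:]):
        run = run + 1 if cur > prev else 1
        if run >= min_run:
            return True
    return False


def nat_suspects(lines, min_run=3):
    """IPs whose non-7001 ports climb over time (hypothetical helper)."""
    return sorted(ip for ip, ports in ports_by_ip(lines).items()
                  if looks_incrementing(ports, min_run))
```

Feeding it the lines of a FileLog, e.g. nat_suspects(open(path)), gives a
list of IPs worth a closer look; it proves nothing by itself.
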
> It can also help to know if the IPs you see logged in FileLog are behind
> NATs in the first place. If you have no way of knowing that, you can
> sort-of detect what hosts may be behind NATs by sending the fileserver
> the SIGXCPU signal, and looking at the resulting
> /usr/afs/local/hosts.dump file. If you see an entry for a host with a
> public IP like "ip:203.0.113.40", and later on in that entry you see a
> list of IPs that include private IPs, like "[ 203.0.113.40:7001
> 192.168.1.5:7001]", that host may be behind a NAT.
>
> "Detecting" a client behind a NAT in this way is far from perfect, but
> it's just another thing to check. Common private IP ranges are of
> course 192.168/16, 172.16/12, and 10/8. A client can obviously be behind
> a NAT without an IP in any of those ranges, but those are commonly used
> by consumer-grade home routers and stuff like that.
>
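
For anyone wanting to automate the hosts.dump check, here is a minimal
Python sketch. It relies only on the two fragments quoted above
("ip:a.b.c.d" and a bracketed list of "a.b.c.d:port" items); the rest of
the entry layout is not assumed, and the function names are hypothetical.
The private ranges are spelled out as the RFC 1918 networks (10/8,
172.16/12, 192.168/16) rather than using ipaddress's is_private, since
the latter also counts documentation ranges such as 203.0.113.0/24 as
private.

```python
import ipaddress
import re

# RFC 1918 private networks, listed explicitly.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Assumption: an entry contains "ip:<addr>" and, later, "[ addr:port ... ]".
PRIMARY = re.compile(r"\bip:(\d{1,3}(?:\.\d{1,3}){3})")
ADDR_LIST = re.compile(r"\[([^\]]*)\]")
ADDR = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+")


def is_rfc1918(ip):
    """True if ip falls in 10/8, 172.16/12 or 192.168/16."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)


def maybe_behind_nat(entry):
    """True if a public primary IP is listed together with a private IP."""
    primary = PRIMARY.search(entry)
    if primary is None or is_rfc1918(primary.group(1)):
        return False
    listed = [ip for block in ADDR_LIST.findall(entry)
              for ip in ADDR.findall(block)]
    return any(is_rfc1918(ip) for ip in listed)
```

Here entry is the text of a single host entry from the dump; how to split
the file into entries is left out, since that depends on the actual file
layout.
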
>
> Anyway, if you ever look into why an OpenAFS fileserver appears to be
> slow/hanging, and the above information suggests that client NATs are an
> issue, it would be very helpful if you tried looking into some possible
> fixes. If you cannot deploy 1.6.6pre* on a server experiencing this
> issue, we can also provide patches specifically for this issue based on
> a previous stable version, if that's more feasible. There are also
> additional possible patches in this area that are not in 1.6.6pre*, if
> you want to try other approaches.
>
> Or even if you can't actually deploy any testing code, I'd still like to
> hear from you if you think you are experiencing issues in this area.
> More information is always appreciated. Remember that if we don't hear
> anything, this will be pulled out.
>
>
> For developers: obviously I'm skipping over the details of what any of
> this actually does. The 'extra feature' is gerrit 9420, which will be
> reverted via gerrit 10135. See also: gerrits 10144-10147.
>
> --
> Andrew Deason
> adeason@sinenomine.net
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>