[OpenAFS] performance and udp buffers

Dan Van Der Ster daniel.vanderster@cern.ch
Mon, 19 Nov 2012 08:13:51 +0000


On Nov 18, 2012, at 10:34 PM, Simon Wilkinson <sxw@your-file-system.com>
 wrote:

>
> On 9 Oct 2012, at 10:24, Dan Van Der Ster wrote:
>> We currently run fileservers with udpsize=2MB, and at that size we
>> have a 30 client limit in our test environment. With a buffer
>> size=8MB (increased kernel max with sysctl and fileserver option), we
>> don't see any dropped UDP packets during our client-reading stress
>> test, but still get some dropped packets if all clients write to the
>> server. With a 16MB buffer we don't see any dropped packets at all,
>> reading or writing.
>
> This was discussed in Edinburgh as part of the CERN site report (which
> I'd recommend to anyone interested in AFS server performance),
...
> Converting that number of packets into a buffer size is a bit of a
> dark art.

One of our colleagues pointed out an error in our slides showing how to
increase the max buffer size to 16MBytes with sysctl. We had published
this recipe:

    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.core.rmem_default=65536
    sysctl -w net.core.wmem_default=65536
    sysctl -w net.ipv4.tcp_rmem=4096 87380 16777216
    sysctl -w net.ipv4.tcp_wmem=4096 65536 16777216
    sysctl -w net.ipv4.tcp_mem=16777216 16777216 16777216
    sysctl -w net.ipv4.udp_mem=16777216 16777216 16777216
    sysctl -w net.ipv4.udp_rmem_min=65536
    sysctl -w net.ipv4.udp_wmem_min=65536
    sysctl -w net.ipv4.route.flush=1

The problem is that net.ipv4.tcp_mem and net.ipv4.udp_mem are (a)
system-wide total values for all buffers and (b) must be written as a
number of 4kB pages, not bytes. In fact the default values on most
systems for net.ipv4.tcp_mem and net.ipv4.udp_mem should already be
large enough for 16MByte buffers (the default is tuned to be ~75% of the
system's memory). So, the key sysctl to set to enable large receive
buffers is net.core.rmem_max.
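
In other words, the recipe can be trimmed to the lines that actually
matter for the Rx receive buffer. A corrected sketch (using our 16MByte
value; the tcp_* and *_mem lines from the original are not needed for
this, and the distribution defaults for the *_mem totals are normally
already large enough):

    # raise the per-socket receive/send buffer ceilings to 16MBytes
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    # defaults for sockets that don't request a larger buffer
    sysctl -w net.core.rmem_default=65536
    sysctl -w net.core.wmem_default=65536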

> So, setting a UDP buffer of 8Mbytes from user space is _just_ enough
> to handle 4096 incoming RX packets on a standard ethernet. However, it
> doesn't give you enough overhead to handle pings and other management
> packets. 16Mbytes should be plenty providing that you don't
...
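
Working backwards from those numbers, the kernel charges roughly 2kBytes
of receive buffer per incoming packet, so as a rule of thumb:

     8MBytes / ~2kBytes per packet = ~4096 buffered packets
    16MBytes / ~2kBytes per packet = ~8192 buffered packets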

We use 256 server threads and found experimentally in our environment
that to achieve zero packet loss we need around 12MByte buffers. So we
went with 16MB to give a little extra headroom.
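
Concretely, the combination looks roughly like the sketch below (the
fileserver path is the conventional Transarc location and the rest of
our fileserver switches are omitted; -p and -udpsize are the standard
fileserver options for thread count and UDP socket buffer size, normally
set in the bnode definition via bos):

    # kernel: allow sockets to request a 16MByte receive buffer
    sysctl -w net.core.rmem_max=16777216
    # fileserver: 256 threads, 16MByte UDP socket buffer
    /usr/afs/bin/fileserver -p 256 -udpsize 16777216 ...

    # watch for drops during the stress test: the UDP "packet receive
    # errors" / "receive buffer errors" counters should stay flat
    netstat -su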

Thanks for following up on this thread.
--
Dan van der Ster
CERN IT-DSS