[OpenAFS] iperf vs rxperf in high latency network

Jeffrey E Altman jaltman@auristor.com
Thu, 8 Aug 2019 23:06:42 -0400


Hi Simon,

response inline ...

On 8/8/2019 2:54 PM, xguan@reliancememory.com wrote:
> To make sure I captured all the explanations correctly, please allow me
> to summarize my understanding:
>
> Flow control over a high-latency, potentially congested link is a
> fundamental challenge that both TCP and UDP+Rx face. Both protocol and
> implementation can pose a problem. The reason why I did not see an
> improvement when enlarging the window size in rxperf is that firstly I
> chose too few data bytes to transfer and secondly that OpenAFS's Rx has
> some implementation limitations that become a limiting factor before
> the window size limit kicks in. They are non-trivial to fix, as
> demonstrated in the 1.5.x throughput "hiccup". But AuriStor fixed a
> significant amount of it in its proprietary Rx re-implementation.
>
> One can borrow ideas and principles from algorithm research in TCP's
> flow control to improve Rx throughput. I am not an expert on this topic,
> but I wonder if the principles in Google's BBR algorithm can help
> further improve Rx throughput, and I wonder if there is anything that
> makes TCP fundamentally superior to UDP in implementing flow control.

There is nothing specific to TCP that makes it better than RX in
implementing flow control other than the fact that TCP has more than
thirty years of active research applied to it and RX does not.
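
As a rough illustration of why latency matters to any windowed
transport, RX and TCP alike: a sender that may keep at most W bytes
unacknowledged cannot move more than W / RTT bytes per second, however
fast the link is.  The sketch below is only that arithmetic.  The
function name, the 32-packet window and the 1400-byte payload are
assumptions chosen for illustration; they are not measurements from
rxperf and are not claimed to be OpenAFS defaults.

```python
def window_limited_throughput(window_packets, payload_bytes, rtt_seconds):
    """Upper bound, in bytes per second, for a sender that may have at
    most window_packets unacknowledged packets per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_packets * payload_bytes / rtt_seconds


# Illustrative (assumed) numbers: a window of 32 packets, 1400 bytes each.
for rtt_ms in (1, 50, 200):
    bits_per_second = window_limited_throughput(32, 1400, rtt_ms / 1000.0) * 8
    print(f"RTT {rtt_ms:>3} ms: at most {bits_per_second / 1e6:.2f} Mbit/s")
```

The bound falls in proportion to the RTT, so a window that is
comfortable on a LAN can become the limiting factor on a WAN link.  It
is only an upper bound; implementation costs can limit throughput well
before the window does.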

AuriStor continues to invest in RX as we believe that RX can perform as
well as TCP while benefiting from its unique security binding
capabilities.  Reliance Memory's RRAM is targeted at IoT devices.  I
believe that RX should be the network transport of choice for IoT.

One of the requirements for implementing BBR is fine-grained, accurate
measurement of RTT, which is very hard to obtain from within a userland
implementation that relies upon an operating system's UDP sockets.
However, BBR principles can be applied to the Linux kernel's af_rxrpc
implementation and to userland implementations built to use Intel's Data
Plane Development Kit (DPDK).  I would be happy to speak with you
off-list about either.
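
To make the RTT requirement concrete, here is a minimal sketch of the
path model BBR is built around: a windowed maximum of the observed
delivery rate, a windowed minimum of the observed RTT, and their
product as the bandwidth-delay product (BDP).  This is a toy, not BBR
itself and not any RX implementation.  The class and function names,
the sample-count windows (BBR windows its filters by time) and all of
the sample values are assumptions made up for illustration.

```python
from collections import deque


class PathModel:
    """Toy path model: windowed max bandwidth and windowed min RTT."""

    def __init__(self, window=10):
        # Real BBR windows these filters by time; a sample count is a
        # simplification made for this sketch.
        self.bw_samples = deque(maxlen=window)
        self.rtt_samples = deque(maxlen=window)

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        """Record one delivery-rate sample and one RTT sample."""
        self.bw_samples.append(delivered_bytes / interval_s)
        self.rtt_samples.append(rtt_s)

    def bdp_bytes(self):
        """Estimated bandwidth-delay product, or None with no samples."""
        if not self.bw_samples:
            return None
        return max(self.bw_samples) * min(self.rtt_samples)


def bdp_for(rtt_samples, delivered_bytes=125_000, interval_s=0.1):
    """Feed identical delivery samples with the given RTTs; return the BDP."""
    model = PathModel()
    for rtt in rtt_samples:
        model.on_ack(delivered_bytes, interval_s, rtt)
    return model.bdp_bytes()


# Made-up samples for a path whose true RTT is 50 ms.
precise = [0.050, 0.051, 0.050]
# The same path with an assumed 2-5 ms of scheduling and socket delay
# added to every sample, as a userland timestamp might see it.
delayed = [0.053, 0.055, 0.052]

print(f"BDP from precise RTTs: {bdp_for(precise):.0f} bytes")
print(f"BDP from delayed RTTs: {bdp_for(delayed):.0f} bytes")
```

Because the minimum filter can never go below the smallest sample it
is given, any delay that is present in every measurement inflates the
BDP estimate, and a sender pacing to that estimate keeps more data in
flight than the path needs.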

> When it comes to deployment strategy, there may be workarounds to the
> high-latency limitation. Each of them, of course, has limitations. I
> can probably use the technique mentioned below to leverage the TCP
> throughput in RO volume synchronization,
> https://lists.openafs.org/pipermail/openafs-info/2018-August/042502.html
> and wait until DPF becomes available in vos operations:
> https://openafs-workshop.org/2019/schedule/faster-wan-volume-operations-with-dpf/

As part of AuriStor's SBIR we were funded to research RX/TCP and
implement it if appropriate.  The accepted theory was that RX/TCP would
permit RX-based applications to benefit from all of the research and
implementation improvements that TCP benefited from over the decades.
However, we quickly discovered that an RX application that implemented
both RX/TCP and current day RX/UDP could not ensure fairness for the
RX/UDP connections.  The RX/TCP flows would dominate the network at the
expense of RX/UDP flows because RX/UDP could not properly adjust to
network congestion levels.

Some people argued "good riddance, let RX/UDP die" but the reality is
that RX/UDP is where the existing user base is and it was unacceptable
to me that one class of users should be penalized in favor of another.
In order to permit TCP flows to be mixed with RX/UDP flows fairly,
RX/UDP needed fixing; and once RX/UDP was fixed there was little
justification for RX/TCP.

The same fairness issues apply to Sine Nomine Associates' DPF and prior
Out-of-Band TCP proposals.

> I can also adopt a small home volume, distributed subfolder volume
> strategy that allows home volumes to move with relocated users across
> WAN, but keep subdirectory volumes at their respective geographic
> location. Users can pick a subdirectory that is closest to their
> current location to work with. When combined with a version control
> system that uses TCP in syncing, project data synching can be
> alleviated.

AuriStor has several ideas that would be beneficial to your deployment
scenarios:

 1. floating master read/write replication.

 2. split horizon volume location service.

I would be happy to discuss both topics with you off-list.

> There is a commercial path that we can pursue with AuriStor or other
> vendors. But I guess that is out of the scope of this mail list.
>
> Any other strategies that may help?
>
> Thank you, Jeff!

You are welcome.

> Simon Guan

Jeffrey Altman

