[OpenAFS-devel] Andrew Deason's OpenAFS RX performance patches
John P Janosik
jpjanosi@us.ibm.com
Sun, 9 May 2021 19:46:06 -0500
Jeffrey E Altman <jaltman@auristor.com> wrote on 05/07/2021 04:44:24 PM:
> John,
>
> What are your observations of how this code behaves on congested links?
> I expect that the sorting of received packets distorts the ACK clock
> and packet skew measurements, reducing the ability to accurately measure
> the congestion window. Processing ACK packets in bulk is likely to
> produce a bursty transmission pattern which can result in overflowing
> the link capacity. As a result, fairness is reduced and packet loss
> might be increased.
>
> Jeffrey Altman
> AuriStor, Inc.
>

With our server hardware and the grid environment used in testing, I was
never able to get over about 7Gb/s out of the 10Gb/s connection to the
servers and wasn't seeing any packet loss/rx retransmits. I know Andrew
reported that more than that was possible in the presentation regarding
these patches, but I didn't have time to debug why our setup wasn't matching
those results. My impression was that some other sites might be running
these patches in production. Can anyone comment on whether that is the case
and, if so, whether they are able to saturate links and see the problem
described?

> On 5/6/2021 10:22 PM, John P Janosik (jpjanosi@us.ibm.com) wrote:
> > Hi Ben,
> >
> > We have been importing these patches into our IBM internal OpenAFS 1.8.X
> > builds for over a year and have had our busiest cells running these
> > versions since fall last year. We hit some deadlock issue early on but
> > that was fixed and I believe those patches made it to gerrit as well.
> >
> > I did the work to get the patches to apply to the versions of OpenAFS we
> > are running, but I don't feel confident calling it a review. I missed
> > the deadlock issue until we actually put it into production :).
> >
> > John Janosik
> > jpjanosi@us.ibm.com