[OpenAFS] vos release stops at 2^64 packets sent.

Mark Vitale mvitale@sinenomine.net
Tue, 28 Jun 2022 03:14:34 +0000


> On 27 Jun 2022, at 3:18 PM, Richard Brittain <Richard.Brittain@dartmouth.edu> wrote:
>
> I know this is a long shot, but I've got a no-quota volume of approx 6TB,
> and I'm trying to replicate it.  It appears to be going fine until the
> packetRead counter reaches 2^64 and then it stops (doesn't abort).

Are you sure it's 2^64?  The rx_call->rnext member is the source of the
packetRead counter, and it is type afs_uint32, so it should roll over at
2^32 (4294967296 packets).  As far as I can tell, nothing would change in
this regard in 1.8.x.  However, that should be plenty to move 6TB at
800-1400 bytes per packet.  So maybe something else is going on - a loop
in the dump perhaps, or something else.
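
The figures above can be checked with a little arithmetic. The following is a rough sketch, not a statement about how rx accounts for data: it assumes every packet carries a fixed payload in the quoted 800-1400 byte range, takes "approx 6TB" as 6 * 10^12 bytes, and uses hypothetical helper names (bytes_before_rollover, packets_needed) that come from neither OpenAFS nor the original message.

```python
# Back-of-the-envelope check of a 32-bit packet counter against a ~6TB dump.
# Assumptions (not from OpenAFS): fixed payload per packet, decimal terabytes.

UINT32_MODULUS = 2 ** 32  # an afs_uint32 counter wraps after this many increments


def bytes_before_rollover(payload_bytes: int) -> int:
    """Bytes carried before a 32-bit packet counter wraps, at a fixed payload size."""
    return UINT32_MODULUS * payload_bytes


def packets_needed(volume_bytes: int, payload_bytes: int) -> int:
    """Packets required to carry volume_bytes, rounded up to a whole packet."""
    return -(-volume_bytes // payload_bytes)  # ceiling division on integers


if __name__ == "__main__":
    volume = 6 * 10 ** 12  # "approx 6TB", read as decimal terabytes (assumption)
    for payload in (800, 1400):  # the per-packet range quoted in the message
        capacity = bytes_before_rollover(payload)
        cycles = packets_needed(volume, payload) / UINT32_MODULUS
        print(f"{payload} B/packet: {capacity / 1e12:.2f} TB per counter cycle, "
              f"{cycles:.2f} counter cycles for the volume")
```

Under these assumptions one pass through the counter covers roughly 3.4 TB at 800 bytes per packet and roughly 6.0 TB at 1400 bytes per packet, so how close a 6TB dump comes to the 2^32 rollover depends on the actual payload size, which is what the MTU check is meant to establish.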

Just to get things started, could you please issue the following to check
the relevant MTUs for the source and target volservers:

  $ rxdebug <sourcevolserver> 7005 -noconn -peers -onlyport 7005
  $ rxdebug <targetvolserver> 7005 -noconn -peers -onlyport 7005

> Servers are 1.6.22 (I thought I'd retire them before now, so didn't bother
> upgrading to 1.8.x).  If 1.8 might change this limit, I can upgrade, but
> I didn't find any hints in the release notes.
>
> Based on how long it ran, my guess is > 5TB was transferred.
> Is this affected by volser buffer sizes?

I don't know the answer to this off the top of my head; perhaps some others
on the list can chime in.
Or I may think of something else after I sleep on it.

Regards,
--
Mark Vitale
Sine Nomine Associates