[OpenAFS-devel] 1.8.11pre1 client hanging on Linux 6.7

Cheyenne Wills cwills@sinenomine.net
Thu, 25 Jan 2024 08:12:39 -0700


Can you run a packet capture from within the client itself?

We've seen an occasional client problem when the traffic was going
through a VPN / firewall and some of the UDP packets were getting
truncated.
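
As a rough illustration of what "truncated" means here, the sketch below compares the length field in a UDP header with the number of bytes actually present. It is only a minimal sketch under stated assumptions: the function name `udp_is_truncated` is hypothetical, the input is assumed to be the raw UDP header plus payload (with IP and link-layer headers already stripped), and the capture is assumed to have been taken with a full snapshot length so the capture tool itself does not cut packets short. The port numbers in the example are the conventional AFS cache manager (7001) and file server (7000) ports.

```python
import struct

UDP_HEADER_LEN = 8  # source port, destination port, length, checksum (2 bytes each)


def udp_is_truncated(datagram: bytes) -> bool:
    """Return True if fewer bytes are present than the UDP length field declares.

    `datagram` is assumed to start at the UDP header (hypothetical helper,
    not part of OpenAFS or any capture tool).
    """
    if len(datagram) < UDP_HEADER_LEN:
        # Not even a complete UDP header was captured.
        return True
    _src, _dst, declared_len, _csum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN]
    )
    # The UDP length field counts the header plus the payload.
    return len(datagram) < declared_len


# Example: a header that declares 28 bytes (8 header + 20 payload).
header = struct.pack("!HHHH", 7001, 7000, 28, 0)
full = header + b"\x00" * 20   # all 20 payload bytes present
cut = header + b"\x00" * 12    # 8 payload bytes missing

print(udp_is_truncated(full))
print(udp_is_truncated(cut))
```

Running a check like this on captures taken both on the client and on the far side of the VPN / firewall would show whether packets are being shortened in transit.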


-- 
Cheyenne Wills
cwills@sinenomine.net



On Wed, 24 Jan 2024 19:49:02 +0100
Michael Laß <lass@mail.upb.de> wrote:
> Thanks Cheyenne for trying to reproduce this issue. We are both using
> the exact same versions of the Linux kernel and OpenAFS, so the
> difference in behavior is quite interesting. Unfortunately, I still
> cannot really make sense of this problem. I am seeing two slightly
> different failure modes:
>
>
> 1. When trying to access my test cell, which is actually a VirtualBox
> VM running on the same machine as the client, `ls` hangs on the
> following syscall:
>
> openat(AT_FDCWD, "/afs/fritz.box",
> O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY
>
> Using Wireshark, I looked at the RX network traffic and it looks like
> this:
> https://homepages.upb.de/lass/openafs/RX_traffic_accessing_test_cell.png
>
> So it looks like the server is sending a reply to the VLDB request
> multiple times, because there is no acknowledgment from the client.
>
>
> 2. When trying to access a public cell, in this case desy.de, `ls`
> gets past the `openat` syscall and hangs within getdents64:
>
> openat(AT_FDCWD, "/afs/desy.de", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
> newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=6144, ...}, AT_EMPTY_PATH) = 0
> getdents64(3,
>
> Looking at the RX packets, the initial communication contains some
> "Ack Delay" packets:
> https://homepages.uni-paderborn.de/lass/openafs/RX_traffic_accessing_desy1.png
>
> ... and then seems to be stuck in a loop with "FS Reply"s and pings:
> https://homepages.uni-paderborn.de/lass/openafs/RX_traffic_accessing_desy2.png
>
>
> In this comparison, the server versions differ as well, which may
> contribute to the difference in communication:
>
> # fritz.box:
> % rxdebug afs.fritz.box 7000 -version
> Trying 192.168.178.230 (port 7000):
> AFS version: OpenAFS 1.8.9-1-debian 2022-12-22
>
> # desy.de:
> % rxdebug 131.169.2.111 7000 -version
> Trying 131.169.2.111 (port 7000):
> AFS version: AuriStor 2021.05 built 2023-12-19
>
> But with kth.se, which runs OpenAFS 1.8.9, I saw behavior similar to
> what I saw with desy.de.
>
>
> So far, I have not received any complaints from other Arch Linux users
> who use my packages. So this may very well be an isolated issue that
> only affects my system.
>
> Best regards,
> Michael
>
>
> On Monday, 22.01.2024 at 09:19 -0700, Cheyenne Wills wrote:
>  [...]
>