[OpenAFS-devel] OS X 10.5 oddity

Joel hashbang@gmail.com
Sun, 22 Feb 2009 00:11:38 -0600


I'm at a loss as to where to even turn with this problem. I've used
OpenAFS on OS X for years, but recently I've been having some strange
issues. At seemingly random points during AFS operations, my client
hangs and contact with the fileserver is eventually lost, then
restored soon after.

Going into detail...

I'm using the OpenAFS 1.4.8 OS X package on OS X 10.5.6.
When the client hangs, dmesg repeats the following line until the
"afs: Lost contact with file server..." message appears:
in_delayed_cksum_offset: ip_len 51200 (200) doesn't match actual length 214
which means pretty much what it says: the length recorded in the
packet doesn't match the actual on-wire length. The recorded length
always seems to be 14 bytes shy (the size of an Ethernet header,
perhaps? A UDP header is only 8 bytes). This packet never makes it
through, so I assume the client times out waiting for it and triggers
the "lost contact" state.
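
As a quick sanity check on the numbers in that dmesg line, here is a
small Python sketch. It assumes (this is an inference from the
arithmetic, not something the kernel message states) that 51200 is
simply 200 viewed in the opposite byte order, and it compares the
14-byte gap against two well-known header sizes. The helper name
swap16 is made up for illustration.

```python
import struct

def swap16(value: int) -> int:
    """Reinterpret a 16-bit value in the opposite byte order."""
    return struct.unpack("<H", struct.pack(">H", value))[0]

# Values copied from the dmesg line quoted above.
ip_len_raw = 51200    # first ip_len value printed
ip_len_shown = 200    # value printed in parentheses
actual_len = 214      # "actual length" reported by the kernel

# Well-known header sizes, used only for comparison.
ETHERNET_HEADER_LEN = 14  # 6 dst MAC + 6 src MAC + 2 EtherType
UDP_HEADER_LEN = 8

swapped = swap16(ip_len_raw)
gap = actual_len - ip_len_shown

print(f"byte-swapped ip_len: {swapped}")
print(f"gap: {gap} bytes")
print(f"matches Ethernet header size: {gap == ETHERNET_HEADER_LEN}")
print(f"matches UDP header size: {gap == UDP_HEADER_LEN}")
```

If the assumption holds, the two ip_len values are the same number in
two byte orders, and the gap lines up with a link-layer header rather
than a UDP header. That is only an observation about the sizes, not a
diagnosis.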

Also, when I watch the traffic on both the client and fileserver, the
offending packet's checksum is wrong.
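
For anyone who wants to re-check a captured packet by hand, the
following is a minimal sketch of the one's-complement Internet
checksum described in RFC 1071, as used for the IPv4 header checksum.
The post does not say whether it was the IP header checksum or the UDP
checksum that was wrong; the UDP checksum additionally covers a
pseudo-header, which this sketch does not build. The sample bytes are
an illustrative IPv4 header, not data from the capture described here.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Add each big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Illustrative 20-byte IPv4 header with the checksum field zeroed
# (hypothetical sample, NOT taken from the capture in this post).
sample_header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
computed = internet_checksum(sample_header)
print(f"computed header checksum: 0x{computed:04x}")

# With the checksum filled in, a valid header sums to zero.
filled = sample_header[:10] + computed.to_bytes(2, "big") + sample_header[12:]
print(f"verification result: {internet_checksum(filled)}")
```

Running the same function over the header bytes of an offending packet
(with the checksum field zeroed) and comparing against the value on the
wire would show whether the header checksum itself is off.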

One more data point: looking back at two weeks of logs, the packet
sizes are always one of 206, 208, 209, 210, 212, 213, 214, or 218.

If anyone can provide insight into this, I would love to hear it. I'm
willing to test or try anything, since this is an annoying problem.

Thanks, Joel