[OpenAFS] connection timeouts

Juha Jäykkä juolja@utu.fi
Mon, 22 May 2006 19:55:38 +0300

> Widereply pruned. Ick.


> > 12:36:59.013846 IP lagrange.tfy.utu.fi.afs3-callback >
> > dirac.tfy.utu.fi.afs3-fileserver:  rx data fs call op#1054934366 (44)
> So is the client sad or is tcpdump?
> There's no RPC "1054934366"

Eh..? Do you mean that either tcpdump's parser is broken or the client is

> Using -s 1500 -x would provide packet payloads. The last 32 bits of the
> abort is an error code. I bet it would be -455 (RXGEN_OPCODE), I think
> that's ffff fe38
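
For reference when reading such a dump: a minimal Python sketch of decoding the trailing error code, assuming the code is a signed 32-bit integer in network byte order and that the hex digits of the abort payload have been copied out of the tcpdump -x output by hand. The helper name abort_error_code is hypothetical, not part of OpenAFS or tcpdump. Note that -455 encodes as ffff fe39 in two's complement, so that is the value to look for; ffff fe38 would be -456.

```python
import struct

# Rx error code for an unknown opcode, as named in the quoted reply.
RXGEN_OPCODE = -455


def abort_error_code(payload_hex: str) -> int:
    """Decode the last 4 bytes of a hex payload as a signed big-endian int.

    payload_hex holds only hex digits (spaces allowed), i.e. the tcpdump
    offset prefixes such as "0x0010:" must already be stripped.
    """
    raw = bytes.fromhex(payload_hex)  # spaces between byte groups are ignored
    if len(raw) < 4:
        raise ValueError("need at least 4 bytes of payload")
    # "!i" = network byte order (big-endian), signed 32-bit integer
    return struct.unpack("!i", raw[-4:])[0]


if __name__ == "__main__":
    print(abort_error_code("ffff fe39") == RXGEN_OPCODE)
    # Going the other way: the bytes to expect for a given error code.
    print(struct.pack("!i", RXGEN_OPCODE).hex())
```

Only the final four bytes are examined, so the whole payload can be pasted in without trimming the Rx header first.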

I'll do that next time: after 9 hours of misbehaving, the client started
working again at 14:40:07 *on its own* (i.e. just as if there had
actually been a network outage - except there never was).

> >    call 0: # 182, state precall, mode: error
> Ok, lagrange.tfy.utu.fi is and that connection is mode
> error.

Which means exactly what? And how do I fix it (the next time this

Another curious thing here is that every time this has occurred it's been
with the same fileserver, never any of the others! The server in question
is serving all our RW $HOMEs, so it's probably also the most heavily used.
Should I try redistributing the home volumes across other servers?


