[OpenAFS] Clients are blocked with error code -3 of RXAFSCB_ProbeUuid

Benjamin Kaduk kaduk@mit.edu
Mon, 27 Apr 2020 15:29:58 -0700


On Mon, Apr 27, 2020 at 09:16:14AM +0800, huangql wrote:
> Hello All,
> 
> 
> We found that some clients are blocked. No operations under /afs, such as "cd" or "ls", work any more; all of them hang.
> 
> On the server side, the log messages show error code -3:
> 
> 
> Mon Apr 27 08:00:34 2020 CheckHost_r: Probing all interfaces of host 192.168.63.194:7001 failed, code -3
> Mon Apr 27 08:07:37 2020 CheckHost_r: Probing all interfaces of host 192.168.63.219:7001 failed, code -3
> 
> Restarting the AFS service did not restore access to /afs; only restarting the client nodes did.
> 
> Has anyone seen similar cases? Any suggestions would be appreciated. Thanks.

That's an interesting error code to be seeing;
https://www.central.org/pages/numbers/errors.html shows -3 as
RX_CALL_TIMEOUT, which does not seem to match your description of the
issue.  A brief glance at the code indicates that we can also generate this
error locally if our clock is moving backwards a lot.
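
For triage, here is a rough sketch (in Python) that pulls the host, port, and
code out of log lines like the ones quoted and attaches a symbolic name to the
code. Only the -3 mapping comes from the error table linked above; the other
table entries are my assumption from rx.h and should be checked against your
source tree. The names RX_ERRORS, PROBE_FAILURE, and parse_probe_failures are
made up for this example and are not part of OpenAFS.

```python
import re

# Rx-level error codes. Only -3 is confirmed by the error table linked above;
# the remaining entries are assumed from rx.h and should be verified.
RX_ERRORS = {
    -1: "RX_CALL_DEAD",
    -2: "RX_INVALID_OPERATION",
    -3: "RX_CALL_TIMEOUT",
    -4: "RX_EOF",
    -5: "RX_PROTOCOL_ERROR",
    -6: "RX_USER_ABORT",
}

# Matches the "CheckHost_r" probe-failure lines quoted in the report.
PROBE_FAILURE = re.compile(
    r"CheckHost_r: Probing all interfaces of host "
    r"(?P<host>\d+\.\d+\.\d+\.\d+):(?P<port>\d+) failed, code (?P<code>-?\d+)"
)


def parse_probe_failures(lines):
    """Return a list of (host, port, code, name) for each matching log line."""
    results = []
    for line in lines:
        m = PROBE_FAILURE.search(line)
        if m is None:
            continue  # not a probe-failure line
        code = int(m.group("code"))
        name = RX_ERRORS.get(code, "unknown")
        results.append((m.group("host"), int(m.group("port")), code, name))
    return results


sample = [
    "Mon Apr 27 08:00:34 2020 CheckHost_r: Probing all interfaces of host "
    "192.168.63.194:7001 failed, code -3",
    "Mon Apr 27 08:07:37 2020 CheckHost_r: Probing all interfaces of host "
    "192.168.63.219:7001 failed, code -3",
]

for host, port, code, name in parse_probe_failures(sample):
    print(f"{host}:{port} code {code} ({name})")
```

Running this over a whole server log would show whether the failures are
confined to a few client addresses or spread across many, which may help
narrow down where the timeouts originate.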

I don't expect the above to be helpful, and don't recall any similar cases,
but figured it is better to reply with what little I know than to leave
your message with no reply.

-Ben