[OpenAFS] aklog and AFS DB server timeouts

RL rainer.laatsch@t-online.de
Fri, 29 Jan 2021 20:38:41 +0100


On the affected clients, are all three DB servers listed with their full
names in /etc/hosts? Otherwise failure is to be expected, since

   192.168.*.*
is a private address range that public DNS never resolves.
Regards, R.
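
A quick way to check the /etc/hosts question is sketched below. It is only an illustration: the helper name missing_from_hosts is made up here, and the three host names are taken from the comments in the quoted CellServDB excerpt, on the assumption that they are the real server names.

```python
def missing_from_hosts(hosts_text, names):
    """Return the names that have no entry in hosts-format text."""
    known = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the address; the rest are the canonical name and aliases.
        known.update(name.lower() for name in fields[1:])
    return [n for n in names if n.lower() not in known]


# Assumed names, copied from the comments in the quoted CellServDB excerpt.
AFSDB_NAMES = ["afsdb1.example.com", "afsdb2.example.com", "afsdb3.example.com"]

# Usage (on a client): pass the contents of /etc/hosts, e.g.
#   missing_from_hosts(open("/etc/hosts").read(), AFSDB_NAMES)
```

An empty list would mean all three names are present; anything else names the entries still to be added.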

------------------------------------------------------------------------------------------------------------

On 1/29/21 7:32 PM, A. Lewenberg wrote:
> On our buster servers the OpenAFS client (1.8.2) has an issue with 
> provisioning an AFS token. When I attempt to get an AFS token it very 
> often takes a long time.
>
> $ aklog (this can take up to 30 seconds or more)
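
To put numbers on "30 seconds or more", the delay could be measured over several runs. The following is a minimal sketch, not a tested diagnostic: time_command is a hypothetical helper, and it assumes aklog is on the PATH and that a valid Kerberos ticket already exists.

```python
import subprocess
import time


def time_command(argv, runs=5):
    """Run argv `runs` times; return a list of (elapsed_seconds, returncode)."""
    results = []
    for _ in range(runs):
        start = time.monotonic()
        proc = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        results.append((time.monotonic() - start, proc.returncode))
    return results


# Usage (assumes aklog is installed and a ticket is available):
#   for elapsed, rc in time_command(["aklog"]):
#       print(f"{elapsed:6.2f}s  exit={rc}")
```

Comparing the timings against a packet capture taken at the same moment would show which server order goes with which delay.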
>
> After some investigation it looks like aklog is trying the AFS DB 
> servers listed in /etc/openafs/CellServDB and timing out on some of 
> them. Here are the relevant contents of that file:
>
> >example.com           # My Company
> 192.168.1.102                    #afsdb1.example.com
> 192.168.1.104                    #afsdb2.example.com
> 192.168.1.106                    #afsdb3.example.com
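
For reference, the layout of that file (a ">cellname" line followed by "address #hostname" lines) can be read with a few lines of Python. This is a sketch of the format as shown in the excerpt only; parse_cellservdb is a made-up name, and the code makes no claim about how the OpenAFS client itself parses the file.

```python
def parse_cellservdb(text):
    """Map each cell name to a list of (address, hostname) pairs."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # ">cellname   # description" starts a new cell.
            current = line[1:].split("#", 1)[0].strip()
            cells[current] = []
        elif current is not None:
            # "address   #hostname" belongs to the most recent cell.
            address, _, comment = line.partition("#")
            cells[current].append((address.strip(), comment.strip()))
    return cells


# The excerpt quoted above, without the mail quoting prefix.
SAMPLE = """\
>example.com           # My Company
192.168.1.102                    #afsdb1.example.com
192.168.1.104                    #afsdb2.example.com
192.168.1.106                    #afsdb3.example.com
"""
```

Running parse_cellservdb on the file from a working and a slow client makes it easy to diff the two server lists.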
>
> Running aklog and sniffing the network, I see that the client attempts 
> to contact one of the three afsdb servers. If the one it chooses to 
> contact first is afsdb2 or afsdb3, the connection does not succeed, and 
> the client eventually gives up and tries another one. If the second one 
> it tries is also afsdb2 or afsdb3, it gives up again and tries the only 
> remaining one: afsdb1. In other words:
>
> afsdb3 (fail), afsdb2 (fail), afsdb1 (succeeds)
> afsdb2 (fail), afsdb3 (fail), afsdb1 (succeeds)
> afsdb3 (fail), afsdb1 (succeeds)
> afsdb2 (fail), afsdb1 (succeeds)
> afsdb1 (succeeds)
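
The five sequences above are exactly what one gets from the six possible orderings of three servers when only afsdb1 answers. The sketch below enumerates them; the function names are invented for illustration, and nothing here says how the client actually picks its order.

```python
from itertools import permutations

SERVERS = ["afsdb1", "afsdb2", "afsdb3"]
WORKING = {"afsdb1"}  # as observed with the full three-server file


def attempts_until_success(order, working=WORKING):
    """Return the servers tried, in order, up to the first one that works."""
    tried = []
    for server in order:
        tried.append(server)
        if server in working:
            break
    return tried


def observed_sequences(servers=SERVERS):
    """Distinct attempt sequences over all orderings, shortest first."""
    seqs = {tuple(attempts_until_success(p)) for p in permutations(servers)}
    return sorted(seqs, key=lambda s: (len(s), s))


def mean_failures(servers=SERVERS):
    """Average number of failed attempts, ASSUMING every ordering is equally likely."""
    orders = list(permutations(servers))
    failed = sum(len(attempts_until_success(p)) - 1 for p in orders)
    return failed / len(orders)
```

Both orderings that start with afsdb1 collapse into the single-attempt case, which is why six orderings give five sequences. Under the equal-likelihood assumption, the average is one timed-out server per aklog run.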
>
> This sounds like both afsdb2 and afsdb3 are simply not working. 
> However...
>
> If I remove afsdb1 and afsdb2 from CellServDB, leaving only afsdb3, 
> it works instantly every time! That is, the following CellServDB works 
> without delay:
>
> >ir.example.com           # My Company
> 192.168.1.106                    #afsdb3.example.com
>
> Similarly, if afsdb2 is the only entry in CellServDB, running aklog 
> works without delay. So it cannot be that afsdb2 and afsdb3 are 
> completely broken.
>
> The AFS DB servers are running OpenAFS version 1.6.9.
>
> What the heck is going on?