[OpenAFS-devel] find_preferred_connection: no connection and !create
Benjamin Kaduk
kaduk@mit.edu
Mon, 19 Mar 2018 04:27:02 -0500
On Mon, Mar 19, 2018 at 11:14:13AM +1100, Ian Wienand wrote:
> Hello,
>
> With 1.8.0~pre5 we occasionally get
>
> [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 10:00:07 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 12:00:06 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 14:00:07 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 16:42:15 2018] find_preferred_connection: no connection and !create
> [Fri Mar 16 18:21:58 2018] find_preferred_connection: no connection and !create
>
> in the kernel logs. You can see from [1] it's usually around the top
> of the hour when mirroring processes start; but not always. I've had
> a look at [2] ... there doesn't seem to be anything obviously tunable
> about this? Is it something we should worry about?
I think the message is harmless, and it should probably be removed.
It looks like the only place where we call
find_preferred_connection() with create == 0 is within
afs_ConnByHost(), where we first check if there's an existing
connection to reuse, and if not, we create one.  So this message
just tells us that no cached connection was available for reuse and
a new one had to be created, which is mostly of interest only to a
developer working on the code.
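
To illustrate the lookup-then-create pattern described above, here is a
minimal sketch in C. It is not the actual OpenAFS code: the names
find_conn(), conn_by_host(), struct conn, and lookup_misses are all
hypothetical stand-ins, and the real functions take different arguments
and deal with locking, users, and ports. It only shows why a miss with
create == 0 is an expected event rather than an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached connection; not the OpenAFS struct. */
struct conn {
    int host_id;
    struct conn *next;
};

static struct conn *conn_cache = NULL;  /* simple linked-list cache */
static int lookup_misses = 0;           /* counts lookups that found nothing */

/*
 * Return the cached connection for host_id.  If none exists, create one
 * only when 'create' is nonzero; otherwise record the miss and return NULL.
 */
static struct conn *
find_conn(int host_id, int create)
{
    struct conn *c;

    for (c = conn_cache; c != NULL; c = c->next) {
        if (c->host_id == host_id)
            return c;
    }
    if (!create) {
        /* This is the point where the logged message would be emitted:
         * a cache miss on a lookup-only call, not a failure. */
        lookup_misses++;
        return NULL;
    }
    c = malloc(sizeof(*c));
    assert(c != NULL);
    c->host_id = host_id;
    c->next = conn_cache;
    conn_cache = c;
    return c;
}

/* Caller: first try to reuse a connection, then fall back to creating one. */
static struct conn *
conn_by_host(int host_id)
{
    struct conn *c = find_conn(host_id, 0);

    if (c == NULL)
        c = find_conn(host_id, 1);
    return c;
}
```

In this sketch the first conn_by_host() call for a given host always
records one miss before creating the connection, and later calls for the
same host reuse it silently.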
> ---
>
> For background ... in OpenStack we have based our mirroring
> infrastructure off AFS. We have a single host that updates from
> various upstream mirrors to RW volumes then releases them; mirror
> hosts in various remote clouds then serve the volumes via apache to
> local nodes in their own cloud.
>
> Unfortunately this mirror updater has been very unstable lately. In
> particular, we use "reprepro" to mirror deb-based repositories like
> Debian, Ubuntu, Ubuntu Ports, etc. and its on-disk databases are very
> sensitive to corruption of files; when it does happen, recovering or
> remirroring these big repos is not fun (others we just rsync, which is
> much more tolerant to failures).
>
> We were previously running Trusty on this host, which would be openafs
> 1.6.7 [3]. We'd fairly regularly see things like:
>
> afs: Lost contact with file server 104.130.138.161 in cell openstack.org (code -512) (all multi-homed ip addresses down for the server)
> afs: failed to store file (110)
>
> and at the fs level we'd end up with files not written or corruption.
>
> Anyway, it didn't seem worth spending time on such old code; we have
> upgraded the host to Xenial now, and are using a backport of the
> bionic 1.8.0~pre5 packages in a PPA [4]. This is so far working well,
> modulo the warning above.
That's great feedback to hear; thanks!
-Ben