[OpenAFS-devel] find_preferred_connection: no connection and !create

Ian Wienand iwienand@redhat.com
Mon, 19 Mar 2018 11:14:13 +1100


Hello,

With 1.8.0~pre5 we occasionally get

 [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 10:00:07 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 12:00:06 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 14:00:07 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 16:42:15 2018] find_preferred_connection: no connection and !create
 [Fri Mar 16 18:21:58 2018] find_preferred_connection: no connection and !create

in the kernel logs.  As you can see from [1], it is usually around the
top of the hour, when the mirroring processes start, but not always.
I've had a look at [2], but there doesn't seem to be anything obviously
tunable here.  Is this something we should worry about?

---

For background: in OpenStack we have based our mirroring
infrastructure on AFS.  A single host updates RW volumes from various
upstream mirrors and then releases them; mirror hosts in various remote
clouds then serve the volumes via Apache to local nodes in their own
cloud.

Unfortunately this mirror updater has been very unstable lately.  In
particular, we use "reprepro" to mirror deb-based repositories like
Debian, Ubuntu, Ubuntu Ports, etc., and its on-disk databases are very
sensitive to file corruption; when that does happen, recovering or
remirroring these big repos is not fun (the others we just rsync, which
is much more tolerant of failures).

We were previously running Trusty on this host, which would be openafs
1.6.7 [3].  We'd fairly regularly see things like:

 afs: Lost contact with file server 104.130.138.161 in cell openstack.org (code -512) (all multi-homed ip addresses down for the server)
 afs: failed to store file (110)

and at the filesystem level we'd end up with unwritten or corrupted files.

Anyway, it didn't seem worth spending time on such old code; we have
upgraded the host to Xenial now, and are using a backport of the
bionic 1.8.0~pre5 packages in a PPA [4].  This is so far working well,
modulo the warning above.

Thanks,

-i

[1] http://paste.openstack.org/show/703825/
[2] https://github.com/openafs/openafs/blob/master/src/afs/afs_conn.c#L80
[3] https://packages.ubuntu.com/trusty/net/openafs-client
[4] https://launchpad.net/~openstack-ci-core/+archive/ubuntu/openafs-1.8-xenial