[OpenAFS-devel] Problem with mounts in AFS on CentOS 7.4 with openafs 1.6.2[01].1

Ragnar Sundblad ragge@csc.kth.se
Fri, 3 Nov 2017 14:46:38 +0100


Hi all,

We have compute clusters where the nodes have almost everything of their
roots in afs; most things in /, such as /etc and /usr, are soft links into
a complete OS installation in afs. To be able to have some writable files
and directories, such as /etc/adjtime or /var/tmp, we bind mount files
and directories into the tree which is actually in afs (mainly using the
rwtab functionality). A lustre client also gets mounted in the afs tree.

After we upgraded from CentOS 7.3 to 7.4 (kernel
3.10.0-693.5.2.el7.x86_64), using OpenAFS client 1.6.21.1 or 1.6.20.1,
mounts in the afs tree start to get randomly unmounted when users who
have home directories in afs log in and start accessing their data. In
the lustre case, the lustre client nicely reports that it unmounts, so
the unmounts seem to be handled in an orderly manner.
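
For anyone who wants to watch for the same symptom, here is a minimal
sketch in Python that polls the mount table and reports mount points
that disappear. It is not something we run in production. It assumes the
affected mounts show up in /proc/self/mounts with paths under /afs; the
function names, the prefix, and the polling interval are made up for
illustration.

```python
import time


def parse_mounts(text, prefix="/afs"):
    """Return {mount_point: fstype} for mounts at or below prefix.

    `text` is expected to be in /proc/self/mounts format:
    device, mount point, fstype, options, dump, pass.
    """
    below = prefix.rstrip("/") + "/"
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # The kernel escapes spaces in mount paths as \040.
        mount_point = fields[1].replace("\\040", " ")
        if mount_point == prefix or mount_point.startswith(below):
            mounts[mount_point] = fields[2]
    return mounts


def vanished(before, after):
    """Return the sorted mount points present in `before` but not in `after`."""
    return sorted(set(before) - set(after))


def watch(prefix="/afs", interval=5.0, path="/proc/self/mounts"):
    """Poll the mount table forever and print mounts that disappear."""
    with open(path) as f:
        previous = parse_mounts(f.read(), prefix)
    while True:
        time.sleep(interval)
        with open(path) as f:
            current = parse_mounts(f.read(), prefix)
        for mount_point in vanished(previous, current):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "unmounted:", mount_point, previous[mount_point])
        previous = current
```

Calling watch() on a node while users log in should print a line for each
mount that goes away, which may help correlate the unmounts with other
log entries.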

We suspect this may be related to the problem reported in the thread
“getcwd() error for RHEL 7.4 kernel”, and that the kernel for some
reason decides that the path to the mount point is no good and unmounts
it.
In addition, once this has started to happen, we are not able to mount
anything more into afs; mount returns ENOENT.

This is pretty easy to repeat.

Our workaround for now is to use the tmpfs-based root all the way down
to the mount points, and to have soft links into afs further down for
the rest, which seems to work.

Please let us know if we can provide any help debugging this.


/ragge

PDC Center for High Performance Computing, KTH Royal Institute of
Technology, Stockholm, Sweden