[OpenAFS-devel] openafs hangs on shutdown with selinux (caused by callback expiration via umount?)
Derrick Brashear
shadow@gmail.com
Wed, 2 Jan 2008 17:48:09 -0500
Actually, I bet if you add logging you'll see it hangs on shutting down the
rx listener.
On Jan 2, 2008 5:45 PM, Christopher Allen Wing <wingc@umich.edu> wrote:
> Hello,
>
> We've noticed occasional hangs at shutdown when running OpenAFS 1.4.x on
> RHEL5 systems; this seems to be caused by the fact that SELinux is
> restricting network I/O from within the kernel when performed in the
> context of the actual 'umount' process.
>
> I believe this is what's going on:
>
>
> 1. system shutdown is started
>
> 2. AFS init script is called to 'stop'
>
> 3. /etc/rc.d/init.d/openafs-client does 'umount /afs'
>
> 4. umount process is run in 'mount_t' SELinux context
>
> 5. umount process does umount() system call
>
> 6. kernel code flow runs along these lines:
>
>       sys_umount("/afs")
>
>           ...
>
>       mntput(struct vfsmount * corresponding to /afs)
>
>           ...
>
>       deactivate_super(struct super_block * corresponding to /afs)
>
>           ...
>
>       generic_shutdown_super(struct super_block * corresponding to /afs)
>
>           ...
>
>       (struct super_block * for /afs)->put_super()
>
>       afs_shutdown()
>
> 7. Something called by afs_shutdown() attempts to do network I/O,
> I'm guessing perhaps this is expiring open callbacks on the
> fileservers?
>
> 8. SELinux blocks this network I/O because a process running in
> mount_t security context is not permitted to do so according to
> the RHEL5 security policy.
>
>
> We see a large number of SELinux permission denied messages in the kernel
> log like this:
>
> audit(1199237877.841:1837): avc: denied { write } for pid=29174
> comm="umount" lport=7001 scontext=system_u:system_r:mount_t:s0
> tcontext=system_u:system_r:initrc_t:s0 tclass=udp_socket
>
>
> from which I infer that there is some code in AFS which wants to send out
> packets from the AFS client port (7001) during afs_shutdown(). At this
> point I have not yet gotten a stack trace to see what part of the openafs
> module is actually doing this.
>
> I believe that SELinux attributes network traffic to the process context
> in which it originates; thus, since the 'umount' process is executing in
> the kernel (inside the umount() system call), the SELinux policy for mount
> gets used.
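
To make the denial quoted above easier to read, here is a minimal sketch that pulls the key=value fields out of that AVC line. The helper name avc_field is made up for illustration; the sample line is the one from the report. It assumes each field is a single space-free token, which holds for this message but may not for every audit record.

```shell
#!/bin/sh
# avc_field LINE KEY
# Print the value of the first KEY=value token in an AVC audit line,
# with any surrounding double quotes stripped. (Hypothetical helper.)
avc_field() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p" | tr -d '"' | head -n 1
}

# The denial exactly as quoted in the report.
line='audit(1199237877.841:1837): avc: denied { write } for pid=29174 comm="umount" lport=7001 scontext=system_u:system_r:mount_t:s0 tcontext=system_u:system_r:initrc_t:s0 tclass=udp_socket'

for key in comm lport scontext tcontext tclass; do
    printf '%s=%s\n' "$key" "$(avc_field "$line" "$key")"
done

# The SELinux type (domain) is the third colon-separated part of a context.
printf 'source domain: %s\n' "$(avc_field "$line" scontext | cut -d: -f3)"
```

Read this way, the line says that a process named umount, running in the mount_t domain, was denied write on a udp_socket bound to local port 7001 and labeled initrc_t.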
>
>
>
> Has anyone else seen this hang?
>
>
> For the time being I hacked around this by forcing the 'umount' in the
> openafs-client init script to run in an unrestricted SELinux security
> context like this:
>
> ----- /etc/rc.d/init.d/openafs-client ---
>
> stop() {
> ...
> ...
> runcon system_u:system_r:unconfined_t -- /bin/umount /afs
>
>
>
>
> I was not able to reproduce the problem in brief testing, though; it
> seemed to be associated with trying to reboot a RHEL5 host after it had
> been up (and accessing afs) for a day or similar length of time. I
> couldn't get the hang to occur by just booting a machine, using AFS,
> trying to reboot, etc. (I did try some things like pulling the network
> cable prior to shutdown, to see if I could somehow make the client act
> differently.)
>
> Since I modified the init script as above I have not seen the problem
> recur.
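
If the runcon workaround above were to go into a shared init script, it would probably need to fall back to a plain umount on hosts without SELinux. The following is only a sketch under that assumption: afs_umount_cmd is a hypothetical name, the security context is the one from the workaround, and the function prints the command it would use rather than running it, so it can be checked before being wired into stop().

```shell
#!/bin/sh
# afs_umount_cmd [MOUNTPOINT]
# Print the umount command line the init script could use.
# Hypothetical helper; nothing is actually unmounted here.
afs_umount_cmd() {
    mnt="${1:-/afs}"
    if command -v selinuxenabled >/dev/null 2>&1 \
        && selinuxenabled \
        && command -v runcon >/dev/null 2>&1; then
        # SELinux is enabled and runcon exists: use the context from the
        # workaround quoted above.
        echo "runcon system_u:system_r:unconfined_t -- /bin/umount $mnt"
    else
        # No SELinux (or no runcon): plain umount.
        echo "/bin/umount $mnt"
    fi
}

afs_umount_cmd /afs
```

Whether the caller is allowed to transition into unconfined_t depends on the local policy, so this has the same limits as the original hack.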
>
>
> I don't have a recommendation for a resolution to this problem at present,
> so I'm asking for ideas from others who might be running OpenAFS in a
> SELinux environment.
>
>
> Thoughts:
>
>
> 1. What does mount/umount do for NFS (with regard to SELinux)?
>
> 2. One solution would be the above hack (basically disable SELinux
> protection for the 'umount' command when run via the openafs
> init script).
>
> 3. Another solution would be to modify the SELinux policy when/if
> necessary (on all RHEL5, suitable Fedora Core releases, etc.).
> This would be a more involved change to the existing OpenAFS
> packaging.
>
> 4. Otherwise we might modify the openafs kernel code so that it
> does not attempt I/O from within afs_shutdown(); i.e., do it
> from within one of the AFS kernel daemons instead.
>
> I have no idea how feasible / desirable this approach would be.
> Again, what do NFS/CIFS/etc. do here?
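
For option 3, one way to prototype a policy change without touching the distribution policy is a local module generated from the logged denials with audit2allow and loaded with semodule. This is a sketch, not a tested fix: the module name and log path are placeholders, the denials may be in /var/log/messages instead if auditd is not running, and the generated rule would likely let mount_t write to initrc_t UDP sockets in general, which may be broader than wanted.

```shell
#!/bin/sh
# Sketch only: build and load a local SELinux policy module from denials.
# MODULE and LOG are placeholders, not taken from the original report.
MODULE="${MODULE:-localafsumount}"
LOG="${LOG:-/var/log/audit/audit.log}"

build_local_policy() {
    # Feed only the umount denials to audit2allow; -M writes
    # $MODULE.te (source) and $MODULE.pp (compiled module) in the cwd.
    grep 'comm="umount"' "$LOG" | audit2allow -M "$MODULE"
}

load_local_policy() {
    # Review $MODULE.te first; this changes the running system policy.
    semodule -i "$MODULE.pp"
}
```

Neither function is called here; they would be run by hand as root on a test machine, build_local_policy first and load_local_policy only after reading the generated .te file.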
>
>
>
> Thanks a lot,
>
> Chris Wing
> wingc@umich.edu
> _______________________________________________
> OpenAFS-devel mailing list
> OpenAFS-devel@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-devel
>