[OpenAFS-devel] openafs hangs on shutdown with selinux (caused by callback expiration via umount?)

Derrick Brashear shadow@gmail.com
Wed, 2 Jan 2008 17:48:09 -0500



Actually, I bet if you add logging you'll see that it hangs while shutting
down the rx listener.

On Jan 2, 2008 5:45 PM, Christopher Allen Wing <wingc@umich.edu> wrote:

> Hello,
>
> We've noticed occasional hangs at shutdown when running OpenAFS 1.4.x on
> RHEL5 systems; this seems to be caused by the fact that SELinux is
> restricting network I/O from within the kernel when performed in the
> context of the actual 'umount' process.
>
> I believe this is what's going on:
>
>
>        1. system shutdown is started
>
>        2. AFS init script is called to 'stop'
>
>        3. /etc/rc.d/init.d/openafs-client does 'umount /afs'
>
>        4. umount process is run in 'mount_t' SELinux context
>
>        5. umount process does umount() system call
>
>        6. kernel code flow runs along these lines:
>
>                sys_umount("/afs")
>
>                ...
>
>                mntput(struct vfsmount * corresponding to /afs)
>
>                ...
>
>                deactivate_super(struct super_block * corresponding to /afs)
>
>                ...
>
>                generic_shutdown_super(struct super_block * corresponding to /afs)
>
>                ...
>
>                (struct super_block * for /afs)->put_super()
>
>                afs_shutdown()
>
>        7. Something called by afs_shutdown() attempts to do network I/O,
>           I'm guessing perhaps this is expiring open callbacks on the
>           fileservers?
>
>        8. SELinux blocks this network I/O because a process running in
>           mount_t security context is not permitted to do so according to
>           the RHEL5 security policy.
>
>
> We see a large number of SELinux permission denied messages in the kernel
> log like this:
>
>        audit(1199237877.841:1837): avc:  denied  { write } for  pid=29174 comm="umount" lport=7001 scontext=system_u:system_r:mount_t:s0 tcontext=system_u:system_r:initrc_t:s0 tclass=udp_socket
>
>
> from which I infer that there is some code in AFS which wants to send out
> packets from the AFS client port (7001) during afs_shutdown().  At this
> point I have not yet gotten a stack trace to see what part of the openafs
> module is actually doing this.
>
> I believe that SELinux accounts network traffic to the process context in
> which it originates; thus, since the 'umount' process is executing in the
> kernel (inside the umount() system call), the SELinux policy for mount_t
> gets used.
>
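
For anyone who wants to tally these denials rather than read them by eye, here is a minimal sketch in Python that splits a line like the one quoted above into its fields. The function name and the derived "scontext_type" / "tcontext_type" keys are my own, not anything from the audit tools, and the regular expression only assumes the key=value layout visible in that one sample line.

```python
import re

# Matches the "avc: denied { perms } for key=value ..." portion of an audit line.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)"
)

# Sample taken from the denial quoted in this thread.
SAMPLE = (
    'audit(1199237877.841:1837): avc:  denied  { write } for  pid=29174 '
    'comm="umount" lport=7001 scontext=system_u:system_r:mount_t:s0 '
    'tcontext=system_u:system_r:initrc_t:s0 tclass=udp_socket'
)


def parse_avc_denial(line):
    """Return a dict of fields for an 'avc: denied' line, or None if no match."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    fields = {"perms": match.group("perms").split()}
    # The remainder is a run of key=value pairs; values may be double-quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', match.group("rest")):
        fields[key] = value.strip('"')
    # A context reads user:role:type[:level]; keep the type (third part) handy.
    for ctx in ("scontext", "tcontext"):
        parts = fields.get(ctx, "").split(":")
        if len(parts) >= 3:
            fields[ctx + "_type"] = parts[2]
    return fields


if __name__ == "__main__":
    for key, value in sorted(parse_avc_denial(SAMPLE).items()):
        print(key, "=", value)
```

For the sample line, the source type comes out as mount_t and the target class as udp_socket on local port 7001, which is the combination the analysis above rests on.
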
>
>
> Has anyone else seen this hang?
>
>
> For the time being I hacked around this by forcing the 'umount' in the
> openafs-client init script to run in an unrestricted SELinux security
> context like this:
>
>        ----- /etc/rc.d/init.d/openafs-client ---
>
>        stop() {
>                ...
>                ...
>                runcon system_u:system_r:unconfined_t -- /bin/umount /afs
>
>
>
>
> I was not able to reproduce the problem in brief testing, though; it
> seemed to be associated with trying to reboot a RHEL5 host after it had
> been up (and accessing afs) for a day or similar length of time.  I
> couldn't get the hang to occur by just booting a machine, using AFS,
> trying to reboot, etc.  (I did try some things like pulling the network
> cable prior to shutdown, to see if I could somehow make the client act
> differently.)
>
> Since I modified the init script as above I have not seen the problem
> recur.
>
>
> I don't have a recommendation for a resolution to this problem at present,
> so I'm asking for ideas from others who might be running OpenAFS in a
> SELinux environment.
>
>
> Thoughts:
>
>
>        1. What does mount/umount do for NFS (with regard to SELinux)?
>
>        2. One solution would be the above hack (basically disable SELinux
>           protection for the 'umount' command when run via the openafs
>           init script).
>
>        3. Another solution would be to modify the SELinux policy when/if
>           necessary (on all RHEL5, suitable Fedora Core releases, etc.).
>           This would be a more involved change to the existing OpenAFS
>           packaging.
>
>        4. Otherwise we might modify the openafs kernel code so that it
>           does not attempt I/O from within afs_shutdown(); i.e., do it
>           from within one of the AFS kernel daemons instead.
>
>           I have no idea how feasible / desirable this approach would be.
>           Again, what does NFS/cifs/etc do here?
>
>
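
To make option 3 a bit more concrete, here is a rough sketch in Python of the kind of local policy module source that would correspond to the single denial quoted earlier (mount_t writing to an initrc_t udp_socket). The module name "openafs_umount_local" is hypothetical, the helper names are mine, and whether this one rule is sufficient, or even the right thing to allow, is an assumption that nobody in this thread has tested. In practice the module source would be generated from the real audit log with audit2allow rather than written by hand.

```python
def format_perms(perms):
    """Render a permission set the way policy source expects it."""
    unique = sorted(set(perms))
    if not unique:
        raise ValueError("at least one permission is required")
    if len(unique) == 1:
        return unique[0]
    return "{ " + " ".join(unique) + " }"


def policy_module(name, source_type, target_type, tclass, perms):
    """Build the text of a small local policy module with one allow rule."""
    perm_text = format_perms(perms)
    lines = ["module %s 1.0;" % name, "", "require {"]
    # Every type and class used by the rule must be declared in 'require'.
    for type_name in sorted({source_type, target_type}):
        lines.append("    type %s;" % type_name)
    lines.append("    class %s %s;" % (tclass, perm_text))
    lines.append("}")
    lines.append("")
    lines.append(
        "allow %s %s:%s %s;" % (source_type, target_type, tclass, perm_text)
    )
    return "\n".join(lines) + "\n"


# Values taken from the denial quoted in this thread; the name is hypothetical.
MODULE_TEXT = policy_module(
    "openafs_umount_local", "mount_t", "initrc_t", "udp_socket", ["write"]
)

if __name__ == "__main__":
    print(MODULE_TEXT, end="")
```

This only produces the module source text. Compiling and loading it, and deciding whether the packaging should ship such a module at all, are left out of the sketch.
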
>
> Thanks a lot,
>
> Chris Wing
> wingc@umich.edu
