[OpenAFS] best practice for a service to access a user AFS token? and why ruid instead of euid?

Thu, 17 Nov 2016 13:27:16 -0600

Hi OpenAFS gurus, I am in desperate need of your advice!

We are adding OpenAFS support to HTCondor (http://htcondor.org).

I read in the docs that the Cache Manager identifies a token by either the
user's UNIX UID or by a process authentication group (PAG).  In the
non-PAG case, I expected the Cache Manager to identify the token by
_effective_ uid.  But my testing implies that the token is identified by
the _real_ uid.  Is there a way to change this behavior?

My situation is hopefully not new or unique: I have a long-running
service (HTCondor in this case) running as root that is implemented via
a cooperating set of processes, and this service needs to impersonate
different users in order to read/write to the filesystem on their behalf.
Ideally I'd like this service to be as ignorant of AFS as possible.  It
currently performs impersonation by changing the effective UID,
which of course works fine for the local filesystem.
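
For readers who want the effective-UID approach spelled out, here is a
minimal Python sketch of it.  This is not HTCondor's actual code; the
helper name as_effective_user is made up for illustration, and group
IDs are deliberately left out.

```python
import os
from contextlib import contextmanager


@contextmanager
def as_effective_user(uid):
    """Temporarily switch only the effective UID (hypothetical helper).

    The real and saved UIDs are left alone, so a root service can
    always switch back afterwards.
    """
    previous_euid = os.geteuid()
    os.seteuid(uid)
    try:
        yield
    finally:
        # Allowed because the real/saved UID still matches previous_euid.
        os.seteuid(previous_euid)
```

Inside the "with" block, os.getresuid() reports the new effective UID
while the real UID is unchanged, which is exactly the state in which
the non-PAG token lookup appears not to follow the impersonation.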

Meanwhile, I have another service running (a home-brewed "credential
manager") on the machine that keeps AFS tokens refreshed for all users
of the service by doing an "aklog" as each user.  Neither HTCondor nor
the credential manager runs in a PAG, as my understanding is that a
PAG can only hold one user token per cell.

The problem is that the HTCondor service cannot access AFS on behalf of
the user being impersonated by simply switching effective UID; it
apparently needs to switch the real UID as well.  Thanks to the saved
UID and the setresuid() syscall, it is trivial to change HTCondor to
switch both the real and effective UID over to the user being
impersonated, and still switch back to root afterwards.  Doing so would
likely make everything work great with AFS.  But the problem is that
changing the real UID is a potential security hole, as now HTCondor
service processes could receive signals (e.g. SIGKILL!) from the
unprivileged users they are simply trying to impersonate.  This is why
OpenAFS's reliance on the real UID for identifying tokens seems very
broken to me; it seems like it should be the euid (or, specifically on
Linux clients, the file-system uid) like every other similar system...
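
To make the trade-off concrete, here is a rough Python sketch of the
setresuid() variant together with the signal-permission rule as I
understand it from kill(2) on Linux: an unprivileged sender may signal
a target when the sender's real or effective UID equals the target's
real or saved set-user-ID.  Both helper names are made up for
illustration, and the rule below ignores CAP_KILL and the SIGCONT
special case.

```python
import os
from contextlib import contextmanager


@contextmanager
def as_real_and_effective_user(uid):
    """Switch real and effective UID, keeping the saved UID (hypothetical helper)."""
    ruid, euid, suid = os.getresuid()
    # The saved UID stays as it was (root, for a root service), which is
    # what makes switching back possible later.
    os.setresuid(uid, uid, suid)
    try:
        yield
    finally:
        os.setresuid(ruid, euid, suid)


def may_signal(sender_ruid, sender_euid, target_ruid, target_suid):
    """Simplified kill(2) permission check for an unprivileged sender."""
    target_ids = (target_ruid, target_suid)
    return sender_ruid in target_ids or sender_euid in target_ids
```

With an illustrative user UID of 1000, may_signal(1000, 1000, 0, 0) is
False for a service that only changed its effective UID, while
may_signal(1000, 1000, 1000, 0) is True once the real UID has been
switched as well.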

Any thoughts/advice?

One idea is I could have all the HTCondor service processes run in their 
own PAG and perform some sort of very lightweight "aklog" whenever 
impersonation is required.  But I need this "aklog" to be very 
fast/lightweight; I don't want to be blocked on network communication to 
some KDC or even file I/O if I can help it.  The service would have 
access to a KRB5 credential cache for the user that has an AFS service 
principal; given that, is there some lightweight "aklog"-like in-process 
library call I can use (that avoids I/O)?  Something the HTCondor
service could dlopen()/dlsym() (so I don't introduce a pile of
dependencies)?

Another idea, as I only care about supporting Linux, would be to 
leverage Linux kernel keyrings.  I am thinking perhaps my credential 
manager could link the "afs_pag: _pag" key to the user keyring, and then 
the HTCondor service could link this key into its session keyring when 
impersonating.  Does anyone think that would work?  Or is there more to 
swapping PAGs in and out (i.e. besides the key on the keyring, pagsh 
seems to do some magic with groups as "/bin/id" shows some magic groups 
when in a PAG...) ?  Is the keyring-based idea crazy talk or a good idea 
to pursue if Linux is my only target?

I've seen the great lengths (i.e. an immense amount of code, and the
security side-step of creating their own krb4 tickets) that Samba has
gone to in order to support AFS; I am hoping there is an easier way.

Your suggestions are greatly appreciated.

thanks
Todd

-- 
Todd Tannenbaum <tannenba@cs.wisc.edu> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences