[OpenAFS] converting Kaserver and protection server to working with LDAP

Jeffrey Hutzelman jhutz@cmu.edu
Thu, 7 Jun 2001 18:33:40 -0400 (EDT)


On Mon, 4 Jun 2001, Marcus Watts wrote:

> I was wrong in the above in that the cache manager itself doesn't
> talk PT.  Lucky you.  However, the cache manager does know about viceIDs and
> ACLs, expects there to be a 1-1 mapping between AFS viceIDs and Unix
> UIDs, and expects ACLs to contain a list of viceIDs (which are actually
> manipulated using the "fs sa" and "fs la" commands.)  Breaking this means
> you won't be compatible with vanilla AFS (interoperability may be
> a problem.)

While there are potential problems, these really aren't...

- The cache manager knows nothing about the relationship between vice IDs
  and UNIX UIDs.  It sometimes tracks credentials by UNIX UID (when there
  is no PAG), but that is the extent of its knowledge of UNIX UIDs.

- The cache manager never sees vice IDs on ACLs.  In fact, the only time
  it sees ACLs at all is when you manipulate them via the 'fs' commands,
  which do all their work via cache manager calls.  Even then, the ACLs
  the cache manager sees are those exported by the fileserver's FetchACL
  and StoreACL calls, which use _names_, not vice IDs.
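
To make the second point concrete, here is a rough sketch of an ID/name
translation at the fileserver boundary.  It is an illustration only, not
AFS code: the function names, the lookup tables, the sample entries and
the (name, rights) pair layout are all assumptions made up for this
example, and the real FetchACL/StoreACL wire format is not shown.

```python
# Hypothetical sketch: none of these names come from the AFS sources.

# Stand-in for the protection database's ID <-> name lookups.
ID_TO_NAME = {1001: "alice", -204: "staff"}
NAME_TO_ID = {name: vice_id for vice_id, name in ID_TO_NAME.items()}


def export_acl(stored_acl):
    """Turn a stored {vice_id: rights} ACL into (name, rights) pairs."""
    return [(ID_TO_NAME[vice_id], rights)
            for vice_id, rights in sorted(stored_acl.items())]


def import_acl(named_acl):
    """Turn (name, rights) pairs sent by a client back into {vice_id: rights}."""
    return {NAME_TO_ID[name]: rights for name, rights in named_acl}


# The server side keeps IDs; everything crossing the boundary uses names.
stored = {1001: "rlidwka", -204: "rl"}
exported = export_acl(stored)
print(exported)
```

In this model the client side only ever handles the output of export_acl
and the input of import_acl, so it never needs to know a vice ID.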

> Basically, your ldap directory is going to need to contain the following
> attributes:
> 	name aka "kerberos name".
> 	ViceID aka UID
> 	many-many relationship between user viceIDs and group viceIDs.
> and the following operations:
> 	pr_NameToId
> 		map name to ID
> 	pr_IdToName
> 		map ID to name
> 	pr_GetCPS
> 		given ID, map into list of group IDs (and userID)
> 		one is or is a member of.
> Users and groups share the same name and ID space.
> ptserver gives groups negative viceIDs; I don't know if openafs
> knows this.  There are more operations - which you should support if
> you also want the AFS "pts" command to work.

Yes; OpenAFS knows that group IDs are negative, in the places where it
matters.  Offhand, I can't think of any such places outside pts and the
ptserver.
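
As a minimal sketch of the data model in the quoted list (names, IDs, a
many-to-many membership relation, and the three lookups), something like
the following would do.  The class, its method names and the sample
entries are hypothetical; this is neither ptserver code nor an LDAP
schema, and it ignores any implicit or system groups a real server may
add to a CPS.

```python
class ProtectionDB:
    """Toy stand-in for the protection database (hypothetical, not ptserver code)."""

    def __init__(self):
        self._name_to_id = {}
        self._id_to_name = {}
        self._members = {}  # group ID -> set of member user IDs

    def add_entry(self, name, vice_id):
        # Users and groups share one name space and one ID space.
        if name in self._name_to_id or vice_id in self._id_to_name:
            raise ValueError("name or ID already in use")
        self._name_to_id[name] = vice_id
        self._id_to_name[vice_id] = name
        if vice_id < 0:  # negative IDs denote groups
            self._members[vice_id] = set()

    def add_member(self, user_id, group_id):
        if user_id not in self._id_to_name:
            raise KeyError(user_id)
        self._members[group_id].add(user_id)

    def name_to_id(self, name):
        return self._name_to_id[name]

    def id_to_name(self, vice_id):
        return self._id_to_name[vice_id]

    def get_cps(self, vice_id):
        # Every group the ID belongs to, plus the ID itself.
        groups = sorted(g for g, m in self._members.items() if vice_id in m)
        return groups + [vice_id]


db = ProtectionDB()
db.add_entry("alice", 1001)
db.add_entry("staff", -204)
db.add_entry("admins", -205)
db.add_member(1001, -204)
db.add_member(1001, -205)
print(db.name_to_id("alice"), db.id_to_name(-204), db.get_cps(1001))
```

An LDAP-backed replacement would have to answer the same three questions,
with get_cps being the one that needs the membership relation.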

> One basic issue: ldap uses TCP.  PT uses RX.  Threading may be
> interesting, especially at start-up when doing name resolution
> and ldap bind.  TCP has more start-up overhead, & interesting limits
> on the number of simultaneous connections allowed.  If you botch
> things badly, the fileserver will appear to mysteriously hang as
> calls to ptserver stall.  This is bad.

Actually, it can be worse than bad.  Years ago, one of the failure modes
of AFS was commonly known as a "ptserver meltdown".  This occurred when
a ptserver problem caused a backlog of GetCPS requests from fileservers
that would essentially grow without bound.  The result would be that the
ptservers would become and _stay_ heavily overloaded, and fileservers
would start to hang waiting for responses.
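
A toy queue model shows how a temporary fault can turn into a permanent
overload.  Everything in it is an assumption made for illustration: the
function name, the numbers, and in particular the retry rule (a client
that has waited `timeout` ticks sends the request again while the stale
copy stays queued).  It is not a description of how fileservers or the
ptserver actually behave.

```python
from collections import deque


def simulate(ticks, arrivals, capacity, outage, timeout, retry=True):
    """Return the queue length after each tick of a single FIFO server."""
    queue = deque()  # each entry is the tick at which the request was queued
    backlog = []
    for t in range(ticks):
        retries = 0
        if retry:
            # Requests that just hit the timeout are re-sent; the server
            # still has to work through the stale originals.
            retries = sum(1 for enqueued in queue if t - enqueued == timeout)
        queue.extend([t] * (arrivals + retries))
        served = 0 if t in outage else min(capacity, len(queue))
        for _ in range(served):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


outage = range(5, 15)  # server answers nothing during ticks 5..14
healthy = simulate(40, 8, 10, range(0), 3)
with_retry = simulate(40, 8, 10, outage, 3)
no_retry = simulate(40, 8, 10, outage, 3, retry=False)
print("healthy peak:", max(healthy))
print("with retries, end of outage vs. end of run:", with_retry[14], with_retry[-1])
print("no retries,   end of outage vs. end of run:", no_retry[14], no_retry[-1])
```

In this model the server has spare capacity under normal load, and without
retries the backlog drains once the outage ends.  With retries the backlog
keeps growing after the fault is gone, because capacity is spent on
requests nobody is waiting for any more.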

There have been some changes since then that make this scenario less
likely, but it is still the case that the ptserver is probably the most
heavily loaded of the AFS database services, and the majority of the
requests it services are expensive ones.  Think very carefully before
replacing it. 

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA