[AFS3-std] Per-file ACLs - a few items for discussion

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 26 Jun 2009 13:11:41 -0400


--On Thursday, June 25, 2009 09:48:49 PM -0400 Marc Dionne 
<marc.c.dionne@gmail.com> wrote:

> A. Inheritance only at file creation.  Once a file is created, its ACL is
> completely independent from the parent directory - it is not affected by
> a subsequent StoreACL on the parent.  This is the simplest model and is
> what is part of my current implementation.

This massively violates the principle of least surprise, because it changes 
the behavior for existing AFS users to one where changing the ACL on a 
directory does not actually have any effect on who has what access to files 
within that directory.  This seems unacceptable to me.



> B. Inherit until an ACL is set.  Until an ACL is specifically set on a
> file, the effective ACL is that of the parent.  A StoreACL on the parent
> changes the effective file ACL only for files with no specific ACL.  A
> problem with this approach is that a client cannot determine if an ACL
> has been previously set on a file, and can therefore not predict the
> result of a StoreACL operation on a directory.

This model is somewhat limited, but is simple and easy to understand.  And, 
in fact, a client _can_ tell whether an ACL has been previously set on a 
file, because ACL manipulation operations should always act on the actual 
ACL, not the effective one, so fetching the ACL of a file that doesn't have 
one will result in some appropriate error.

Note that in this case "client" actually means "user", since the cache 
manager doesn't care what the ACLs are and in fact doesn't even know them 
unless a user asks to see one.  What the CM _does_ know about is particular 
users' effective access rights on individual files, so it can properly 
separate access by multiple users with different rights.

If this approach were adopted, I'd recommend having a way to "unset" the 
ACL on a file, so it goes back to inheriting from its parent.
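To make the "inherit until set" semantics concrete, here's a rough sketch 
(Python, purely illustrative -- none of these names come from any real AFS 
implementation), including the "unset" operation I'm suggesting:

```python
# Hypothetical sketch of model B: a file with no ACL of its own inherits
# its effective ACL from its parent directory; "unset" reverts to
# inheriting.  Structures and names are invented for illustration.

class Vnode:
    def __init__(self, parent=None):
        self.parent = parent   # containing directory, or None for a root
        self.acl = None        # None means "no ACL set; inherit"

def effective_acl(vnode):
    """Walk up until a vnode with an explicitly-set ACL is found."""
    v = vnode
    while v.acl is None and v.parent is not None:
        v = v.parent
    return v.acl

def store_acl(vnode, acl):
    vnode.acl = acl

def unset_acl(vnode):
    """Revert to inheriting from the parent directory."""
    vnode.acl = None
```

Note that in this sketch, as in the model described above, fetching the 
_actual_ ACL of a file whose `acl` is still None would be the point at 
which a client gets an appropriate error rather than the effective ACL.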

> C. StoreACL on a directory replaces all file ACLs with the specified
> complete ACL.  Simple, but not practical if you have an existing
> directory to change with many adjusted ACLs
> underneath.

This seems like it would violate the principle of least surprise for people 
using the new functionality, and in any case would be annoying.  It makes 
any use of or dependence on per-file ACLs extremely brittle, because they 
are so easy to inadvertently destroy.


> D. StoreACL on a directory replaces file ACLs with a merge of the
> specified ACL with any existing file ACL.  Better than C, but could still
> be not very practical in some cases.  "merge" would have to be well
> defined.

This sounds great, except for defining what "merge" means and providing 
sufficient control.
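Just to illustrate how much hangs on the definition, here is one possible 
"merge" (Python, offered purely for discussion, not a proposal): entries in 
the newly-stored directory ACL replace same-named entries in each file ACL, 
and entries unique to either side survive.

```python
# One candidate definition of "merge" for model D.  Entirely illustrative;
# other definitions (e.g. per-entry union of rights bits) are equally
# plausible and would behave quite differently.
def merge_acls(file_acl, dir_acl):
    merged = dict(file_acl)
    merged.update(dir_acl)   # directory entries win on conflict
    return merged
```

Even this simple definition raises questions -- for instance, it provides 
no way for the directory store to _remove_ an entry from a file ACL.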

> E. Other...?

I'd love to see a partial-inheritance model, where a file ACL can inherit 
from its containing directory with some overrides.  This could be either 
copy-inheritance, where the parent ACL is copied the first time a change is 
made, after which it is fully independent, or "true" inheritance, where 
changes to the inherited ACL continue to affect the effective ACL of the 
file (the latter, of course, could be implemented either by examining both 
file and directory ACLs when performing access checks, or by updating file 
ACLs whenever the directory ACL changes; this is an implementation 
optimization tradeoff).
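As a sketch of what "true" inheritance with overrides might look like at 
access-check time (Python, purely illustrative; all names invented here):

```python
# Partial-inheritance sketch: a file carries only per-user overrides,
# layered over the containing directory's ACL when rights are computed.
# An explicit empty override revokes access for that user.
def effective_rights(dir_acl, file_overrides, user):
    if user in file_overrides:
        return file_overrides[user]
    return dir_acl.get(user, "")
```

Because the directory ACL is consulted at check time, a later StoreACL on 
the directory automatically affects every file that has no override for 
the user in question -- which is exactly the "true inheritance" behavior.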

> The expected behaviour for file moves between directories should also be
> defined.

As well as the behavior for files with links in multiple directories, if 
that ever becomes permitted.



> Note that DFS had the notion of separate default directory ACL and
> default file ACL attached to directories, and there is some provision for
> this in the OpenAFS "fs" code (-id, -if options).  But the code does not
> look usable as-is, and this would be a much more complex direction to
> take, both code-wise and for users.

Right; the openafs 'fs' command code is intended specifically for talking 
to a DFS translator; OpenAFS itself doesn't have these concepts per se.  On 
the other hand, the ACL bits that OpenAFS has on directories mostly apply 
independently to directories and files -- the 'lid' bits apply only to 
directories, while the 'rwk' bits apply only to files within a directory; 
the only bit that has meaning for both is 'a'.
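Spelled out as a sketch (Python, illustrative only -- the rights letters 
are the standard AFS ones, but the helper names are made up here):

```python
# How the existing OpenAFS directory-ACL bits partition between the
# directory itself and the files within it, per the discussion above.
DIR_ONLY  = set("lid")   # lookup, insert, delete: directory operations
FILE_ONLY = set("rwk")   # read, write, lock: operations on files
BOTH      = set("a")     # administer has meaning for both

def rights_for_directory(rights):
    return "".join(b for b in rights if b in DIR_ONLY | BOTH)

def rights_for_files(rights):
    return "".join(b for b in rights if b in FILE_ONLY | BOTH)
```

So the full 'rlidwka' rights string already decomposes cleanly into a 
directory part and a per-file part, which is worth keeping in mind when 
designing per-file ACL semantics.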


> 2. Hard links
> Now that an ACL can follow a file across directories, is there any reason
> that cross-directory hard links should not be allowed, with the
> restriction that the link and target be within the same volume?  It would
> require an inheritance model like A. above where the file ACL is
> independent of the directory.  This has not been implemented or tested
> yet, but it looks like a fairly simple change.

There are OpenAFS-specific implementation issues here.  The OpenAFS 
fileserver's on-disk format, and the tools that validate that format, make 
the assumption that a file can have only one parent directory.  It may look 
simple to change this, but if you don't work through all of the 
implications, you may end up in a situation where people create directory 
structures that the salvager destroys.

In fact, some of this model propagates into volume dumps, which are not 
implementation-specific, so there are some interop/protocol considerations 
here as well.



> 3. FetchStatus permissions
> An assumption in the existing cache manager in OpenAFS is that
> FetchStatus on files is allowed if the directory was readable.

Correct, because that is AFS's model.


> If file
> ACLs are enforced in FetchStatus, this can result in inconsistent
> behaviour that depends on whether the status information for a file is
> already in cache or not.

Only if you decide that a file's ACL controls whether FetchStatus is 
permitted on that file, rather than considering vnode metadata to be part 
of the directory, such that it is permitted to call FetchStatus if you have 
'l' on the directory.  This would be a semantic change to the AFS protocol, 
and as you note, has backward-compatibility implications.
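To make the two interpretations concrete (Python, a toy sketch only; the 
function, its parameters, and the `per_file` switch are all invented for 
illustration):

```python
# Existing AFS model: vnode metadata is effectively part of the directory,
# so FetchStatus on a file is permitted given 'l' on the containing
# directory.  The per_file branch is the proposed (semantics-changing)
# alternative in which the file's own ACL is consulted instead.
def fetchstatus_permitted(dir_rights, file_rights=None, per_file=False):
    if per_file and file_rights is not None:
        return "r" in file_rights or "l" in file_rights
    return "l" in dir_rights
```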


> Some concerns:
> - Similar testing has not been done with other client implementations
> (kafs, arla), so it's unclear how they'll behave in general and with this
> workaround, or whether other workarounds would be needed for them.

I imagine most other clients don't implement the DFS-mode behavior at all, 
and so if they apply some of the same optimizations that OpenAFS does, such 
as assuming that a user has the same rights on all files with the same 
parent, then they will handle cached access rights incorrectly.

> - DFS mode might have some subtle behaviour changes, for instance with
> implicit permissions based on owner and mode bits, etc.

No; the AFS cache manager does not ever do this kind of second-guessing. 
It depends on the fileserver/translator to tell it what the user's 
effective rights are on each vnode.


> - "ACL-aware" OpenAFS clients could be made to use "AFS" mode despite the
> DFS flag, for instance based on the file server capabilities.  They would
> also get fixes in the few places where there are assumptions about
> permissions coming from the directory.

Yes; a per-file-ACL-aware client could determine from a fileserver 
capability bit that it must not share cached access rights across files 
just because they are in the same directory.  However, there are security 
implications here -- since error returns from RPCs, including 
RXGEN_OPCODE, are not authenticated, it is easy for an attacker to make a 
client believe that the fileserver does not support any capabilities. 
Thus, a capability bit must not be used to advertise to clients that they 
should apply a more restrictive access control policy.

A capability bit might be used to advertise to clients that the "DFS-mode" 
flag actually indicates new AFS ACL behavior rather than an actual DFS 
translator.


> 6. Volume moves.
> There's some minor adjustment needed in the dump format - so far I'm
> using an optional ACL id tag and an optional ACL data tag with file
> vnodes.  An ACL-aware server can generate old and new formats and
> determine the right one to use through capabilities - or could also get a
> hint from a command-line option that sets a dump flag.

The right thing to do here is to include the ACL data in new, non-critical 
tags drawn from the proper range, so they can be ignored by entities which 
do not understand them.

> There is a potential loss of information (the file ACLs) if we move a
> volume from a new server with ACLs to an old one.  So I'm wondering if
> anything should be done to prevent or warn about this, and if so, what
> mechanism would be available to force it to happen, since it would be a
> useful operation in some cases.

I think it would be useful to introduce a new _critical_ dump tag 
containing a set of flags indicating presence in the dump of features which 
must not be ignored if the dump processor wishes to avoid loss of data or 
"important" metadata.  Those features can then use non-critical tags, with 
the net effect that each dump processor can choose whether to continue 
processing a dump containing unknown features.

This would allow operators to continue using older tools for processing and 
analyzing dumps even when new features are added, or to decide that getting 
a user's data back is more important than preserving the per-file ACLs and 
so instruct a server to accept a dump containing per-file ACLs even though 
it does not understand them.

For this mechanism to work, the new flags tag must be critical, so that 
tools which do not understand it do not inadvertently lose data, but it 
must be defined once in a sufficiently extensible manner, so that once a 
tool does understand the flags, it can make advertent choices to ignore 
future features it does not understand.
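The decision a dump processor would make under this scheme can be sketched 
as follows (Python, illustrative only -- the feature bit and all names are 
invented here, not a proposed wire format):

```python
# Sketch of the proposed scheme: individual features live in non-critical
# tags, but a single *critical* tag carries a flags word listing features
# whose loss matters.  A processor refuses to continue only when the dump
# contains a flagged feature it neither understands nor has been told,
# advertently, to drop.
FEATURE_PERFILE_ACLS = 0x1   # hypothetical feature bit

def process_dump(features_present, understood, ignorable):
    """features_present: flags from the critical features tag.
    understood: features this tool implements.
    ignorable: features the operator has explicitly chosen to drop."""
    unknown = features_present & ~understood
    if unknown & ~ignorable:
        raise ValueError("dump contains features this tool would silently lose")
    return True  # continuing is safe; any dropped data was dropped advertently
```

An old tool that doesn't understand the critical tag at all simply refuses 
the whole dump, which is the desired fail-safe behavior.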

-- Jeff