[AFS3-std] rpc refresh: FetchDirectory: discussion only

Sat, 12 Sep 2009 02:05:36 -0400

On Thu, Sep 10, 2009 at 7:00 PM, Matt W. Benjamin<matt@linuxbox.com> wrote:
> Hi,
>
> A few of us (mmeffie, tkeiser, adeason, myself) found ourselves
> discussing the topic of defining an implementation-agnostic directory
> listing interface as part of RPC refresh a month or so ago.  We agreed
> I'd make an attempt to start some new discussion.
>
> I've tied up some busy people's time, including my own, in trying to
> elicit some discussion in conference.openafs.org.  Nothing definitive
> emerged there, but some good feedback was developed.  I'd like to
> throw what I have prototyped out for discussion here, I don't assert
> primacy over the topic, only don't like wasting people's effort.
>
> The discussion was at a general level, not everyone had time to read
> the grammar I was suggesting.
>
> Key ideas I heard expressed:
>
> 1. FetchDirectory RPC (a possible name) is not covered in 2005
> directory format extensions writeup (hackathon, sweden), but that may
> need revisiting (jhutz)
>
> 2. longer filenames are an emerging reality--but people lack
> intuitions about how AFS protocol should adapt
>
> 3. directory entries need multiple representations, perhaps 3
> (unicode, legacy 8-bit, and Windows short name)
>
> 4. rpc should permit a paging notion--I'm proposing that can just be
> an offset into the list the server is producing at the current dv on
> DirFid--client can keep advancing by Entries_len until it's read all
> it wants
>
> 5. directory entries cannot be looked up, as server doesn't know
> applicable normalization rules
>
> 6. directory entries are not returned in some sorted order (with
> accompanying complexity), unless someone can really make a strong case
> to change precedent (jhutz)
>

Let me tackle 4 and 6 together, as I think they're intertwined.  Both
of these points seem to implicitly assume the current fileserver
implementation.  I can easily envision a future fileserver that stores
the directory object as, say, a B-tree.  In that case, returning a
sorted subset is rather trivial, and addressing a subset by an ordinal
rather awkward, as you're effectively trying to linearize the
structure by something other than its natural sort order.
Furthermore, the proposed XDR proc requests a subset by an ephemeral
ordinal without specifying the DV which was used to come up with said
ordinal.  Future journalled/logged file servers would have
insufficient information to compute the mapping of that (DV,ord) onto
the subset of the directory object addressed by (DV',ord') that you
really "wanted".
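
To make the (DV,ord) -> (DV',ord') mapping concrete, here is a minimal
sketch in C.  It assumes the server keeps entries in sorted order and
can still see the listing as of the older DV; both are assumptions for
illustration, and dir_snapshot and remap_offset are hypothetical names,
not part of any existing or proposed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the sorted list of names in a directory at one DV. */
struct dir_snapshot {
    unsigned long dv;
    const char *const *names;
    size_t count;
};

/*
 * Map an offset that was valid against `old` onto `cur`.  The result is
 * the index of the first name in `cur` that sorts strictly after the
 * last name the client already read under `old`.
 */
size_t remap_offset(const struct dir_snapshot *old,
                    const struct dir_snapshot *cur,
                    size_t old_off)
{
    assert(old_off <= old->count);
    if (old_off == 0)
        return 0;                   /* nothing read yet: start over */

    const char *last_read = old->names[old_off - 1];
    size_t i = 0;
    while (i < cur->count && strcmp(cur->names[i], last_read) <= 0)
        i++;
    return i;
}
```

With old = {a, c, e} and cur = {a, b, c, e}, a client that had read two
entries resumes at index 3 ("e").  Reusing the bare ordinal 2 against
the newer listing would hand it "c" a second time, and without the old
DV the server cannot tell which listing the ordinal referred to.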

I think the proc needs more fields.  For example, I'd like an OUT
parameter for the server to communicate whether the data is sorted or
unsorted (really, an enumeration in case we want to do something more
advanced in future, e.g. send it in an order that allows for linear
time to build the split stream data into an efficient structure).
Additionally, I really think we should assert our current DV as an IN
param to allow future implementations to do offset mapping should they
so desire (and offset should become INOUT as a result).
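
As a rough illustration of those extra fields, the sketch below models
the call as a plain C struct and an in-memory mock server.  Every name
here (fd_args, fd_order, mock_dir, mock_fetch_directory) is
hypothetical, and the mock's policy of restarting from offset 0 on a DV
mismatch is just one possible choice; a real implementation might remap
the offset instead.

```c
#include <assert.h>
#include <stddef.h>

/* OUT enumeration: room to add other orderings later. */
enum fd_order { FD_ORDER_UNSORTED = 0, FD_ORDER_SORTED = 1 };

struct fd_args {
    unsigned long client_dv;      /* IN: DV the offset was computed at */
    size_t max_entries;           /* IN: page size requested           */
    size_t offset;                /* INOUT: where to start / resume    */
    enum fd_order order;          /* OUT: how the entries are ordered  */
    unsigned long server_dv;      /* OUT: DV the reply was built from  */
    const char *const *entries;   /* OUT: first entry of this page     */
    size_t entries_len;           /* OUT: number of entries in page    */
};

/* Hypothetical in-memory stand-in for a directory on the server. */
struct mock_dir {
    unsigned long dv;
    const char *const *names;
    size_t count;
    enum fd_order order;
};

/* Returns 0 on success, -1 if the offset is out of range. */
int mock_fetch_directory(const struct mock_dir *dir, struct fd_args *a)
{
    assert(dir != NULL && a != NULL);

    /* Stale DV: this mock simply restarts the listing. */
    if (a->client_dv != dir->dv)
        a->offset = 0;
    if (a->offset > dir->count)
        return -1;

    size_t left = dir->count - a->offset;
    size_t n = left < a->max_entries ? left : a->max_entries;

    a->entries = dir->names + a->offset;
    a->entries_len = n;
    a->offset += n;               /* INOUT: advanced for the next call */
    a->order = dir->order;
    a->server_dv = dir->dv;
    return 0;
}
```

A client would copy server_dv into client_dv before each follow-up
call and keep calling until entries_len comes back as 0.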

As far as the addressing by ordinal issue, my preference would be for
the "primary key" to be a discriminated union.  For the current
generation of file servers, an ordinal makes perfect sense.  However,
I think we should specify a second union entry which is a filename
string.  Upon receipt, file servers supporting this lookup mechanism
would return the block of entries which follow said entry.  This
would, of course, require allocation of new error codes and
capabilities to announce which lookup mechanism(s) the server
supported.
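
A sketch of what such a key might look like, written as a C tagged
union rather than XDR.  The type names, capability bits, and error
codes are all invented for illustration; nothing here has been
allocated or agreed.  The linear search stands in for whatever lookup a
given server's directory structure would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names and values below are hypothetical. */
enum fd_key_type { FD_KEY_ORDINAL = 0, FD_KEY_NAME = 1 };

#define FD_CAP_ORDINAL 0x1u   /* server accepts ordinal keys  */
#define FD_CAP_NAME    0x2u   /* server accepts filename keys */

#define FD_OK        0
#define FD_EBADKEY  (-1)      /* ordinal out of range or name not found */
#define FD_ENOTSUPP (-2)      /* key type not supported by this server  */

struct fd_key {
    enum fd_key_type type;    /* discriminant */
    union {
        size_t ordinal;       /* FD_KEY_ORDINAL */
        const char *name;     /* FD_KEY_NAME    */
    } u;
};

/*
 * Resolve a key to the index of the first entry to return.  For a name
 * key that is the entry following the named one.
 */
int fd_resolve_key(const char *const *names, size_t count, unsigned caps,
                   const struct fd_key *key, size_t *start)
{
    assert(key != NULL && start != NULL);

    switch (key->type) {
    case FD_KEY_ORDINAL:
        if (!(caps & FD_CAP_ORDINAL))
            return FD_ENOTSUPP;
        if (key->u.ordinal > count)
            return FD_EBADKEY;
        *start = key->u.ordinal;
        return FD_OK;
    case FD_KEY_NAME:
        if (!(caps & FD_CAP_NAME))
            return FD_ENOTSUPP;
        for (size_t i = 0; i < count; i++) {
            if (strcmp(names[i], key->u.name) == 0) {
                *start = i + 1;
                return FD_OK;
            }
        }
        return FD_EBADKEY;
    }
    return FD_EBADKEY;        /* unknown discriminant */
}
```

A server that only understands ordinals would advertise FD_CAP_ORDINAL
alone and answer a name key with FD_ENOTSUPP, which tells the client to
fall back to ordinal paging.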

Thoughts?

-Tom