[OpenAFS-devel] Re: open issues for an openafs 1.8 branch

Andrew Deason adeason@sinenomine.net
Tue, 14 Oct 2014 16:36:32 -0500


On Mon, 29 Sep 2014 12:08:07 -0400
chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil> wrote:

> On Wed, 24 Sep 2014 15:14:39 -0400 (EDT)
> Benjamin Kaduk <kaduk@MIT.EDU> wrote:
> 
> > (1) Should we install .la files for shared libraries?
> > 
> > We are now using libtool to build things.  Some people like to
> > install the .la file for external consumers to use.  Some people
> > don't.  Debian falls in the latter category, which leans me toward
> > it as well.
> 
> Do not ship unless we really have to.  If we start shipping it we will
> always need to ship it after that.

I agree with not shipping a .la file, but was the question whether to
ship a .la vs a .pc, or whether to ship nothing at all?

I thought we'd be providing a pkg-config .pc file for these, but I'm not
sure if that's been brought up at all.

> > (3) Dropping the Netscape plugin and related bits from our tree
> > 
> ...
> > So, do we have consensus in favor of removing these bits from the
> > openafs tree?
> 
> It's unmaintainable since it can't be (reasonably) tested.  Remove it.

Any of this kind of crud can be added back if anyone complains.

Also, it was mentioned that "Netscape" doesn't exist anymore, but I'm
not sure if that's actually true for some value of "exist". This was for
Netscape's web server, which I think is now called Oracle iPlanet Web
Server and has an open fork called heliod. Not that it matters (and I'm
not saying it's even compatible with that if it works at all), but the
"thing" that it's for is not entirely nonexistent.

> > (4) changing fileserver tuning
> > 
> > The fileserver lets you pass arguments like -S and -L for "small" and
> > "large" setups.  But ... the "large" one is actually quite small, by
> > today's standards.  We probably ought to update what those coarse-grain
> > settings do, and the defaults as well.
> 
> Good luck with getting agreement about small and large.  Perhaps these
> should be more like the client (with respect to its auto-tuning of
> the cache size) and use some percentage of memory on the machine.

Yes, these 'small' and 'large' distinctions are useless. I agree that
having something autotune parameters based on memory (like the CM bases
many decisions on cache size) is the best way to go. Then it's not
"large" vs "small", it's "16G of ram" vs "16M of ram". This should be
intuitive, since almost all (or even all) of the options that just
change numbers are memory/speed tradeoffs.

I don't feel this needs to happen on a big version boundary (so it
wouldn't block branching or the 1.8 release etc), though doing it at
the boundary would be nice.
Something like '-auto 16G' can be done fairly easily; you just need to
pick values for different memory amounts. For something like '-auto 20%'
I think you'd need to add some platform-specific code for memory
probing, which is a little more work. If that is implemented, we can
just have the default be something like '-auto 20%', and have all other
parameters autotuned from that.
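
To make that concrete, here is a rough sketch (in Python, only because
it is quick to read; the real thing would be C in the fileserver). The
'-auto' option itself, the function names, the parameter names, and
every number in the table are placeholders I made up for illustration,
not existing fileserver behavior. It shows the two pieces: turning
'16G' or '20%' into a memory budget (the '%' form needs the
platform-specific probe), and picking values for different budgets.

```python
import os
import re

_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def physical_memory_bytes():
    """Probe total RAM. This is the platform-specific part; sysconf is
    one way to do it on POSIX-like systems."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def parse_auto(arg, total_memory=None):
    """Turn a hypothetical '-auto' value ('16G', '20%') into bytes."""
    m = re.fullmatch(r"(\d+)([KMGT%])", arg.strip().upper())
    if m is None:
        raise ValueError("expected something like '16G' or '20%'")
    number, suffix = int(m.group(1)), m.group(2)
    if suffix != "%":
        return number * _UNITS[suffix]
    if not 0 < number <= 100:
        raise ValueError("percentage must be between 1 and 100")
    if total_memory is None:
        total_memory = physical_memory_bytes()
    return total_memory * number // 100


# (minimum budget in bytes, settings). All values are invented
# placeholders; choosing real ones is the actual work.
_TIERS = [
    (0,        {"threads": 8,   "callbacks": 10000,   "vnodes": 500}),
    (1 << 30,  {"threads": 64,  "callbacks": 500000,  "vnodes": 20000}),
    (16 << 30, {"threads": 256, "callbacks": 4000000, "vnodes": 200000}),
]


def autotune(budget):
    """Pick the settings of the largest tier that fits the budget."""
    chosen = _TIERS[0][1]
    for floor, settings in _TIERS:
        if budget >= floor:
            chosen = settings
    return dict(chosen)
```

With something like this, autotune(parse_auto("16G")) and
autotune(parse_auto("16M")) land in different tiers, and a default of
'20%' only additionally needs physical_memory_bytes() to work on each
platform.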

I would very much prefer something like that to exist, rather than
adding another "size class", or even just changing the defaults.

> > (5) configure arguments for pam/kauth
> ...
> > whether src/kauth bits get installed.  However, the pam modules we
> > provide are only useful in a kaserver-style environment (i.e.,
> > krb4).  Currently, pam defaults to enabled, and kauth defaults to
> > disabled, and I have not seen anything to indicate that we wish to
> > reenable kauth for the 1.8 branch.
> ...
> > addition to controlling whether the src/kauth bits get installed.
> > Does that seem reasonable?
> 
> Why keep either for the 1.8 branch?  You can stay on the 1.6 stream if
> you need this.  Maintaining stuff that is already done better by others
> (pam_krb5, krb5) seems like a waste of time.

iirc pam_afs is so entangled with openafs rx and lwp that pulling it
out is annoying (we'd just have a complete copy of rx and lwp and
perhaps some other things). But maybe it is possible to reduce that to a
reasonable level if someone wanted to.

I have a patchset to get it to build for sparcv9 that I've been unsure
of what to do with. Perhaps that could be the impetus for me or SNA
to pull it out into a separate small project, where buildsystem etc
changes for it won't bother the rest of openafs.

> > Going through the output more carefully, I also found --disable-gtx,
> > --disable-uss, --enable-bitmap-later, and --disable-unix-sockets,
> > which I
> 
> I don't see the reason for --disable-gtx.  --disable-uss

I thought --disable-gtx was to avoid ncurses. I thought I saw somewhere
that maybe this could just be --without-ncurses, and have basically the
same result?

> bitmap-later (and fast-restart) are both just poor versions of DAFS.
> They should be deprecated for 1.8 (accept the options and print a
> warning that the user should switch to DAFS).

These really are not necessarily replaceable by DAFS. They are
effectively replaced by DAFS for everyone except for the people that
complain that DAFS is still too slow for their purposes. I cannot speak
for them (the research labs), but I thought they were still of that
opinion and would appreciate these options not going away completely
yet.

fast-restart was already removed, and turned into a runtime option for
the fileserver (-unsafe-nosalvage).

bitmap-later could get the same treatment, and really seems a bit
orthogonal to DAFS. If bitmap-later actually works and doesn't introduce
serious issues (iirc this has been debated, but I haven't looked at it
much so I can't say), it would be useful to have on all the time. It
just delays loading/calculating the vnode bitmap until it's actually
needed; it's even more lazy-evaluation (for one particular structure)
than DAFS is.

A quick glance suggests that bitmap-later is easy to turn into a runtime
option; the #define doesn't change structure definitions or anything
like that.
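
For anyone who hasn't looked at it, the idea amounts to something like
the toy sketch below (Python for brevity; the class, the field names,
and the bitmap_later flag are all made up, and I'm assuming vnode
allocation is the thing that first needs the bitmap). It is not the
actual volume package code, just the compile-time switch turned into a
per-instance runtime flag.

```python
class Volume:
    """Toy volume whose vnode bitmap can be built eagerly or lazily."""

    def __init__(self, vnode_in_use, bitmap_later=True):
        # Stand-in for the on-disk vnode index: one bool per vnode.
        self._vnode_in_use = list(vnode_in_use)
        self._bitmap = None
        self.bitmap_builds = 0  # counts how often we paid for the scan
        if not bitmap_later:
            # Old behavior: pay for the scan when the volume attaches.
            self._build_bitmap()

    def _build_bitmap(self):
        # In reality this is the expensive scan of the vnode index.
        self._bitmap = list(self._vnode_in_use)
        self.bitmap_builds += 1

    def read_vnode(self, n):
        # Reads never touch the bitmap, so they never trigger the scan.
        return self._vnode_in_use[n]

    def allocate_vnode(self):
        # First point where the bitmap is needed: build it on demand.
        if self._bitmap is None:
            self._build_bitmap()
        for i, used in enumerate(self._bitmap):
            if not used:
                self._bitmap[i] = True
                return i
        self._bitmap.append(True)
        return len(self._bitmap) - 1
```

A volume that is only ever read from never builds the bitmap at all,
which is where the attach-time savings would come from.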

> I don't know the reason behind unix sockets.  Unix domain sockets
> gives you a different permission model I guess.

Disabling unix sockets means that *sync protocols go through localhost
ports instead of unix sockets on disk. That can make things very
insecure on the fileserver, since any local user can issue fssync-debug
and salvsync-debug commands.
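
To illustrate the difference in permission model (a generic sketch, not
the actual *sync code; the function names and the 0600 mode are just
what I picked for the example): a unix socket is a file on disk, so
filesystem permissions decide who can connect, whereas a listener on
127.0.0.1 has no such check and any local process can connect to it.

```python
import os
import socket
import stat


def listen_unix(path):
    """Listen on a unix socket that only its owner can connect to."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # The socket file is created by bind(); a restrictive umask makes
    # it come out as 0600 instead of world-accessible.
    old_umask = os.umask(0o177)
    try:
        srv.bind(path)
    finally:
        os.umask(old_umask)
    srv.listen(1)
    return srv


def listen_localhost():
    """Listen on a loopback TCP port chosen by the kernel."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # No file, so no file permissions: every local user can connect.
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


def socket_mode(path):
    """Return the permission bits of the socket file at path."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

With the localhost variant, any access control would have to be done
inside the protocol itself.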

As far as I'm aware, the only reason the non-unix-sockets code paths are
even there is for Windows; maybe the config option is there so it could
be run on Unix to see if the code paths actually work. I don't see a
reason to keep the config option around; someone could enable it without
knowing what it does, which is dangerous.

-- 
Andrew Deason
adeason@sinenomine.net