[OpenAFS-devel] OpenAFS Development

Jack Neely jjneely@pams.ncsu.edu
Sun, 27 Jun 2004 17:06:32 -0400


On Fri, Jun 25, 2004 at 10:42:27AM -0400, Derrick J Brashear wrote:
> On Thu, 24 Jun 2004, Jack Neely wrote:
> 
> > Before I start please refrain from all "they hate us" and "that's just a
> > bitch to fix" replies.  I'm tired of it and it stifles development.  Would
> > you want to work on something that you just got told is "just a bitch?"
> 
> Given that you admit that, then you should be able to understand why in
> response to perceived lack of cooperation in moving to a model where
> sys_call_table doesn't need to be hooked for PAGs, why nothing has been
> done? (*)

It does follow.  :-)  It's just frustrating to watch folks not cooperate
for no real good reason.

> 
> > In return I do offer my services as limited as they are.
> 
> Suggestions are inline.
> 
> > with the 2.6.6-1.435.  Arla CVS built, installed, and worked right out
> > of the box.  PAGs.  I had PAGs.  In fact, Arla will try to hook into the
> > LSM first, and failing that, hooks the syscall table.  For our
> > convenience, I have attached the code that finds sys_call_table.
> 
> The hostility engendered toward OpenAFS on the basis of having this
> suggests it's worth letting this hack die. Perhaps if we demonstrate our
> willingness to move forward there will be some actual cooperation to help
> us move forward.

Well, I agree.  We do need to move to a cleaner system to implement
pags.  However, OpenAFS is mission critical to me, and while we work
toward a new system of handling pags we've got to have something else to
fall back on.  Going without a fully functional client for 2.6 will soon
stop being an option for me.  I like the approach the Arla folks have
used: first try a better system for pags and, if that fails, fall back
on hooking the sys_call_table.
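
To make the ordering concrete, here is a rough user-space sketch in C of
"try the preferred mechanism, then fall back".  Every name in it
(pag_backend, try_lsm, try_syscall_hook, pick_backend) is made up for
illustration; it is not Arla or OpenAFS code, and the probe functions
are stubs standing in for the real kernel-side attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one way of getting PAG support. */
struct pag_backend {
    const char *name;
    int (*probe)(void);   /* returns 0 on success, nonzero on failure */
};

/* Stub (assumption): pretend registering with the LSM was refused. */
static int try_lsm(void) { return -1; }

/* Stub (assumption): pretend hooking sys_call_table worked. */
static int try_syscall_hook(void) { return 0; }

/* Backends in order of preference: cleanest mechanism first. */
static const struct pag_backend default_backends[] = {
    { "lsm",            try_lsm },
    { "sys_call_table", try_syscall_hook },
};

/*
 * Walk the list in order and return the first backend whose probe
 * succeeds, or NULL if none of them could be set up.
 */
static const struct pag_backend *
pick_backend(const struct pag_backend *list, size_t n)
{
    assert(list != NULL);
    for (size_t i = 0; i < n; i++) {
        if (list[i].probe() == 0)
            return &list[i];
    }
    return NULL;
}
```

With the stubs as written, pick_backend(default_backends, 2) skips the
"lsm" entry and settles on "sys_call_table".  The point of the table is
only that the hack stays last in line and can be dropped once something
better reliably succeeds.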

Looking through the source I did notice a few things.  When given the
--enable-redhat-buildsys option, the configure script tests whether
certain symbols are exported.  This fails on 2.6 because configure uses
gcc directly to build a kernel module instead of going through the
kbuild system.  I'm not sure how you would teach autoconf to do that.
But that's no biggie, since no useful symbols for sys calls are exported
anyway.  Also, 'linux/syscall.h' should be 'linux/syscalls.h', but
again, no useful symbols are exported.

Tomas, can you shed some light on how the code in
nnpfs_syscalls-lossage.c works?  I understand how the code in
osi_module.c calculates the start of the sys_call_table, but yours is
getting the better of me.  I'd rather not submit a cut 'n' paste patch
when I should be integrating the extra code cleanly with the other
methods for locating sys_call_table in osi_module.c.

> 
> > there needs to be something in the stock kernel that can generically
> > handle some sort of authentication key per process.  Recently, there was
> > a very productive conversation on LKML about a "key-ring" patch between
> > David Howells and Kyle Moffett.  A link to it was posted here after the
> > first few messages, but now a rather complex and generic system has been
> > laid out.  I assumed the OpenAFS folks would be very interested in this
> > and give some feedback, but there's been silence.  I've spoken with Kyle
> 
> I don't make a habit of repeating myself. I said on linux-kernel what we
> were trying to replace; Nothing has changed.
> 
> Right now, I can have processes with multiple uids in a single PAG, and
> processes with a single uid in different disjoint PAGs. I use that
> functionality. So do others. Kyle's earlier statements led to my belief
> that he didn't mean to do that. So, in my mind it was a proposal to
> implement something other than PAGs. Whatever. But later discussion seemed
> more likely to be useful.
> 
> If something useful happens, we'll use it. If nothing useful happens,
> well, we can't use it. Attempting to be involved (myself, and I can only
> really speak for me on that basis) produced nothing useful, so I'll wait
> and let people who aren't necessarily perceived as having baggage do it;
> It's not sour grapes, I want to use this functionality...
> 

*nod*  I will keep following what comes of the recent conversation on
lkml.

> I'd take a simple and generic system: process labels.
> 
> Comments about caching removed. If you want to push there, testing along
> the lines of showing where optimization is needed (rather than simply
> that it is) would be useful.
> 
> > Finally, NPTL.  We are starting to look at deploying OpenAFS servers to
> > replace Transarc code.  What's involved in looking at the NPTL issues in
> > OpenAFS?  These need to be fixed, not worked around.  The workaround
> > will probably go away soon.  One of my next tasks is to look into this
> > more closely.
> 
> I'm not aware of anyone who knows yet. What would make it easier would be
> if gdb had something like Solaris dbx's "thread -blockedby", but failing
> that the suggestion I made the other day (when it hangs, use gdb's
> generate-core-file and then get backtraces from all threads) would seem to
> be a start. Alternately, knowing how to reproduce beyond "let it run for a
> while".

I think I can replicate it...sorta.  I run into this when I move a large
volume from a Linux/OpenAFS server to a Solaris/Transarc server.  I'll
see if I can get some dumps.  It just...takes a beating, then maybe it
will hang tomorrow.

> 
> > In conclusion, I'm very blown away by Arla.  OpenAFS has always worked a
> > little better for us as a client but now the tables are turned.  Both
> > projects are Open Source, would a look at some of this code help us?
> 
> It might for using Linux Security Module, and frankly the only reason I
> never did that was the impression I got was it wouldn't be being shipped
> enabled by vendors. If that's untrue, well, I'd be on that.
> 

From what I am feeling, the LSM is going to be quite popular and I think
would be a reasonable dependency.  Fedora Core is using it.

However, LSM does have some quirks that I'm sure Tomas will agree with.
I tried to write a PAG module using the LSM hooks just to be a generic,
third-party PAG module.  The two hooks task_alloc_security and
task_free_security are perfect.  They are called right after and right
before a process is created or destroyed, respectively.  Both are handed
a fully populated task_struct for the process in question.  I was
storing the pag information in a separate table.
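
Roughly what I mean by a separate table, as a user-space model in C.
This is a sketch only: the function names (pag_task_alloc,
pag_task_free, pag_setpag, pag_lookup), the fixed-size array, and the
use of plain integer pids are all assumptions for illustration, and the
signatures are not those of the real LSM hooks.

```c
#include <assert.h>
#include <stddef.h>

#define PAG_TABLE_SIZE 64   /* arbitrary size for the sketch */
#define NO_PAG 0u           /* value meaning "not in any PAG" */

/* One slot per tracked process; the PAG is independent of the uid. */
struct pag_entry {
    int in_use;
    int pid;
    unsigned int pag;
};

static struct pag_entry pag_table[PAG_TABLE_SIZE];
static unsigned int next_pag = 1;

static struct pag_entry *pag_find(int pid)
{
    for (size_t i = 0; i < PAG_TABLE_SIZE; i++)
        if (pag_table[i].in_use && pag_table[i].pid == pid)
            return &pag_table[i];
    return NULL;
}

/* Return the PAG of a process, or NO_PAG if it is not tracked. */
unsigned int pag_lookup(int pid)
{
    struct pag_entry *e = pag_find(pid);
    return e ? e->pag : NO_PAG;
}

/*
 * Stand-in for the task-creation hook: record the child and copy the
 * parent's PAG so it survives fork.  Returns 0, or -1 if the table
 * is full.
 */
int pag_task_alloc(int parent_pid, int child_pid)
{
    assert(pag_find(child_pid) == NULL);
    for (size_t i = 0; i < PAG_TABLE_SIZE; i++) {
        if (!pag_table[i].in_use) {
            pag_table[i].in_use = 1;
            pag_table[i].pid = child_pid;
            pag_table[i].pag = pag_lookup(parent_pid);
            return 0;
        }
    }
    return -1;
}

/* Stand-in for the task-destruction hook: forget the process. */
void pag_task_free(int pid)
{
    struct pag_entry *e = pag_find(pid);
    if (e)
        e->in_use = 0;
}

/* Move a tracked process into a fresh PAG; NO_PAG if not tracked. */
unsigned int pag_setpag(int pid)
{
    struct pag_entry *e = pag_find(pid);
    if (!e)
        return NO_PAG;
    e->pag = next_pag++;
    return e->pag;
}
```

Because the table is keyed on the process rather than the uid, two
processes with the same uid can sit in disjoint PAGs, and a child keeps
its parent's PAG until it asks for a new one.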

The quirk is that you can "stack" these security modules...but only with
the cooperation of the modules below you on the stack.  The SELinux
module will let you stack one module on top of it but blocks all but
a few functions, none of which are useful for pags.  If the
capabilities module is loaded, you're done: it will not let you stack
another module on top.  I can see the point of this, but it's
frustrating and makes LSM much less useful for our cause.

Fedora Core 2 comes with SELinux and Capabilities built directly into
the kernel.

Arg.

Is the LSM still worth looking at?  Would it prove to other folks that
we are trying in good faith to work toward a better system?  I think the
answer to both questions is the same.  Hell, I'd run the code.

Jack Neely

> -D
> * actually it's untrue, we tried anyway, but unless some piece of data
> tied to a process is copied over forks, we're sort of screwed, and there's
> nothing we can "borrow" or "use" which we'd not either break some other
> meaning of, or not work the way we want.

-- 
Jack Neely <slack@quackmaster.net>
Realm Linux Administration and Development
PAMS Computer Operations at NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89