[OpenAFS-devel] OpenAFS Development

Derrick J Brashear shadow@dementia.org
Mon, 28 Jun 2004 15:45:28 -0400 (EDT)


On Sun, 27 Jun 2004, Jack Neely wrote:

> Well, I agree.  We do need to move to a cleaner system to implement
> pags.  However, OpenAFS is mission critical to me and while we work
> toward a new system of handling pags we've got to have something else to
> fall back on.  Not having a fully functional client for 2.6 is soon to
> become not an option for me.  I like the idea that the Arla folks have
> used: first try a better system for pags; if that fails, fall back on
> hooking the sys_call_table.

We had an idea that doing that would be the next step in the war over
sys_call_table... for Linux 2.4. I'd still prefer not to do it for Linux
2.6, but that's a personal preference, my own only.

> Looking through the source I did notice a few things.  The configure
> script when given the --enable-redhat-buildsys option tries to test to
> see if certain symbols are exported.  This fails on 2.6 because
> configure uses gcc to build a kernel module instead of the kbuild
> system.  I'm not sure how you would teach autoconf to do that.  But,
> that's no biggie, no useful symbols for sys calls are exported anyway.
> Also, 'linux/syscall.h' should be 'linux/syscalls.h' but again, no
> useful symbols are exported.

Right. And I'm not sure how the kbuild stuff could be integrated into
configure either.

> > > Finally, NPTL.  We are starting to look at deploying OpenAFS servers to
> > > replace Transarc code.  What's involved in looking at the NPTL issues in
> > > OpenAFS?  These need to be fixed, not worked around.  The workaround
> > > will probably go away soon.  One of my next tasks is to look into this
> > > more closely.
> >
> > I'm not aware of anyone who knows yet. What would make it easier would be
> > if gdb had something like Solaris dbx's "thread -blockedby", but failing
> > that the suggestion I made the other day (when it hangs, use gdb's
> > generate-core-file and then get backtraces from all threads) would seem to
> > be a start. Alternately, knowing how to reproduce beyond "let it run for a
> > while".
>
> I think I can replicate it...sorta.  I run into this when I move a large
> volume from linux/openafs to a solaris/transarc server.  I'll see if I
> can get some dumps.  It just...takes a beating then maybe it will hang
> tomorrow.

Ok. Well, if that's true I do have an idea.
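
For anyone trying to capture a hang, the gdb suggestion quoted above (dump
a core with generate-core-file, then get backtraces from all threads) could
be scripted roughly as follows. This is only a sketch: the helper name, the
pid and the core path are made up for illustration, and it assumes a gdb
that accepts -p, -batch and -ex on the command line.

```python
import shlex


def gdb_hang_capture_argv(pid, core_path):
    """Build a gdb command line for a hung process (hypothetical helper).

    gdb attaches to `pid`, writes a core file to `core_path`, prints a
    backtrace for every thread, and then exits because of -batch.
    """
    return [
        "gdb",
        "-p", str(pid),                             # attach to the running process
        "-batch",                                   # run the -ex commands, then quit
        "-ex", f"generate-core-file {core_path}",   # save a core for later analysis
        "-ex", "thread apply all bt",               # backtrace of every thread
    ]


if __name__ == "__main__":
    # 12345 and /tmp/hung.core are placeholder values, not real ones.
    print(shlex.join(gdb_hang_capture_argv(12345, "/tmp/hung.core")))
```

The printed command line can be run by hand once the process is wedged. The
saved core can be reopened later with gdb to look at the same threads again.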

> capabilities module is loaded, you're done.  It will not let you stack
> another module on top.  I can see the point to this, but it's
> frustrating and makes LSM much less useful for our cause.

Right.

> Is the LSM still worth looking at?  Would it prove to other folks that
> we are trying in good faith to work toward a better system?  I think the
> answer to both questions is the same.  Hell, I'd run the code.

Well, the solution might be to pass patches for SELinux and Capabilities
modules to not deflect us. I'm not sure if that breaks their model though.