[OpenAFS-devel] Kernel module recompilation, stability, etc.

Derrick J Brashear shadow@dementia.org
Tue, 29 May 2001 11:07:54 -0400 (EDT)


On Tue, 29 May 2001, Kuba Ober wrote:

> > If you build a 2.4.2-2 module without the extradefs stuff, it will be
> > broken. Likewise for any kernel with Alan Cox patches. The patch you
> > submitted undoes the fixes for this instead of fixing the problem you're
> > having. You need to find out why Makefile.extradefs isn't being created as
> > an empty file, instead of suppressing it.
> 
> What goes into extradefs files? The default .rpm module doesn't have this 
> file. I can obviously create it as an empty file, but will it help?

Look at redhat.sh. In the default RPM, Makefile.extradefs should just be
empty until redhat.sh runs.

> Another thing is the link-creation brokenness, and this is a valid bug I 
> assume.

I don't remember; I'll have to go dig for your patch.

> > Because you're not defining any of the stuff the Makefile.extradefs stuff
> > works to define. It's not just there to make the build break, it does
> > something. If you don't do the something, well, the module won't match the
> > kernel you're running.
> 
> How would an empty makefile.extradefs file help here? The module that I 
> obtain by recompiling works just as well (or as bad ;-) as the stock one 
> supplied in .rpm file.

It wouldn't. The point is that if you remove the include, the output of
redhat.sh no longer sets the conditionals that cause the module to be
built for the Alan Cox patches.
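
For illustration only (this is not part of the OpenAFS build), here is a
minimal Python sketch of the "create it empty, don't suppress it" idea:
make sure Makefile.extradefs exists before the build so the include keeps
working, without clobbering anything redhat.sh has already written. The
helper name ensure_extradefs and the build_dir argument are hypothetical;
where the file actually lives in your tree is an assumption you need to
check.

```python
from pathlib import Path


def ensure_extradefs(build_dir, name="Makefile.extradefs"):
    """Create an empty extradefs file if it is missing.

    Returns True if the file had to be created, False if it already
    existed. An existing file is left untouched, so definitions that
    were written into it earlier are preserved.
    """
    path = Path(build_dir) / name
    created = not path.exists()
    # touch() creates the file if absent and never truncates an existing one.
    path.touch(exist_ok=True)
    return created
```

Calling ensure_extradefs(".") from the directory holding the Makefile
would be the typical use. If it returns True on an RPM build, the file was
never created, and that is the thing to track down.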

> I can do this a few times. Sooner or later umount /afs complains about 
> filesystem being busy, and that's it.

There appear to be two problems, one in Linux 2.4 and one in AFS; both
are, or seem to be, inode leaks of some sort. The 2.4 problem is why it's
more prevalent with 2.4.

> > There is some problem with inode references being leaked, apparently, but
> > it only happens sometimes. [snip]
> 
> Which basically means that it's damned unsafe to do any administration on the 
> afs server itself, since the client can eventually bring it down :-(

Only if you restart the client. Remember, the server doesn't need the
module at all. I've never seen the problem where afsd goes dead. I have
seen the afs module become non-unloadable, but never in a situation where
I could get any information about it. Can you get afsd to become useless
while running a kernel module built from the tar file? (That's on 2.4; on
Redhat 2.2 kernels the module should be the same.)

> > > 4. /afs should be unmount-able as soon as no process is referencing the
> > > /afs subtree - that's the behaviour of all other filesystems; in case
> > > there are afs daemons with pending requests, those daemons should be
> > > signaled (say with HUP) to immediately return with error, so that control
> > > can be returned to the calling process (via the module)
> >
> > Yes, and generally this does work, unless a reference is leaked and the
> > kernel still believes the filesystem is busy, and if anything, that is
> > your problem.
> 
> Well, that's a bug, and that's everybody's (who's involved) problem ;-). I'm 
> trying to put some debugging code right now to see the reference counts and 
> pick the place where it actually leaks.

Ok. Also be aware of the fstrace interface; it may help, and it was
designed for this. My test host hasn't leaked yet, though presumably
that's due to my usage patterns, and we don't have a good test suite yet.
We've been promised one this summer.
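
Before digging into kernel reference counts, it may be worth ruling out
the boring case where some process really is still holding /afs. Below is
a rough Python sketch, assuming a Linux /proc layout, that lists processes
whose cwd, root, or open file descriptors point under a given mount point.
The function names are made up for this example. It does not look at
mmap'd files, and it can only see processes you have permission to
inspect, so an empty result is a hint and not proof.

```python
import os


def is_under(path, root):
    """Return True if path is root itself or lies inside the root subtree."""
    root = root.rstrip("/") or "/"
    return path == root or path.startswith(root.rstrip("/") + "/")


def find_holders(root="/afs", proc="/proc"):
    """Map pid -> list of link targets under root (cwd, root dir, open fds)."""
    holders = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc, entry)
        links = [os.path.join(base, "cwd"), os.path.join(base, "root")]
        fd_dir = os.path.join(base, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # unreadable or vanished link
            if is_under(target, root):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

If find_holders("/afs") comes back empty when run as root and umount
still reports the filesystem as busy, that is at least consistent with a
leaked reference inside the kernel rather than a process holding on.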

> > Also, are you building from the RPM source, or a tar file from
> > www.openafs.org?
> 
> openafs-kernel-source-1.0.4.rpm
> 
> I assume that packagers did their job. I can try the .tar as well if that's 
> going to help.

I can tell you the Makefile.extradefs problem doesn't happen with the tar
file. I build the tar file, Derek Atkins makes a src.rpm, and I build his
src.rpm for Redhat 7+. The src.rpm -> i386.rpm step works correctly; the
kernel-source i386.rpm route apparently is still broken.

-D