[OpenAFS-devel] Kernel module recompilation, stability, etc.

Kuba Ober kuba@mareimbrium.org
Tue, 29 May 2001 18:02:22 +0200


> > > If you build a 2.4.2-2 module without the extradefs stuff, it will be
> > > broken. Likewise for any kernel with Alan Cox patches. The patch you
> > > submitted undoes the fixes for this instead of fixing the problem
> > > you're having. You need to find out why Makefile.extradefs isn't being
> > > created as an empty file, instead of suppressing it.
> >
> > What goes into extradefs files? The default .rpm module doesn't have this
> > file. I can obviously create it as an empty file, but will it help?
>
> Look at redhat.sh; In the default RPM Makefile.extradefs should just be
> empty until redhat.sh runs.

$
$ ls -lR | grep redhat.sh
$ rpm -ql openafs-kernel-source | grep redhat.sh
$ which redhat.sh
/usr/bin/which: no redhat.sh in (/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin)
$

Is redhat.sh created somehow in `automagic' way? I don't have it :-(
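
A somewhat wider search than the three commands above might look like the sketch below. The function name find_script and the directory passed to it are illustrative assumptions, not anything taken from the OpenAFS packaging.

```shell
# Search one or more directory trees for a regular file with the given name.
# Prints each match, one per line; returns non-zero if nothing is found.
# (find_script is a hypothetical helper name.)
find_script() {
    name=$1
    shift
    # Errors from unreadable directories are discarded; "|| true" keeps a
    # partial failure of find from aborting the caller under "set -e".
    matches=$(find "$@" -type f -name "$name" 2>/dev/null) || true
    [ -n "$matches" ] && printf '%s\n' "$matches"
}

# Example: look under the current directory (an arbitrary choice of root).
find_script redhat.sh . || echo "redhat.sh not found"
```

If this still turns up nothing under the unpacked kernel-source tree, the script really is absent from the package rather than just off the PATH.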

> > Another thing is the link-creation brokenness, and this is a valid bug I
> > assume.
>
> I don't remember, I'll have to go dig for your patch.

It's the search for $$v/asm in src/Makefile.

> > > Because you're not defining any of the stuff the Makefile.extradefs
> > > stuff works to define. It's not just there to make the build break, it
> > > does something. If you don't do the something, well, the module won't
> > > match the kernel you're running.
> >
> > How would an empty makefile.extradefs file help here? The module that I
> > obtain by recompiling works just as well (or as bad ;-) as the stock one
> > supplied in .rpm file.
>
> It wouldn't, the point is by removing the inclusion of it, the result of
> redhat.sh doesn't set the relevant conditionals which cause the module to
> be built for the Alan Cox patches.

Ah, so it seems that redhat.sh is missing!
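
As a stopgap, the "create it as an empty file" idea from earlier in the thread could be sketched as below. The function name ensure_extradefs and its directory argument are assumptions for illustration. An empty file only lets the include succeed; it does not set any of the conditionals that redhat.sh is supposed to produce.

```shell
# Create an empty Makefile.extradefs in the given directory if none exists.
# An existing file is never overwritten, so real definitions are preserved.
# (ensure_extradefs is a hypothetical helper name.)
ensure_extradefs() {
    dir=${1:-.}
    f="$dir/Makefile.extradefs"
    [ -e "$f" ] || : > "$f"
}
```

It would be called with whichever directory holds the Makefile that includes the file, for example `ensure_extradefs .` from inside that directory.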

> > I can do this a few times. Sooner or later umount /afs complains about
> > filesystem being busy, and that's it.
>
> There appear to be 2 problems, one with Linux 2.4 and one with AFS, both
> are or seem to be inode leakage of some sort. The 2.4 problem is why it's
> more prevalent with 2.4.

OK, I get it. Makes sense. Two sources of the same problem - UGH. Good point.

Is there a chance that 2.4.3 would fix the kernel part of it? Somehow I feel
reluctant to try a non-redhat-blessed ;-) kernel, although I compile kernels
myself, so maybe it's worth a try...

> > Which basically means that it's damned unsafe to do any administration on
> > the afs server itself, since the client can eventually bring it down :-(
>
> Only if you restart the client. Remember the server doesn't need the
> module at all. The afsd becomes dead problem I've never seen; The afs
> module becomes non-unloadable I've seen but never when I can get any
> information about it. Can you get afsd to become useless while running a
> kernel module built from the tar file? (on 2.4; on Redhat 2.2 kernels the
> module should be the same)

Do klog & friends work without afsd & the client module installed? They don't
seem to in my case :-(

On the other hand, afsd becoming dead is esoteric. It does happen when you 
try to kill it by hand (killall -9 afsd). Then even afsd -shutdown won't 
help...

But anyway, afsd *should* be cleanly killable, even if some requests fail
as a result.

> > > > 4. /afs should be unmount-able as soon as no process is referencing
> > > > the /afs subtree - that's the behaviour of all other filesystems; in
> > > > case there are afs daemons with pending requests, those daemons
> > > > should be signaled (say with HUP) to immediately return with error,
> > > > so that control can be returned to the calling process (via the
> > > > module)
> > >
> > > Yes, and generally this does work, unless a reference is leaked and the
> > > kernel still believes the filesystem is busy, and if anything, that is
> > > your problem.
> >
> > Well, that's a bug, and that's everybody's (who's involved) problem ;-). I'm
> > trying to put some debugging code right now to see the reference counts
> > and pick the place where it actually leaks.
>
> Ok. Be also aware of the fstrace interface as it may help and was designed
> for this; My test host hasn't yet leaked though presumably due to my usage
> patterns, and we don't have a good test suite yet. We've been promised one
> this summer.

Will try fstrace, thanks for the pointer (I was using a much cruder method,
placing tons of kernel log entries by hand at all points of interest).

> > > Also, are you building from the RPM source, or a tar file from
> > > www.openafs.org?
> >
> > openafs-kernel-source-1.0.4.rpm
> >
> > I assume that packagers did their job. I can try the .tar as well if
> > that's going to help.
>
> I can tell you the Makefile.extradefs problem doesn't happen with the tar
> file. I build the tar file, Derek Atkins makes a src.rpm, and I build his
> src.rpm for Redhat 7+. The src.rpm->i386.rpm step works correctly; The
> kernel-source i386.rpm way apparently is still broken.

Might be.

Anyway, are Alan Cox's patches being integrated into the main kernel tree, or 
is it some kind of `parallel-universe' trunk?

Cheerz,

Kuba