[OpenAFS-devel] Kernel module recompilation, stability, etc.
Kuba Ober
kuba@mareimbrium.org
Tue, 29 May 2001 18:02:22 +0200
> > > If you build a 2.4.2-2 module without the extradefs stuff, it will be
> > > broken. Likewise for any kernel with Alan Cox patches. The patch you
> > > submitted undoes the fixes for this instead of fixing the problem
> > > you're having. You need to find out why Makefile.extradefs isn't being
> > > created as an empty file, instead of suppressing it.
> >
> > What goes into extradefs files? The default .rpm module doesn't have this
> > file. I can obviously create it as an empty file, but will it help?
>
> Look at redhat.sh; In the default RPM Makefile.extradefs should just be
> empty until redhat.sh runs.
$
$ ls -lR | grep redhat.sh
$ rpm -ql openafs-kernel-source | grep redhat.sh
$ which redhat.sh
/usr/bin/which: no redhat.sh in (/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin:/home/kuba/bin)
$
Is redhat.sh created in some `automagic' way? I don't have it :-(
> > Another thing is the link-creation brokenness, and this is a valid bug I
> > assume.
>
> I don't remember, I'll have to go dig for your patch.
It's the search for $$v/asm in src/Makefile.
> > > Because you're not defining any of the stuff the Makefile.extradefs
> > > stuff works to define. It's not just there to make the build break, it
> > > does something. If you don't do the something, well, the module won't
> > > match the kernel you're running.
> >
> > How would an empty makefile.extradefs file help here? The module that I
> > obtain by recompiling works just as well (or as bad ;-) as the stock one
> > supplied in .rpm file.
>
> It wouldn't, the point is by removing the inclusion of it, the result of
> redhat.sh doesn't set the relevant conditionals which cause the module to
> be built for the Alan Cox patches.
Ah, so it seems that redhat.sh is missing!
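To keep the three cases apart (file never created, file created empty because redhat.sh has not run, file filled in), a rough sketch of a check could look like this. The function name extradefs_state is made up for illustration, and the path to Makefile.extradefs is passed in as an argument because I don't want to guess where a given package puts it.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenAFS): report the state of a
# Makefile.extradefs file given its path as the first argument.
extradefs_state() {
    f=$1
    if [ ! -e "$f" ]; then
        # File was never created: the include in the Makefile will fail.
        echo missing
    elif [ ! -s "$f" ]; then
        # File exists but has zero length: nothing has filled it in yet.
        echo empty
    else
        # File has content: something (presumably redhat.sh) wrote definitions.
        echo populated
    fi
}
```

Calling extradefs_state with the path to the file prints one of missing, empty or populated; only the last one would mean the module gets the extra definitions discussed above.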
> > I can do this a few times. Sooner or later umount /afs complains about
> > filesystem being busy, and that's it.
>
> There appear to be 2 problems, one with Linux 2.4 and one with AFS, both
> are or seem to be inode leakage of some sort. The 2.4 problem is why it's
> more prevalent with 2.4.
OK, I get it. Makes sense. Two sources of the same problem - UGH. Good point.
Is there a chance that 2.4.3 would fix the kernel part of it? Somehow I feel
reluctant to try a non-redhat-blessed ;-) kernel, although I compile kernels
myself, so maybe it's worth a try...
> > Which basically means that it's damned unsafe to do any administration on
> > the afs server itself, since the client can eventually bring it down :-(
>
> Only if you restart the client. Remember the server doesn't need the
> module at all. The afsd becomes dead problem I've never seen; The afs
> module becomes non-unloadable I've seen but never when I can get any
> information about it. Can you get afsd to become useless while running a
> kernel module built from the tar file? (on 2.4; on Redhat 2.2 kernels the
> module should be the same)
Do klog & friends work without afsd & the client module installed? They don't
seem to in my case :-(
On the other hand, afsd becoming dead is esoteric. It does happen when you
try to kill it by hand (killall -9 afsd). Then even afsd -shutdown won't
help...
But anyway, afsd *should* be cleanly killable, even if some requests fail
that way.
> > > > 4. /afs should be unmount-able as soon as no process is referencing
> > > > the /afs subtree - that's the behaviour of all other filesystems; in
> > > > case there are afs daemons with pending requests, those daemons
> > > > should be signaled (say with HUP) to immediately return with error,
> > > > so that control can be returned to the calling process (via the
> > > > module)
> > >
> > > Yes, and generally this does work, unless a reference is leaked and the
> > > kernel still believes the filesystem is busy, and if anything, that is
> > > your problem.
> >
> > Well, that's a bug, and that's everybody's (who's involved) problem ;-). I'm
> > trying to put in some debugging code right now to see the reference counts
> > and pick the place where it actually leaks.
>
> Ok. Be also aware of the fstrace interface as it may help and was designed
> for this; My test host hasn't yet leaked though presumably due to my usage
> patterns, and we don't have a good test suite yet. We've been promised one
> this summer.
Will try fstrace, thanks for the pointer (I was using a much cruder method:
placing tons of kernel log entries by hand at all points of interest).
> > > Also, are you building from the RPM source, or a tar file from
> > > www.openafs.org?
> >
> > openafs-kernel-source-1.0.4.rpm
> >
> > I assume that packagers did their job. I can try the .tar as well if
> > that's going to help.
>
> I can tell you the Makefile.extradefs problem doesn't happen with the tar
> file. I build the tar file, Derek Atkins makes a src.rpm, and I build his
> src.rpm for Redhat 7+. The src.rpm->i386.rpm step works correctly; The
> kernel-source i386.rpm way apparently is still broken.
Might be.
Anyway, are Alan Cox's patches being integrated into the main kernel tree, or
is it some kind of `parallel-universe' trunk?
Cheerz,
Kuba