[OpenAFS] Problem building openafs on kernel 2.6.18.2-34-default

Marcus Watts mdw@umich.edu
Sat, 30 Dec 2006 02:43:28 -0500


> Marcus Watts <mdw@umich.edu> writes:
> 
> > Russ is right this indicates a problem with the kernel build process.
> > Most likely that's headers, but it could be other stuff.
> 
> > Unfortunately, config.log most likely won't contain anything of interest.
> > To get useful stuff in config.log, you'll need to do something like
> > this patch:
> > 	/afs/umich.edu/group/itd/build/mdw/openafs/patches/openafs-1.5.8-kconfig.diff
> > which is the companion to this:
> > 	http://www.central.org/rt/Ticket/Display.html?id=40604
> 
> Hm?  I've always found the error messages in config.log.  I've diagnosed
> probably a dozen of these build failures from config.log output; usually
> it's a straightforward compiler error that points to a missing include,
> structure, or the like.  Or is this an issue with a current Linux that
> causes this to stop working?

2.6 uses kbuild, not cc.  The logic for this is in
	AC_TRY_KBUILD26
which is defined in
	src/cf/linux-test1.m4
It should save standard error, so, yes, there is *some* stuff.

The vanilla openafs logic here doesn't explicitly save the command or
the test fragments that failed.  That means it's not always easy to
figure out exactly what broke.  I came up with this patch after I had a
build break and had to figure out why.  Turns out User Mode Linux is
particularly keen on weird -I logic, like relative path names, and the
error message from the compiler just isn't sufficient.  Compare
that to the situation where a regular compile probe fails: there you
automatically get the failing program, the compile command that failed,
etc.  These are standard features of the built-in configure logic and
supply all the information necessary to reproduce the problem
standalone.
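
To make the contrast concrete, here is a rough shell sketch of the kind
of logging a probe wrapper could do.  It is not the patch mentioned
above and not OpenAFS code; log_probe and its arguments are names made
up for illustration.  The "| " prefix just mimics the way configure
quotes a failed program in config.log.

```shell
# Hypothetical helper (not part of OpenAFS or autoconf).
# usage: log_probe LOGFILE SOURCEFILE COMMAND [ARGS...]
# Runs COMMAND, appending its stdout/stderr to LOGFILE.  On failure it
# also records the exit status and the test source, so the probe can be
# re-run by hand.
log_probe() {
    log=$1
    src=$2
    shift 2

    # Always record the exact command before running it.
    echo "probe: running: $*" >>"$log"

    if "$@" >>"$log" 2>&1; then
        status=0
    else
        status=$?
    fi

    if [ "$status" -eq 0 ]; then
        echo "probe: ok" >>"$log"
    else
        echo "probe: exit status $status" >>"$log"
        echo "probe: failed program was:" >>"$log"
        # Quote the test fragment line by line.
        sed 's/^/| /' "$src" >>"$log"
    fi
    return "$status"
}

# Example call (assumes a conftest.c in the current directory):
#   log_probe config.log conftest.c cc -c conftest.c
```

With something like this wrapped around the kernel probes, a failure
leaves the command, the compiler output, and the test fragment together
in the log.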

Since vanilla openafs configure doesn't try doing a vanilla kernel
build before it tests for features, it doesn't actually know if any
kernel build can succeed.  That's why the rlim test is confusing: it
isn't the first kernel test that fails, just the first one that fails
with no workaround.  Talk about obscure.
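
The idea of checking a plain build first could look roughly like the
sketch below.  This is only an illustration under assumptions:
check_features is a made-up name, and the probe command passed to it
stands in for whatever actually builds a test module for a given
feature (with "baseline" meaning a trivial module that tests nothing).

```shell
# Hypothetical sketch (not OpenAFS code).
# usage: check_features PROBE_CMD FEATURE...
# PROBE_CMD is called with one argument and must return 0 if a test
# module for that argument builds.  "baseline" is tried first.
check_features() {
    probe_cmd=$1
    shift

    # If even a trivial module cannot be built, say so up front rather
    # than letting some later feature test take the blame.
    if ! "$probe_cmd" baseline; then
        echo "error: cannot build a trivial kernel module;" \
             "check the kernel headers and build tree" >&2
        return 2
    fi

    # Only now do individual feature results mean anything.
    for feature in "$@"; do
        if "$probe_cmd" "$feature"; then
            echo "$feature: yes"
        else
            echo "$feature: no"
        fi
    done
    return 0
}
```

The point of the design is only the ordering: a failed baseline stops
the run with a message about the build environment, so a feature test
is never the first place the breakage shows up.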

Clearly, this is a common problem.  Do we really not want to make it
easier for people who haven't fixed this dozens of times to recognize &
solve the problem?

				-Marcus Watts