[OpenAFS-devel] question: binary interface to kernel module (RHEL6.2/6.3, openafs 1.6.1)?
Stephan Wiesand
stephan.wiesand@desy.de
Thu, 30 Aug 2012 15:56:44 +0200
On Aug 29, 2012, at 21:11, Simon Wilkinson <simonxwilkinson@gmail.com> wrote:
>
> On 29 Aug 2012, at 16:21, Stephan Wiesand wrote:
>
>> Since SL6, we have been using "kABI tracking kmods" for installing
>> the OpenAFS kernel module on clients. For full information on this
>> mechanism, see http://people.redhat.com/jcm/el6/dup/docs/dup_book.pdf .
>> In short, you only have to compile and install the module once, and
>> it will be used with future kernels as long as it doesn't use parts
>> of the ABI that changed.
>>
>> Trying this may have been stupid in the first place. If so, happy
>> bashing :-)
>
> Not trying to bash, but you've encountered the problem that is built
> in to this approach.
>
> What RedHat's kABI stuff guarantees is that a small whitelist of
> function signatures will not change across all of the kernels which
> claim to share the same ABI. They go to great lengths to make this the
> case - radically modifying changes that they backport from the mainline
> kernel so those changes can be incorporated without changing their
> guaranteed kABI. Roughly speaking, the guarantee is that kernel modules
> built against one GA kernel will work against all kernels in that major
> version. Each minor version may add new functions, but they will never
> change the signatures of existing whitelisted functions, or remove
> whitelisted functions.
>
> The critical thing to realise is that this guarantee only applies to
> functions on RedHat's whitelist. If you use non-whitelisted functions,
> you can't rely on the ABI guarantees. These functions may go away, or
> (worse) the number or nature of their arguments may change between
> arbitrary kernel releases. Because you aren't recompiling for each
> release, you won't notice the change in arguments and so will quite
> possibly end up calling something that expects a struct inode with a
> struct nameidata, or something that needs 5 arguments with 2, and so on.
>
> Needless to say, OpenAFS uses many symbols which aren't on RedHat's
> whitelist. So, you are pretty much sitting on a ticking timebomb, with
> quite significant data integrity ramifications. To be honest, I'm
> surprised it has taken so long for things to break.
>
>> Thanks a lot in advance for any insights.
>
> If you want to track this down, I'd advise building a list of all of
> the symbols that OpenAFS uses that aren't on RedHat's whitelist for EL6
> (from memory, this is a fairly considerable set), and then look at
> whether the function signatures, or structure definitions used by those
> function signatures, have changed with the new kernel revision. That
> should give you an idea of where to look.
>
> However, I suspect that you're going to continue to encounter this
> problem. Unless someone with a support contract can convince RedHat to
> include all of the symbols OpenAFS requires in their whitelist, it just
> isn't safe to use the kABI stuff for OpenAFS.

That was obviously true for EL5, where I never would have considered
this approach. But as of EL6, the list of ABI hashes published and
checked includes the ones that aren't on the whitelist, and is supposed
to be complete. And none of those interfaces used by the openafs module
changed.

Sure, non-whitelisted symbols may change, and the afs module uses many
of them. But such changes will still be reflected in the "provides" from
the kernel package, no longer match the "requires" from the module
package, and weak-modules will not link a module into
/lib/modules/<release>/weak-updates of a kernel with an incompatible
ABI. I tested this with modules built against the EL6 beta kernel and
the EL6 GA one. From beta to GA, such changes affecting openafs
happened. When both module packages were coinstalled, each module ended
up in weak-updates/ of all compatible kernels - only.

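The rule described above (a module is only linked into kernels whose "provides" match all of its "requires") can be illustrated with a small Python sketch. This is not the weak-modules script itself, only a model of the comparison; the function names, kernel releases, symbol names and hash values are all made up for illustration.

```python
# Sketch only: models the requires/provides rule, not the real
# weak-modules script. All releases, symbols and hashes are invented.

def is_compatible(module_requires, kernel_provides):
    """True if the kernel provides every required symbol with the same hash."""
    return all(
        kernel_provides.get(symbol) == symbol_hash
        for symbol, symbol_hash in module_requires.items()
    )


def weak_update_targets(module_requires, kernels):
    """Return the releases of all kernels the module may be linked into."""
    return sorted(
        release
        for release, provides in kernels.items()
        if is_compatible(module_requires, provides)
    )


# Hypothetical module built against "kernel-1".
module_requires = {"sym_a": "0x1111", "sym_b": "0x2222"}

kernels = {
    "kernel-1": {"sym_a": "0x1111", "sym_b": "0x2222", "sym_c": "0x3333"},
    "kernel-2": {"sym_a": "0x1111", "sym_b": "0x2222"},
    "kernel-3": {"sym_a": "0x1111", "sym_b": "0x9999"},  # sym_b changed
    "kernel-4": {"sym_a": "0x1111"},                     # sym_b removed
}

print(weak_update_targets(module_requires, kernels))
# -> ['kernel-1', 'kernel-2']
```

Note that such a check can only be as good as the hashes: a change that is not reflected in them passes unnoticed.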
We anticipated such changes, that's why we deviate from the "standard"
by identifying the kernel the module was built against in the package's
release. Even though we don't have any mechanism to do it easily, this
makes it possible to have modules installed for each kernel on the
system.

What we had really hoped would not happen is a change affecting the afs
module that's not reflected in the (supposedly complete) list of
interface hashes.

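To narrow down such a change, the list Simon suggests above (symbols the module uses that are not on the whitelist) could be combined with the published hashes of two kernels. The following Python sketch assumes the symbol lists and hashes have already been extracted by some other means; the function name and all symbol names and hashes are hypothetical. It cannot detect the change itself, it only shows which symbols would have to be inspected by hand.

```python
# Sketch only: classifies the non-whitelisted symbols a module uses.
# How the inputs are obtained is left open; all data below is invented.

def symbols_to_inspect(module_symbols, whitelist, old_provides, new_provides):
    """Map each non-whitelisted module symbol to its status in the new kernel."""
    status = {}
    for symbol in sorted(set(module_symbols) - set(whitelist)):
        if symbol not in new_provides:
            status[symbol] = "removed"
        elif old_provides.get(symbol) != new_provides[symbol]:
            status[symbol] = "hash changed"
        else:
            # Hash is identical, so only a manual comparison of the
            # signature and the structures it uses can reveal a change.
            status[symbol] = "hash unchanged"
    return status


module_symbols = ["sym_a", "sym_b", "sym_c", "sym_d"]
whitelist = ["sym_a"]
old_provides = {"sym_a": "0x1", "sym_b": "0x2", "sym_c": "0x3", "sym_d": "0x4"}
new_provides = {"sym_a": "0x1", "sym_b": "0x2", "sym_c": "0x7"}

for symbol, state in symbols_to_inspect(
        module_symbols, whitelist, old_provides, new_provides).items():
    print(symbol, state)
```

In the situation described here, the interesting entries would be the "hash unchanged" ones, since the hashes of the interfaces used by the module did not change.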
Cheers,
Stephan
-- 
Stephan Wiesand
DESY - DV -
Platanenallee 6
15732 Zeuthen, Germany