6. LKM - Base Kernel Compatibility

6.1. An LKM Must Match The Base Kernel

The designers of loadable kernel modules realized there would be a problem with having the kernel in multiple files, possibly distributed independently of one another. What if the LKM mydriver.o was written and compiled to work with the Linux 1.2.1 base kernel, and then someone tried to load it into a Linux 1.2.2 kernel? What if there was a change between 1.2.1 and 1.2.2 in the way a kernel subroutine that mydriver.o calls works? These are internal kernel subroutines, so what's to stop them from changing from one release to the next? You could end up with a broken kernel.

To address this problem, the creators of LKMs endowed them with a kernel version number. The special .modinfo section of the mydriver.o object file in this example has "1.2.1" in it because it was compiled using header files from Linux 1.2.1. Try to load it into a 1.2.2 kernel and insmod notices the mismatch and fails, telling you you have a kernel version mismatch.

But wait. What's the chance that there really is an incompatibility between Linux 1.2.1 and 1.2.2 that will affect mydriver.o? mydriver.o only calls a few subroutines and accesses a few data structures. Surely they don't change with every minor release. Must we recompile every LKM against the header files for the particular kernel into which we want to insert it?

To ease this burden, insmod has a -f option that "forces" insmod to ignore the kernel version mismatch and insert the module anyway. Because it is so unusual for there to be a significant difference between any two kernel versions, I recommend you always use -f. You will, however, still get a warning message about the mismatch. There's no way to shut that off.

But LKM designers still wanted to address the problem of incompatible changes that do occasionally happen. So they invented a very clever way to allow the LKM insertion process to be sensitive to the actual content of each kernel subroutine the LKM uses. It's called symbol versioning (or sometimes less clearly, "module versioning."). It's optional, and you select it when you configure the kernel via the "CONFIG_MODVERSIONS" kernel configuration option.

When you build a base kernel or LKM with symbol versioning, the various symbols exported for use by LKMs get defined as macros. The definition of the macro is the same symbol name plus a hexadecimal checksum of the actual source code for the subroutine named by the symbol. So let's look at the register_chrdev subroutine. register_chrdev is a subroutine in the base kernel that device driver LKMs often call. With symbol versioning, there is a C macro definition like

  #define register_chrdev register_chrdev_Rc8dc8350

This macro definition is in effect both in the C source file that defines register_chrdev and in any C source file that refers to register_chrdev, so while your eyes see register_chrdev as you read the code, the C preprocessor knows that the function is really called register_chrdev_Rc8dc8350.

What is the meaning of that garbage suffix? It is a checksum of the actual C source code of the function in question. I.e. if you change even one character of that source code, this suffix changes.

So let's say someone changes the parameter list of register_chrdev between Linux 1.2.1 and Linux 1.2.2. In 1.2.1, register_chrdev is a macro for register_chrdev_Rc8dc8350, but in 1.2.2, it is a macro for register_chrdev_R12f8dc01. In mydriver.o, compiled with Linux 1.2.1 header files, there is an external reference to register_chrdev_Rc8dc8350, but there is no such symbol exported by the 1.2.2 base kernel. Instead, the 1.2.2 base kernel exports a symbol register_chrdev_R12f8dc01.

So if you try to insmod this 1.2.1 mydriver.o into this 1.2.2 base kernel, you will fail. And the error message isn't one about mismatched kernel versions, but simply "unresolved symbol reference."

As clever as this is, it actually works against you much more than it works for you. Here's why: As a practical matter, kernel developers simply can't change the interfaces between LKMs and the rest of the kernel in ways that aren't backward compatible. As much as they may try to reserve that privilege for themselves by declaring there to be no promise of forward compatibility, in the cold light of day, they would cause too much pain in the world by exercising it. So they might do it sometimes when they feel they have no other choice, but it is extremely rare. However, even a backward compatible change -- even a change to a comment -- changes the checksum in the symbol and prevents the LKM from being inserted.

And there's no way an option like -f on insmod can get around this.

So it is generally not wise to use symbol versioning.

Of course, if you have a base kernel that was compiled with symbol versioning, then you must have all your LKMs compiled likewise, and vice versa. Otherwise, you're guaranteed to get those "unresolved symbol reference" errors.

6.2. If You Run Multiple Kernels

Now that we've seen how you often have different versions of an LKM for different base kernels, the question arises as to what to do about a system that has multiple kernel versions (i.e. you can choose a kernel at boot time). You want to make sure that the LKMs built for Kernel A get inserted when you boot Kernel A, but the LKMs built for Kernel B get inserted when you boot Kernel B.

In particular, whenever you upgrade your kernel, if you're smart, you keep both the new kernel and the old kernel on the system until you're sure the new one works.

The most common way to do this is with the LKM-hunting feature of modprobe. modprobe understands the conventional LKM file organization described in Section 5.5 and loads LKMs from the appropriate subdirectory depending on the kernel that is running.

You set the uname --release value, which is the name of the subdirectory in which modprobe looks, by editing the main kernel makefile when you build the kernel and setting the VERSION, PATCHLEVEL, SUBLEVEL, and EXTRAVERSION variables at the top.