Static Analysis of Linux Kernel & Drivers Using Clang

Introduction

Clang is usually quite straightforward to use, but only for simple C/C++ programs that do not have a complex build process. The Linux kernel, however, is a completely different beast with its own custom build system, Kbuild.

This post demonstrates one (rather hackish) way to apply your Clang static analysis programs to the Linux kernel and Linux drivers, even if the modules are outside of the main Linux source tree.

Getting Started with Linux

I will assume you have a local copy of the Linux kernel source in the directory ~/linux/ and that you can compile the full tree without errors, which is crucial for this tutorial. You don’t have to be able to install the kernel, just compile it fully. If you’re having difficulty, there are many online tutorials on building the kernel.

Working Example

If you’re familiar with the Kbuild system or compiling Linux at all, you’ll know these commands (executed from the top of the Linux source tree: ~/linux/):

$ make — builds the entire kernel
$ make M=/path/to/module/directory — builds only modules in that directory
$ make path/to/specific/module.ko — builds only that module’s .ko file
$ make path/to/specific/output/file.o — builds only file.c and its dependencies

There are a few more options, but for simplicity, let’s assume you just want to build a single file. Let’s look at compiling ~/linux/mm/mmap.c, the file responsible for creating and managing memory maps and implementing the mmap system call. We can compile just mmap.c by executing:

$ cd ~/linux
$ make mm/mmap.o
  CHK include/linux/version.h
  CHK include/generated/utsrelease.h
  HOSTCC scripts/basic/fixdep
  HOSTCC scripts/basic/docproc
  HOSTCC scripts/basic/hash
  CC arch/x86/kernel/asm-offsets.s
  GEN include/generated/asm-offsets.h
  CALL scripts/checksyscalls.sh
  HOSTCC scripts/genksyms/lex.o
  HOSTLD scripts/genksyms/genksyms
  HOSTCC scripts/mod/sumversion.o
  HOSTLD scripts/mod/modpost
  HOSTCC scripts/selinux/genheaders/genheaders
  CC mm/mmap.o

The file should compile successfully with the above output printed to the terminal. Now, let’s see what commands are actually being run by using make’s Verbose mode:

$ rm ~/linux/mm/mmap.o   #if you run make again, remove the built file
$ make V=1 mm/mmap.o     #note that we use V=1 for verbose mode

Now we see a lot more output, including the specific (and very lengthy) commands that are actually being run. We’re interested in the calls to GCC, because we can easily copy those and use Clang in place of GCC. Typically, the call to GCC is near the end of the output of the make command and is printed as one long line (wrapped here for readability), such as:

gcc -Wp,-MD,mm/.mmap.o.d  -nostdinc -isystem /usr/lib/gcc/i686-linux-gnu/4.6/include
  -Iarch/x86/include -Iinclude  -include include/generated/autoconf.h -Iubuntu/include
  -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
  -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks
  -O2 -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686
  -mtune=generic -maccumulate-outgoing-args -Wa,-mtune=generic32 -ffreestanding -fstack-protector
  -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare
  -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=1024
  -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls
  -g -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack
  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mmap)"  -D"KBUILD_MODNAME=KBUILD_STR(mmap)"
  -c -o mm/.tmp_mmap.o mm/mmap.c
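
Rather than scrolling through the verbose output by hand, you can filter it. The following is a minimal sketch, assuming the compiler line starts with "gcc" and mentions the source file path, as in the output above. The helper name extract_cc_line and the output file mmap_cc_cmd.txt are made up for illustration.

```shell
#!/bin/bash
# Sketch: keep only the gcc invocation for one source file from the
# `make V=1` output read on stdin. extract_cc_line is a hypothetical helper.
extract_cc_line() {
    local src="$1"
    # First keep lines that start with gcc (optionally indented),
    # then keep those that mention the source file as a separate argument.
    grep -E '^[[:space:]]*gcc ' | grep -F -- " $src"
}

# Usage, from the top of the kernel tree:
#   rm -f mm/mmap.o
#   make V=1 mm/mmap.o 2>&1 | extract_cc_line mm/mmap.c > mmap_cc_cmd.txt
```

If your build uses a differently named compiler (a cross-compiler prefix, for example), the first pattern needs adjusting.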

Replacing GCC with Clang

Now that we have a GCC command that will build our mmap.c file, we can replace it with Clang and use our static analysis program of choice. There are a few small changes we have to make to allow Clang to compile/analyze the mmap.c file properly.

  1. First, in the first line of the GCC command above, change the argument after “-isystem” to Clang’s own builtin header directory, because Clang cannot use the GCC one. On my system, this is located at /usr/local/lib/clang/3.4/include.
  2. Next, there are several -I and -include arguments that name directories and files relative to the top of the Linux source tree. We must fix them to point to the right location by putting the full path in front. In the above example, there are 4 places we must fix, listed below with the corrections applied. Don’t forget to use the correct path: your system might not use ~/linux/, and the shell does not expand “~” when it directly follows -I, so write $HOME (or the absolute path) there instead.
    1. -I$HOME/linux/arch/x86/include
    2. -I$HOME/linux/include
    3. -include $HOME/linux/include/generated/autoconf.h
    4. -I$HOME/linux/ubuntu/include
  3. Make sure Clang isn’t optimizing your code (if you don’t want it to) by changing -O2 to -O0.
  4. If you need the output file from the compiler, then change the -o command at the end to something like -o ~/linux/mm/mmap.o.
  5. Replace “gcc” with “clang” — obviously.
  6. Compiling certain modules will add in extra arguments that Clang cannot handle. If those cause Clang to error out, then simply remove them from the command.
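
The mechanical parts of the steps above (1, 2, 3 and 5) can be sketched with sed. This is only an illustration under stated assumptions: the captured command is a single line, the include paths are relative to the top of the kernel tree, and the optimization flag is -O2. The function name to_clang_cmd and its two parameters are hypothetical.

```shell
#!/bin/bash
# Sketch: rewrite a captured gcc command line (read on stdin) for Clang.
# to_clang_cmd is a hypothetical helper, not part of Kbuild or Clang.
to_clang_cmd() {
    local ksrc="$1"       # absolute path of the kernel tree, e.g. $HOME/linux
    local clang_inc="$2"  # Clang's builtin header directory
    sed -E \
        -e 's|^[[:space:]]*gcc |clang |' \
        -e "s| -isystem [^ ]+| -isystem ${clang_inc}|" \
        -e "s| -I([^/ ][^ ]*)| -I${ksrc}/\1|g" \
        -e "s| -include ([^/ ][^ ]*)| -include ${ksrc}/\1|g" \
        -e 's| -O2 | -O0 |'
}

# Usage (mmap_cc_cmd.txt is assumed to hold the copied gcc line):
#   to_clang_cmd "$HOME/linux" /usr/local/lib/clang/3.4/include < mmap_cc_cmd.txt
```

The remaining relative paths (the -Wp,-MD dependency file, the -o output and the source file) are left untouched, so either run the result from the top of the kernel tree or edit them by hand as described in step 4. On newer Clang releases, `clang -print-resource-dir` prints the directory whose include/ subdirectory is the one needed for -isystem. Check that your version supports the flag.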

I find it’s best/easiest to copy the newly fixed Clang command into a script file for easy modification and running. Here’s the complete contents of the working script:

#!/bin/bash

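# $HOME is used instead of ~ here: the shell does not expand ~ when it is
# glued to a flag, as in -I~/... or -Wp,-MD,~/...
clang -Wp,-MD,$HOME/linux/mm/.mmap.o.d -nostdinc -isystem /usr/local/lib/clang/3.4/include \
  -I$HOME/linux/arch/x86/include -I$HOME/linux/include -include $HOME/linux/include/generated/autoconf.h -I$HOME/linux/ubuntu/include \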
  -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common \
  -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks \
  -O0 -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 \
  -mtune=generic -maccumulate-outgoing-args -Wa,-mtune=generic32 -ffreestanding -fstack-protector \
  -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare \
  -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=1024 \
  -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls \
  -g -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack \
  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mmap)" -D"KBUILD_MODNAME=KBUILD_STR(mmap)" \
  -c -o ~/linux/mm/mmap.o ~/linux/mm/mmap.c

Then you can easily run the script from anywhere (because you have full directory paths) like so:

$ chmod +x clang_build_mmap.sh
$ ./clang_build_mmap.sh

Clang should compile the file with no issues, producing an output file called mmap.o (if that’s what you specified after -o).

Running Your Clang Plugin Program

Now we just need to modify the above script to call our Clang Plugin. Assuming we want to call the sample PrintFunctionNames plugin, we can do it like so:

#!/bin/bash

clang -Xclang -load -Xclang /path/to/clang/build/directory/lib/PrintFunctionNames.so \
      -Xclang -plugin -Xclang print-fns \
  -Wp,-MD,$HOME/linux/mm/.mmap.o.d -nostdinc -isystem /usr/local/lib/clang/3.4/include \
  -I$HOME/linux/arch/x86/include -I$HOME/linux/include -include $HOME/linux/include/generated/autoconf.h -I$HOME/linux/ubuntu/include \
  -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common \
  -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks \
  -O0 -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 \
  -mtune=generic -maccumulate-outgoing-args -Wa,-mtune=generic32 -ffreestanding -fstack-protector \
  -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare \
  -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=1024 \
  -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls \
  -g -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack \
  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mmap)" -D"KBUILD_MODNAME=KBUILD_STR(mmap)" \
  -c -o ~/linux/mm/mmap.o ~/linux/mm/mmap.c

This will print all the function names in the mmap.c source file. Note that you must change the /path/to/clang/build/directory/ above in order for things to work.

See my Clang Plugin Tutorial for more information on building and running Clang Plugins.

Running Your Clang LibTooling Program

If you wish to run a Clang LibTooling program on the kernel code instead of a Clang Plugin, the command is slightly different. Basically, you invoke your LibTooling executable, list the source files, and then add all of the arguments from the GCC command after a “--”, as shown below (bonus: with multiple source files!):

#!/bin/bash

/path/to/clang/build/directory/bin/your-libtooling-executable \
    source_file1.c \
    source_file2.c \
    source_file3.c \
    -- \
  -Wp,-MD,$HOME/linux/mm/.mmap.o.d -nostdinc -isystem /usr/local/lib/clang/3.4/include \
  -I$HOME/linux/arch/x86/include -I$HOME/linux/include -include $HOME/linux/include/generated/autoconf.h -I$HOME/linux/ubuntu/include \
  -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common \
  -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks \
  -O0 -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 \
  -mtune=generic -maccumulate-outgoing-args -Wa,-mtune=generic32 -ffreestanding -fstack-protector \
  -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare \
  -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=1024 \
  -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls \
  -g -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack \
  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mmap)" -D"KBUILD_MODNAME=KBUILD_STR(mmap)" \
  -c

In the above example, the source files must all be from the same module or directory, such as ~/linux/mm/. If you do analyze multiple source files at once (like above), don’t forget to remove the -o part at the end of the script. Also, don’t forget to fix the first line of the command to point to your LibTooling executable.
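
If a directory holds many files, typing the list by hand gets tedious. Here is a small sketch that gathers every .c file directly inside one directory. The helper name collect_sources is made up, and it assumes every file in that directory is part of your kernel configuration, which will not always be true.

```shell
#!/bin/bash
# Sketch: print the path of every .c file directly inside one directory,
# one per line. collect_sources is a hypothetical helper.
collect_sources() {
    local dir="$1"
    local f
    for f in "$dir"/*.c; do
        # Skip the unexpanded pattern when the directory has no .c files.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Usage: read the list into an array and pass it before the "--":
#   mapfile -t SOURCES < <(collect_sources "$HOME/linux/mm")
#   /path/to/clang/build/directory/bin/your-libtooling-executable "${SOURCES[@]}" -- ...same flags as above...
```

Files that are not compiled under your .config may fail to parse with these flags, so trimming the list by hand can still be necessary.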

See my Clang LibTooling Tutorial for more information on building and running Clang LibTooling programs.

Conclusion

This post explained how to execute Clang static analysis on Linux kernel source code. If you have any questions or requests, feel free to leave them below.

Check out the rest of my Clang posts in the blog archive.

One comment

  1. Thanks for the post. Setting HOSTCC and CC to clang while compiling a particular module helped me to compile a module with clang without making any -I include changes:

    e.g. make V=1 CC=clang HOSTCC=clang M=fs/ext3

    is all I had to do to build the ext3 module using clang (other than changes in the .config file, where I had to enable ext3 loadable module support).