
Why your bpftrace programs should not include kernel headers.

· 6 min read
Viktor Malik
Software Engineer

Imagine you write a bpftrace program that needs to access the fields of some kernel data type, say struct task_struct. To generate correct offsets for accessing the struct fields, bpftrace needs to know the layout of the type on the running kernel. Historically, this was achieved by providing the correct kernel headers to the program using the #include directive. With the arrival of BTF (BPF Type Format), this is no longer necessary, as bpftrace can automatically extract the type layout from BTF. For the vast majority of use cases, including headers is therefore not only unnecessary but can also lead to unexpected problems, and should be avoided where possible. In this blog post, we will look into the reasons why that is the case and show that the fewer headers a bpftrace program includes, the more portable it is across kernel versions.

The #include directive

The #include directive in bpftrace works similarly to the one in C: it copies in the contents of the included header (and, recursively, of all the headers that it includes). bpftrace then runs Clang to parse the resulting code and extracts type and enum definitions for the relevant types. This gives users a convenient and natural way to tell bpftrace the layout of the types used by the script. Since bpftrace is intended for both kernel and userspace tracing, included headers are searched for in the standard system paths as well as in the include directories of the running kernel.
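As a rough sketch of what this looks like in practice, the script below includes two kernel headers so that bpftrace can resolve struct path and struct dentry. It assumes that the kernel headers are installed, that these header names match your kernel tree, and that the first argument of vfs_open() is a struct path pointer.

```bpftrace
#include <linux/path.h>
#include <linux/dcache.h>

// Print the name of every file passed to vfs_open().
// arg0 is assumed to be a 'struct path *' (first argument of vfs_open()).
kprobe:vfs_open
{
  printf("open path: %s\n", str(((struct path *)arg0)->dentry->d_name.name));
}
```

If either header is missing or has moved on the target system, the script fails before any probe is attached.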

Limitations

While the #include directive is a powerful mechanism, it has its problems, especially in the kernel. Let us look at the most important ones:

  1. Types defined in source directories. Some types in the kernel are not defined in the standard include directories. Instead, they are defined either in “internal” headers located next to the source files or directly in the source files themselves. In both cases, bpftrace doesn’t know how to find such types, so if the script works with them, the only option is to embed their definitions directly in the script. This is, however, error-prone, maintenance-heavy, and not very portable, as the type layout can vary between kernel versions. A good example is the runqlen.bt tool from the bpftrace repo, which contains an embedded definition of struct cfs_rq (from the internal kernel header kernel/sched/sched.h). We even need to maintain a second version of the tool for kernels older than 6.14, because the layout of the type changed in 6.14.
  2. Types being moved between kernel headers. A kernel patch that moves a type into a different header can cause the #include directives in a bpftrace script to stop working. In such a situation, you need to maintain multiple versions of your script for different kernel versions.
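To make the first limitation more concrete, here is a sketch of a script with an embedded type definition, loosely modeled on runqlen.bt. The field list of struct cfs_rq is an assumption that matches only some pre-6.14 kernels, and the script further assumes that struct task_struct and struct load_weight can be resolved by other means and that the kernel is built with CONFIG_FAIR_GROUP_SCHED.

```bpftrace
// Hand-copied from kernel/sched/sched.h; only the leading fields are
// declared. This field list is an assumption that matches some pre-6.14
// kernels and has to be re-checked for every kernel you target.
struct cfs_rq {
  struct load_weight load;
  unsigned int nr_running;
  unsigned int h_nr_running;
};

profile:hz:99
{
  $task = (struct task_struct *)curtask;
  $my_q = (struct cfs_rq *)$task->se.cfs_rq;
  $len = $my_q->nr_running;
  $len = $len > 0 ? $len - 1 : 0;  // do not count the running task itself
  @runqlen = lhist($len, 0, 100, 1);
}
```

Nothing in the script checks the embedded definition against the running kernel, so a stale copy does not cause an error; it silently reads the wrong memory.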

BTF: a better way to work with types

The solution to the problems above is BTF (BPF Type Format). In short, BTF is a compact form of debugging information which (among other things) contains definitions of all kernel types. Unlike DWARF, it is small enough to be embedded directly in the kernel, so most modern distros ship BTF by default these days.

bpftrace automatically reads BTF of the running kernel and uses it to resolve kernel types. Therefore, if the script operates on kernel types only, it is not necessary to use the #include directive at all - the layout of all types will be deduced from BTF. This not only allows you to simplify the bpftrace script but also makes it more portable - correct types from the running kernel will always be used, no matter where in the kernel they are defined.
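For instance, a script that prints the name of every file passed to vfs_open() needs no #include at all on a BTF-enabled kernel. This is a minimal sketch, assuming the kernel exposes BTF (typically visible as /sys/kernel/btf/vmlinux) and that the first argument of vfs_open() is a struct path pointer.

```bpftrace
// No headers: struct path and struct dentry are resolved from kernel BTF.
// arg0 is assumed to be a 'struct path *' (first argument of vfs_open()).
kprobe:vfs_open
{
  printf("open path: %s\n", str(((struct path *)arg0)->dentry->d_name.name));
}
```

If you are unsure which fields a type has on the running kernel, running bpftrace -lv 'struct path' should print the definition that bpftrace sees.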

So, can we just drop all #include statements?

In most cases yes, but not always. There are situations when you still need to include headers and we will look into them in this section.

Information not in BTF

Some information defined in header files is still not present in BTF. Probably the best example is constants defined as macros via the #define directive. If you want your bpftrace script to use the macro name instead of its numerical value, you either need to include the appropriate header or redefine the macro within the script (yes, bpftrace supports the #define directive).
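A minimal sketch of the second option: the script below redefines two address-family macros by hand and takes struct sock from BTF. It assumes that tcp_connect() is available as a kprobe and that its first argument is a struct sock pointer; the values 2 and 10 are the Linux ABI values of AF_INET and AF_INET6.

```bpftrace
// Macros are not recorded in BTF, so the values are redefined here.
// AF_INET and AF_INET6 are part of the Linux ABI (2 and 10).
#define AF_INET   2
#define AF_INET6 10

kprobe:tcp_connect
{
  // struct sock comes from BTF; no header is included.
  $sk = (struct sock *)arg0;
  $family = $sk->__sk_common.skc_family;

  if ($family == AF_INET) {
    @connects["IPv4"] = count();
  }
  if ($family == AF_INET6) {
    @connects["IPv6"] = count();
  }
}
```

Redefining a macro is reasonable only for values that are part of a stable ABI; for anything that may change between kernel versions, the copy carries the same maintenance cost as an embedded struct.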

Enum types

At the moment, bpftrace doesn’t support extracting enum types from BTF, despite the fact that they are there. This is a limitation of bpftrace which is currently being worked on. If you need to use enum values in the meantime, you need to include the appropriate headers or define the constants manually.
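Defining a constant manually could look like the sketch below, which counts TCP sockets entering the established state. It assumes that tcp_set_state() is available as a kprobe and that its second argument is the new state; the value 1 for TCP_ESTABLISHED is copied by hand from the enum in include/net/tcp_states.h.

```bpftrace
// TCP_ESTABLISHED is an enum value in include/net/tcp_states.h.
// Its value (1) is copied here by hand.
#define TCP_ESTABLISHED 1

kprobe:tcp_set_state
{
  // arg1 is assumed to be the new state (second argument of tcp_set_state()).
  if (arg1 == TCP_ESTABLISHED) {
    @established = count();
  }
}
```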

Userspace types

Once your bpftrace script uses userspace types, BTF will not help, since userspace types are, naturally, not included in the kernel BTF. The good news is that bpftrace has other ways to help you. For one, if the traced application contains debug info (in DWARF format), bpftrace can read it and extract the type layout from it, so you don’t need to include any headers. Another nice feature is that types from included headers, BTF, and DWARF can be used simultaneously, provided they do not conflict. If they do conflict, only the types from headers are used and BTF/DWARF is ignored. This usually happens when including system headers from the sys/ directory, which often define userspace variants of internal kernel types.
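As an illustration of a script that works with a userspace type, here is a sketch with entirely hypothetical names: struct request, the binary ./app, and its function handle_request() are placeholders and do not refer to any real program. The struct is declared inline to keep the example self-contained; in a real script it would come from the application's header or from its DWARF debug info.

```bpftrace
// Hypothetical userspace type; in a real script this would come from the
// application's own header or from its DWARF debug info.
struct request {
  int id;
  char name[32];
};

// './app' and 'handle_request' are placeholders for the traced binary
// and one of its functions taking a 'struct request *'.
uprobe:./app:handle_request
{
  $req = (struct request *)arg0;
  printf("request %d: %s\n", $req->id, $req->name);
}
```

Note that no kernel header is involved, so kernel types used elsewhere in the same script can still be taken from BTF.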

Type conflicts

There may be situations (such as one of the above) in which including headers is unavoidable. It may then happen that an included type definition conflicts with a definition taken from BTF. In that case, bpftrace will disable BTF and rely only on types from the included headers, so you will need to include everything necessary.
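One pattern that keeps headers out of the way whenever BTF is present is to guard the includes with the BPFTRACE_HAVE_BTF macro, which bpftrace defines when kernel BTF is available; several tools in the bpftrace repo use it. The sketch below assumes the same header names and vfs_open() signature as the kernel VFS code.

```bpftrace
// BPFTRACE_HAVE_BTF is defined by bpftrace when kernel BTF is available,
// so the headers are pulled in only on kernels without BTF.
#ifndef BPFTRACE_HAVE_BTF
#include <linux/path.h>
#include <linux/dcache.h>
#endif

// arg0 is assumed to be a 'struct path *' (first argument of vfs_open()).
kprobe:vfs_open
{
  printf("open path: %s\n", str(((struct path *)arg0)->dentry->d_name.name));
}
```

With this guard, the headers can never conflict with BTF types, because the two sources are never used together.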

Conclusion

Putting it all together, the general recommendation is to avoid including headers completely, if possible, and let bpftrace extract the types from BTF. When tracing the kernel, start with no headers and only add includes if bpftrace fails to parse your script. For userspace tracing, include only the minimal set of userspace headers (if your script works with userspace types) and try to avoid adding includes from the kernel. Following this advice will let bpftrace leverage BTF as much as possible, which will make your script shorter and more portable across kernel versions.