Synopsis
bpftrace [OPTIONS] FILENAME
bpftrace [OPTIONS] -e 'program code'
When FILENAME is "-", bpftrace will read program code from stdin.
A program will continue running until Ctrl-C is hit, or an exit
function is called. When a program exits, all populated maps are printed (more details below).
Description
bpftrace is a high-level tracing language for Linux. bpftrace uses LLVM as a backend to compile scripts to eBPF-bytecode and makes use of libbpf and bcc for interacting with the Linux BPF subsystem, as well as existing Linux tracing capabilities.
Examples
- Trace processes calling sleep
# bpftrace -e 'kprobe:do_nanosleep { printf("%d sleeping\n", pid); }'
- Trace processes calling sleep while spawning
sleep 5
as a child process
# bpftrace -e 'kprobe:do_nanosleep { printf("%d sleeping\n", pid); }' -c 'sleep 5'
- List all probes with "sleep" in their name
# bpftrace -l '*sleep*'
- List all the probes attached in the program
# bpftrace -l -e 'kprobe:do_nanosleep { printf("%d sleeping\n", pid); }'
Options
-B MODE
Set the buffer mode for stdout.
- Valid values are
none No buffering. Each I/O is written as soon as possible
line Data is written on the first newline or when the buffer is full. This is the default mode.
full Data is written once the buffer is full.
-c COMMAND
Run COMMAND as a child process. When the child terminates bpftrace will also terminate, as if 'exit()' had been called. If bpftrace terminates before the child process does the child process will be terminated with a SIGTERM. If used, 'USDT' probes will only be attached to the child process. To avoid a race condition when using 'USDTs', the child is stopped after 'execve' using 'ptrace(2)' and continued when all 'USDT' probes are attached. The child process runs with the same privileges as bpftrace itself (usually root).
Unless otherwise specified, bpftrace does not perform any implicit filtering. Therefore, if you are only interested in events in COMMAND, you may want to filter based on the child PID. The child PID is available to programs as the 'cpid' builtin. For example, you could add the predicate /pid == cpid/
to probes with userspace context.
-d STAGE
Enable debug mode. For more details see the Debug Output section.
--dry-run
Terminate execution right after attaching all the probes. Useful for testing that the script can be parsed, loaded, and attached, without actually running it.
-e PROGRAM
Execute PROGRAM instead of reading the program from a file or stdin.
-f FORMAT
Set the output format.
- Valid values are
json
text
The JSON output is compatible with NDJSON and JSON Lines, meaning each line of the streamed output is a single blob of valid JSON.
-h, --help
Print the help summary.
-I DIR
Add the directory DIR to the search path for C headers. This option can be used multiple times. For more details see the Preprocessor Options section.
--include FILENAME
Add FILENAME as an include for the pre-processor. This is equal to adding '#include FILENAME' at the top of the program. This option can be used multiple times. For more details see the Preprocessor Options section.
--info
Print detailed information about features supported by the kernel and the bpftrace build.
-k
This flag enables runtime warnings for errors from 'probe_read_*' and map lookup BPF helpers. When errors occur bpftrace will log an error containing the source location and the error code:
stdin:48-57: WARNING: Failed to probe_read_user_str: Bad address (-14) u:lib.so:"fn(char const*)" { printf("arg0:%s\n", str(arg0));} ~~~~~~~~~
-l [SEARCH|FILENAME]
List all probes that match the SEARCH pattern. If the pattern is omitted all probes will be listed. This pattern supports wildcards in the same way that probes do. E.g. '-l kprobe:*file*' to list all 'kprobes' with 'file' in the name. This can be used with a program, which will list all probes in that program. For more details see the Listing Probes section.
--no-feature feature,feature,…
- Disable use of detected features, valid values are
uprobe_multi to disable uprobe_multi link
kprobe_multi to disable kprobe_multi link
kprobe_session to disable automatic collapse of kprobe/kretprobe into kprobe session
--no-warnings
Suppress all warning messages created by bpftrace.
-o FILENAME
Write bpftrace tracing output to FILENAME instead of stdout. This doesn’t include child process (-c option) output. Errors are still written to stderr.
-p PID
Attach to the process with or filter actions by PID. If the process terminates, bpftrace will also terminate. When using USDT, uprobes, uretprobes, hardware, software, profile, interval, watchpoint, or asyncwatchpoint probes they will be attached to only this process. For all other probes, except BEGIN/END, the pid will act like a predicate to filter out events not from that pid. For listing uprobes/uretprobes set the target to '*' and the process’s address space will be searched for the symbols.
-q
Keep messages quiet.
--unsafe
Some calls, like 'system', are marked as unsafe as they can have dangerous side effects ('system("rm -rf")') and are disabled by default. This flag allows their use.
--usdt-file-activation
Activate usdt semaphores based on file path.
-V, --version
Print bpftrace version information.
-v
Enable verbose messages. For more details see the Verbose Output section.
Program Options
You can also pass custom options to a bpftrace program/script itself via positional or named parameters. Positional parameters can be placed before or after a double dash but named parameters can ONLY come after a double dash; e.g.
# bpftrace -e 'BEGIN { print(($1, $2, getopt("aa", 1), getopt("bb"))); }' p1 -- --aa=20 --bb p2 // (p1, p2, 20, true) is printed
or
# bpftrace myscript.bt -- p1 --aa=20 --bb p2
`}
In these examples there are two positional parameters (p1
, p2
) and two named parameters (aa
, which is set to 20
, and bb
, which is set to true
). Named program parameters require the =
to set their value unless they are boolean parameters (like 'bb' above). Read about how to access positional or named parameters in a bpftrace script below.
The Language
Syntax, types, and concepts for bpftrace are available here.
Probes
bpftrace supports various probe types which allow the user to attach BPF programs to different types of events. Each probe starts with a provider (e.g. kprobe
) followed by a colon (:
) separated list of options. The amount of options and their meaning depend on the provider.Full list of probe types.
Standard library
The standard library of all builtins, functions, and map functions is available here.
Configuration
Config Variables
Some behavior can only be controlled through config variables, which are listed here. These can be set via the Config Block directly in a script (before any probes) or via their environment variable equivalent, which is upper case and includes the `BPFTRACE_` prefix e.g. ``stack_mode``'s environment variable would be `BPFTRACE_STACK_MODE`.
Environment Variables
These are not available as part of the standard set of Config Variables and can only be set as environment variables.
BPFTRACE_BTF
Default: None
The path to a BTF file. By default, bpftrace searches several locations to find a BTF file. See src/btf.cpp for the details.
BPFTRACE_DEBUG_OUTPUT
Default: 0
Outputs bpftrace’s runtime debug messages to the trace_pipe. This feature can be turned on by setting the value of this environment variable to 1
.
BPFTRACE_KERNEL_BUILD
Default: /lib/modules/$(uname -r)
Only used with BPFTRACE_KERNEL_SOURCE
if it is out-of-tree Linux kernel build.
BPFTRACE_KERNEL_SOURCE
Default: /lib/modules/$(uname -r)
bpftrace requires kernel headers for certain features, which are searched for in this directory.
BPFTRACE_VMLINUX
Default: None
This specifies the vmlinux path used for kernel symbol resolution when attaching kprobe to offset. If this value is not given, bpftrace searches vmlinux from pre defined locations. See src/attached_probe.cpp:find_vmlinux() for details.
BPFTRACE_COLOR
Default: auto
Colorize the bpftrace log output message. Valid values are auto, always and never.
Options Expanded
Debug Output
The -d STAGE
option produces debug output. It prints the output of the bpftrace execution stage given by the STAGE argument. The option can be used multiple times (with different stage names) and the special value all
prints the output of all the supported stages. The option also takes multiple stages in one invocation as comma separated values.
Note: This is primarily used for bpftrace developers.
The supported options are:
| Prints the Abstract Syntax Tree (AST) after every pass. |
| Prints the unoptimized LLVM IR as produced by |
| Prints the optimized LLVM IR, i.e. the code which will be compiled into BPF bytecode. |
| Disassembles and prints out the generated bytecode that |
| Captures and prints libbpf log for all libbpf operations that bpftrace uses. |
| Captures and prints the BPF verifier log. |
| Prints the output of all of the above stages. |
Listing Probes
Probe listing is the method to discover which probes are supported by the current system. Listing supports the same syntax as normal attachment does and alternatively can be combined with -e
or filename args to see all the probes that a program would attach to.
# bpftrace -l 'kprobe:*' # bpftrace -l 't:syscalls:*openat* # bpftrace -l 'kprobe:tcp*,trace # bpftrace -l 'k:*socket*,tracepoint:syscalls:*tcp*' # bpftrace -l -e 'tracepoint:xdp:mem_* { exit(); }' # bpftrace -l my_script.bt # bpftrace -lv 'enum cpu_usage_stat'
The verbose flag (-v
) can be specified to inspect arguments (args
) for providers that support it:
# bpftrace -l 'fexit:tcp_reset,tracepoint:syscalls:sys_enter_openat' -v fexit:tcp_reset struct sock * sk struct sk_buff * skb tracepoint:syscalls:sys_enter_openat int __syscall_nr int dfd const char * filename int flags umode_t mode # bpftrace -l 'uprobe:/bin/bash:rl_set_prompt' -v # works only if /bin/bash has DWARF uprobe:/bin/bash:rl_set_prompt const char *prompt # bpftrace -lv 'struct css_task_iter' struct css_task_iter { struct cgroup_subsys *ss; unsigned int flags; struct list_head *cset_pos; struct list_head *cset_head; struct list_head *tcset_pos; struct list_head *tcset_head; struct list_head *task_pos; struct list_head *cur_tasks_head; struct css_set *cur_cset; struct css_set *cur_dcset; struct task_struct *cur_task; struct list_head iters_node; };
Preprocessor Options
The -I
option can be used to add directories to the list of directories that bpftrace uses to look for headers. Can be defined multiple times.
# cat program.bt #include <foo.h> BEGIN { @ = FOO } # bpftrace program.bt definitions.h:1:10: fatal error: 'foo.h' file not found # /tmp/include foo.h # bpftrace -I /tmp/include program.bt Attached 1 probe
The --include
option can be used to include headers by default. Can be defined multiple times. Headers are included in the order they are defined, and they are included before any other include in the program being executed.
# bpftrace --include linux/path.h --include linux/dcache.h -e 'kprobe:vfs_open { printf("open path: %s\n", str(((struct path *)arg0)->dentry->d_name.name)); }' Attached 1 probe open path: .com.google.Chrome.ASsbu2 open path: .com.google.Chrome.gimc10 open path: .com.google.Chrome.R1234s
Verbose Output
The -v
option prints more information about the program as it is run:
# bpftrace -v -e 'tracepoint:syscalls:sys_enter_nanosleep { printf("%s is sleeping.\n", comm); }' AST node count: 7 Attached 1 probe load tracepoint:syscalls:sys_enter_nanosleep, with BTF, with func_infos: Success Program ID: 111 Attaching tracepoint:syscalls:sys_enter_nanosleep iscsid is sleeping. iscsid is sleeping. [...]
Terminology
BPF | Berkeley Packet Filter: a kernel technology originally developed for optimizing the processing of packet filters (eg, tcpdump expressions). |
BPF map | A BPF memory object, which is used by bpftrace to create many higher-level objects. |
BTF | BPF Type Format: the metadata format which encodes the debug info related to BPF program/map. |
dynamic tracing | Also known as dynamic instrumentation, this is a technology that can instrument any software event, such as function calls and returns, by live modification of instruction text. Target software usually does not need special capabilities to support dynamic tracing, other than a symbol table that bpftrace can read. Since this instruments all software text, it is not considered a stable API, and the target functions may not be documented outside of their source code. |
eBPF | Enhanced BPF: a kernel technology that extends BPF so that it can execute more generic programs on any events, such as the bpftrace programs listed below. It makes use of the BPF sandboxed virtual machine environment. Also note that eBPF is often just referred to as BPF. |
kprobes | A Linux kernel technology for providing dynamic tracing of kernel functions. |
probe | An instrumentation point in software or hardware, that generates events that can execute bpftrace programs. |
static tracing | Hard-coded instrumentation points in code. Since these are fixed, they may be provided as part of a stable API, and documented. |
tracepoints | A Linux kernel technology for providing static tracing. |
uprobes | A Linux kernel technology for providing dynamic tracing of user-level functions. |
USDT | User Statically-Defined Tracing: static tracing points for user-level software. Some applications support USDT. |
Program Files
Programs saved as files are often called scripts and can be executed by specifying their file name. It is convention to use the .bt
file extension but it is not required.
For example, listing the sleepers.bt file using cat
:
# cat sleepers.bt tracepoint:syscalls:sys_enter_nanosleep { printf("%s is sleeping.\n", comm); }
And then calling it:
# bpftrace sleepers.bt Attached 1 probe iscsid is sleeping. iscsid is sleeping.
It can also be made executable to run stand-alone. Start by adding an interpreter line at the top (#!
) with either the path to your installed bpftrace (/usr/local/bin is the default) or the path to env
(usually just /usr/bin/env
) followed by bpftrace
(so it will find bpftrace in your $PATH
):
#!/usr/local/bin/bpftrace tracepoint:syscalls:sys_enter_nanosleep { printf("%s is sleeping.\n", comm); }
Then make it executable:
# chmod 755 sleepers.bt # ./sleepers.bt Attached 1 probe iscsid is sleeping. iscsid is sleeping.