On 16/01/19 00:58, Yonghong Song wrote:
> This patch added documentation for BTF (BPF Debug Format).
> The document is placed under linux:Documentation/bpf directory.
>
> Signed-off-by: Yonghong Song <y...@fb.com>

I like this a lot overall; it does a good job of explaining how the
various pieces fit together. See inline for review comments.

> --- > Documentation/bpf/btf.rst | 787 ++++++++++++++++++++++++++++++++++++ > Documentation/bpf/index.rst | 7 + > 2 files changed, 794 insertions(+) > create mode 100644 Documentation/bpf/btf.rst > > diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst > new file mode 100644 > index 000000000000..3dfa8edd22ac > --- /dev/null > +++ b/Documentation/bpf/btf.rst > @@ -0,0 +1,787 @@ > +===================== > +BPF Type Format (BTF) > +===================== > + > +1. Introduction > +*************** > + > +BTF (BPF Type Format) is the meta data format which > +encodes the debug info related to BPF program/map. > +The name BTF was used initially to describe > +data types. The BTF was later extended to include > +function info for defined subroutines, and line info > +for source/line information. > + > +The debug info is used for map pretty print, function > +signature, etc. The function signature enables better > +bpf program/function kernel symbol. > +The line info helps generate > +source annotated translated byte code, jited code > +and verifier log. > + > +The BTF specification contains two parts, > + * BTF kernel API > + * BTF ELF file format > + > +The kernel API is the contract between > +user space and kernel. The kernel verifies > +the BTF info before using it. > +The ELF file format is a user space contract > +between ELF file and libbpf loader. > + > +The type and string sections are part of the > +BTF kernel API, describing the debug info > +(mostly types related) referenced by the bpf program. > +These two sections are discussed in > +details in Section 2. > + > +2. BTF Type/String Encoding > +*************************** > + > +The file ``include/uapi/linux/btf.h`` provides high > +level definition on how types/strings are encoded. 
> +
> +The beginning of data blob must be::
> +
> +    struct btf_header {
> +        __u16   magic;
> +        __u8    version;
> +        __u8    flags;
> +        __u32   hdr_len;
> +
> +        /* All offsets are in bytes relative to the end of this header */
> +        __u32   type_off;       /* offset of type section */
> +        __u32   type_len;       /* length of type section */
> +        __u32   str_off;        /* offset of string section */
> +        __u32   str_len;        /* length of string section */
> +    };
> +
> +The magic is ``0xeB9F``, which has different encoding for big and little
> +endian system, and can be used to test whether BTF is generated for
> +big or little endian target.
> +The btf_header is designed to be extensible with hdr_len specifying
> +the struct btf_header length when the data blob is generated.

Should probably specify here whether hdr_len includes the whole header
or starts from offsetofend(hdr_len). (I believe it's the whole thing.)

> +
> +2.1 String Encoding
> +===================
> +
> +The first byte of string section must be ``'\0'`` to represent a null string.

Perhaps "empty string" is more precise than "null string"?

> +The rest of string table is a cancatenation of other strings.

sp: concatenation. Possibly also state that those other strings are
nul-terminated.

> +
> +2.2 Type Encoding
> +=================
> +
> +The type id ``0`` is reserved for ``void`` type.
> +The type section is parsed sequentially and the type id is assigned to
> +each recognized type starting from id ``1``.
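
To make the hdr_len and string-table questions above concrete, here is a
minimal C sketch of how a userspace reader might validate the header and
look up a string. It is only an illustration under stated assumptions:
hdr_len is taken to cover the whole header (my reading, not something the
patch states), only native-endian blobs are handled, and
struct btf_header_sketch, btf_magic_endianness() and btf_str_at() are
hypothetical names, not part of any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F

/* Illustrative copy of the struct btf_header quoted above. */
struct btf_header_sketch {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t hdr_len;   /* ASSUMPTION: length of the whole header */
    uint32_t type_off;
    uint32_t type_len;
    uint32_t str_off;
    uint32_t str_len;
};

/* Returns 0 for native endianness, 1 for byte-swapped, -1 if not BTF. */
static int btf_magic_endianness(uint16_t magic)
{
    if (magic == BTF_MAGIC)
        return 0;
    if (magic == (uint16_t)((BTF_MAGIC >> 8) | (BTF_MAGIC << 8)))
        return 1;
    return -1;
}

/*
 * Returns the nul-terminated string at offset 'off' in the string
 * section, or NULL if the blob or the offset is not acceptable.
 */
static const char *btf_str_at(const uint8_t *blob, size_t blob_len,
                              uint32_t off)
{
    struct btf_header_sketch hdr;
    const uint8_t *strs;
    size_t avail;

    assert(blob != NULL);
    if (blob_len < sizeof(hdr))
        return NULL;
    memcpy(&hdr, blob, sizeof(hdr));

    /* This sketch only handles blobs in the reader's own byte order. */
    if (btf_magic_endianness(hdr.magic) != 0)
        return NULL;
    if (hdr.hdr_len < sizeof(hdr) || hdr.hdr_len > blob_len)
        return NULL;

    /* Section offsets are relative to the end of the header. */
    avail = blob_len - hdr.hdr_len;
    if (hdr.str_off > avail || hdr.str_len > avail - hdr.str_off)
        return NULL;
    if (hdr.str_len == 0 || off >= hdr.str_len)
        return NULL;

    strs = blob + hdr.hdr_len + hdr.str_off;
    /* First byte is the empty string; last byte must terminate. */
    if (strs[0] != '\0' || strs[hdr.str_len - 1] != '\0')
        return NULL;
    return (const char *)(strs + off);
}
```

With a string section of "\0int\0", btf_str_at(blob, len, 1) would yield
"int" and offset 0 would yield the empty string. If hdr_len turns out to
mean something else, the two hdr_len checks are the only part that would
need to change.
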
> +Currently, the following types are supported:: > + > + #define BTF_KIND_INT 1 /* Integer */ > + #define BTF_KIND_PTR 2 /* Pointer */ > + #define BTF_KIND_ARRAY 3 /* Array */ > + #define BTF_KIND_STRUCT 4 /* Struct */ > + #define BTF_KIND_UNION 5 /* Union */ > + #define BTF_KIND_ENUM 6 /* Enumeration */ > + #define BTF_KIND_FWD 7 /* Forward */ > + #define BTF_KIND_TYPEDEF 8 /* Typedef */ > + #define BTF_KIND_VOLATILE 9 /* Volatile */ > + #define BTF_KIND_CONST 10 /* Const */ > + #define BTF_KIND_RESTRICT 11 /* Restrict */ > + #define BTF_KIND_FUNC 12 /* Function */ > + #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ > + > +Note that the type section encodes debug info, not just pure types. > +``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram. > + > +Each type contains the following common data:: > + > + struct btf_type { > + __u32 name_off; > + /* "info" bits arrangement > + * bits 0-15: vlen (e.g. # of struct's members) > + * bits 16-23: unused > + * bits 24-27: kind (e.g. int, ptr, array...etc) > + * bits 28-30: unused > + * bit 31: kind_flag, currently used by > + * struct, union and fwd > + */ > + __u32 info; > + /* "size" is used by INT, ENUM, STRUCT and UNION. > + * "size" tells the size of the type it is describing. > + * > + * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, > + * FUNC and FUNC_PROTO. > + * "type" is a type_id referring to another type. > + */ > + union { > + __u32 size; > + __u32 type; > + }; > + }; > + > +For certain kinds, the common data are followed by kind specific data. > +The ``name_off`` in ``struct btf_type`` specifies the offset in the string > table. > +The following details encoding of each kind. > + > +2.2.1 BTF_KIND_INT > +~~~~~~~~~~~~~~~~~~ > + > +``struct btf_type`` encoding requirement: > + * ``name_off``: any valid offset > + * ``info.kind_flag``: 0 > + * ``info.kind``: BTF_KIND_INT > + * ``info.vlen``: 0 > + * ``size``: the size of the int type in bytes. 
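
As an aside on the "info" bit arrangement in struct btf_type quoted above,
a short sketch of packing and unpacking the field may help readers check
their understanding. The helper names below are made up for illustration;
only the bit positions come from the quoted comment.

```c
#include <assert.h>
#include <stdint.h>

/* bits 0-15: vlen (e.g. number of struct members) */
static uint32_t btf_info_vlen(uint32_t info)
{
    return info & 0xffff;
}

/* bits 24-27: kind (BTF_KIND_INT, BTF_KIND_PTR, ...) */
static uint32_t btf_info_kind(uint32_t info)
{
    return (info >> 24) & 0x0f;
}

/* bit 31: kind_flag */
static uint32_t btf_info_kind_flag(uint32_t info)
{
    return info >> 31;
}

/* Packs the three fields; the unused bits are left as zero. */
static uint32_t btf_info_make(uint32_t kind, uint32_t vlen,
                              uint32_t kind_flag)
{
    assert(kind <= 0x0f && vlen <= 0xffff && kind_flag <= 1);
    return (kind_flag << 31) | (kind << 24) | vlen;
}
```

For instance, the values 0xd000000 and 0x1000000 that appear in the t2.s
dump later in this patch decode to kind 13 (BTF_KIND_FUNC_PROTO) and
kind 1 (BTF_KIND_INT) respectively, both with vlen 0, which matches the
comments in that dump.
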
> +
> +``btf_type`` is followed by a ``u32`` with following bits arrangement::
> +
> +    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
> +    #define BTF_INT_OFFSET(VAL)     (((VAL & 0x00ff0000)) >> 16)
> +    #define BTF_INT_BITS(VAL)       ((VAL) & 0x000000ff)
> +
> +The ``BTF_INT_ENCODING`` has the following attributes::
> +
> +    #define BTF_INT_SIGNED  (1 << 0)
> +    #define BTF_INT_CHAR    (1 << 1)
> +    #define BTF_INT_BOOL    (1 << 2)
> +
> +The ``BTF_INT_ENCODING()`` provides extra information, signness,
> +char, or bool, for the int type. The char and bool encoding
> +are mostly useful for pretty print. At most one encoding can
> +be specified for the int type.
> +
> +The ``BTF_INT_OFFSET()`` specifies the starting bit offset to
> +calculate values for this int.

That really doesn't make clear, at least to me, what this field is for.

> Typically it should be 0 and
> +currently both llvm and pahole generates ``BTF_INT_OFFSET() = 0``.
> +
> +The ``BTF_INT_BITS()`` specifies the number of actual bits held by
> +this int type. For example, a 4-bit bitfield encodes
> +``BTF_INT_BITS()`` equals to 4. The ``btf_type.size * 8``
> +must be equal to or greater than ``BTF_INT_BITS()`` for the type.
> +The maximum value of ``BTF_INT_BITS()`` is 128.
> +
> +2.2.2 BTF_KIND_PTR
> +~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_PTR
> +  * ``info.vlen``: 0
> +  * ``type``: the pointee type of the pointer
> +
> +No additional type data follow ``btf_type``.
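
Going back to the BTF_KIND_INT rules in 2.2.1, the constraints stated in
the text (at most one encoding, at most 128 bits, size * 8 >= bits) can
be sketched as a small check. This is only my reading of the prose, not
the kernel's verifier logic; btf_int_data_ok() is a hypothetical helper,
and I have parenthesised VAL in BTF_INT_OFFSET() where the quoted macro
does not.

```c
#include <assert.h>   /* used by the self-checks in the note below */
#include <stdbool.h>
#include <stdint.h>

#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL) & 0x000000ff)

#define BTF_INT_SIGNED  (1 << 0)
#define BTF_INT_CHAR    (1 << 1)
#define BTF_INT_BOOL    (1 << 2)

/*
 * size_bytes is btf_type.size, int_data is the u32 following btf_type.
 * Returns true if the pair satisfies the constraints stated in 2.2.1.
 */
static bool btf_int_data_ok(uint32_t size_bytes, uint32_t int_data)
{
    uint32_t enc = BTF_INT_ENCODING(int_data);
    uint32_t bits = BTF_INT_BITS(int_data);

    /* Only the three documented encoding bits are known. */
    if (enc & ~(uint32_t)(BTF_INT_SIGNED | BTF_INT_CHAR | BTF_INT_BOOL))
        return false;
    /* At most one encoding: zero or a single bit set. */
    if (enc & (enc - 1))
        return false;
    /* The maximum value of BTF_INT_BITS() is 128. */
    if (bits > 128)
        return false;
    /* btf_type.size * 8 must be >= BTF_INT_BITS(). */
    return (uint64_t)size_bytes * 8 >= bits;
}
```

As a self-check, the word 0x1000020 from the t2.s dump later in this
patch decodes to encoding SIGNED, offset 0 and 32 bits, so
btf_int_data_ok(4, 0x1000020) holds, while the same word with size 2
does not.
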
> +
> +2.2.3 BTF_KIND_ARRAY
> +~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_ARRAY
> +  * ``info.vlen``: 0
> +  * ``size/type``: 0, not used
> +
> +btf_type is followed by one "struct btf_array"::
> +
> +    struct btf_array {
> +        __u32   type;
> +        __u32   index_type;
> +        __u32   nelems;
> +    };
> +
> +The ``struct btf_array`` encoding:
> +  * ``type``: the element type
> +  * ``index_type``: the index type

Is this ever anything but u32? What is the purpose of this field's
existence?

> +  * ``nelems``: the number of elements for this array.
> +
> +For a multiple dimensional array, e.g., ``a[5][6]``, the btf_array.nelems
> +equals ``30``.

Does this mean that there is nothing in BTF to distinguish a
multi-dimensional array from a single-dimensional array of the same
size? Why is this done, rather than chaining BTF_ARRAY records?

> ``nelems = 0`` is also allowed.
> +
> +2.2.4 BTF_KIND_STRUCT
> +~~~~~~~~~~~~~~~~~~~~~
> +2.2.5 BTF_KIND_UNION
> +~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0 or offset to a valid C identifier
> +  * ``info.kind_flag``: 0 or 1
> +  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
> +  * ``info.vlen``: the number of struct/union members
> +  * ``info.size``: the size of the struct/union in bytes
> +
> +``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::
> +
> +    struct btf_member {
> +        __u32   name_off;
> +        __u32   type;
> +        __u32   offset;
> +    };
> +
> +``struct btf_member`` encoding:
> +  * ``name_off``: offset to a valid C identifier
> +  * ``type``: the member type
> +  * ``offset``: <see below>
> +
> +If the type info ``kind_flag`` is not set, the offset contains
> +only bit offset of the member. Note that the base type of the
> +bitfield can only be int or enum type. If the bitfield size
> +is 32, the base type can be either int or enum type.
> +If the bitfield size is not 32, the base type must be int,
> +and int type ``BTF_INT_BITS()`` encodes the bitfield size.
> +
> +If the ``kind_flag`` is set, the ``btf_member.offset``
> +contains both member bitfield size and bit offset. The
> +bitfield size and bit offset are calculated as below.::
> +
> +    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
> +    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)
> +
> +In this case, if the base type is an int type, it must
> +be a regular int type:
> +
> +  * ``BTF_INT_OFFSET()`` must be 0.
> +  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.

Probably worth referencing here the patch that added kind_flag, as that
explains why these two different modes exist.

> +
> +2.2.6 BTF_KIND_ENUM
> +~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0 or offset to a valid C identifier
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_ENUM
> +  * ``info.vlen``: number of enum values
> +  * ``size``: 4
> +
> +``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::
> +
> +    struct btf_enum {
> +        __u32   name_off;
> +        __s32   val;
> +    };
> +
> +The ``btf_enum`` encoding:
> +  * ``name_off``: offset to a valid C identifier
> +  * ``val``: any value
> +
> +2.2.7 BTF_KIND_FWD
> +~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: offset to a valid C identifier
> +  * ``info.kind_flag``: 0 for struct, 1 for union
> +  * ``info.kind``: BTF_KIND_FWD
> +  * ``info.vlen``: 0
> +  * ``type``: 0
> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.8 BTF_KIND_TYPEDEF
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: offset to a valid C identifier
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_TYPEDEF
> +  * ``info.vlen``: 0
> +  * ``type``: the type to be redefined

This is unclear phrasing.
How about:
  * ``type``: the type to be given a name
Because a typedef doesn't 'redefine' ``type``, it defines the _name_ as
referring to ``type``. (I realise my phrasing isn't the best either, but
I can't figure out how to further improve it.)

> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.9 BTF_KIND_VOLATILE
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_VOLATILE
> +  * ``info.vlen``: 0
> +  * ``type``: the type having volatile modifier

This is again a little bit blurry. ``type`` doesn't "have volatile
modifier"; it is rather the type _to which_ a volatile modifier is
applied to create the type defined by this record. Maybe something like
"the type to be volatile-qualified"? (Note that the C standard refers
to const, volatile and restrict as 'qualifiers', not 'modifiers', and we
should probably follow that terminology.)

> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.10 BTF_KIND_CONST
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_CONST
> +  * ``info.vlen``: 0
> +  * ``type``: the type having const modifier
> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.11 BTF_KIND_RESTRICT
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_RESTRICT
> +  * ``info.vlen``: 0
> +  * ``type``: the type having restrict modifier
> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.12 BTF_KIND_FUNC
> +~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: offset to a valid C identifier
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_FUNC
> +  * ``info.vlen``: 0
> +  * ``type``: a BTF_KIND_FUNC_PROTO type

You should put an explanation here of the semantics of this.
Above it was mentioned that BTF_KIND_FUNC does not declare a type, but
that should be expanded on here to more fully explain the relationship
between BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO. Perhaps something like:

    A BTF_KIND_FUNC defines, not a type, but a subprogram (function)
    whose signature is defined by ``type``; the subprogram is thus an
    instance of that type. The BTF_KIND_FUNC may in turn be referenced
    by a func_info in the `.BTF.ext section`__ (ELF) or in the arguments
    to BPF_PROG_LOAD__ (ABI).

    .. __: `4.2 .BTF.ext section`_
    .. __: `3.3 BPF_PROG_LOAD`_

> +
> +No additional type data follow ``btf_type``.
> +
> +2.2.13 BTF_KIND_FUNC_PROTO
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +``struct btf_type`` encoding requirement:
> +  * ``name_off``: 0
> +  * ``info.kind_flag``: 0
> +  * ``info.kind``: BTF_KIND_FUNC_PROTO
> +  * ``info.vlen``: # of parameters
> +  * ``type``: the return type
> +
> +``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::
> +
> +    struct btf_param {
> +        __u32   name_off;
> +        __u32   type;
> +    };
> +
> +If a BTF_KIND_FUNC_PROTO type is referred by a BTF_KIND_FUNC type,
> +then ``btf_param.name_off`` must point to a valid C identifier
> +except for the possible last argument representing the variable
> +argument. The btf_param.type refers to parameter type.
> +
> +If the function has the variable arguments, the last parameter

s/has the/has/

> +is encoded with ``name_off = 0`` and ``type = 0``.
> +
> +3. BTF Kernel API
> +*****************
> +
> +The following bpf syscall command involves BTF:
> +  * BPF_BTF_LOAD: load a blob of BTF data into kernel
> +  * BPF_MAP_CREATE: map creation with btf key and value type info.
> +  * BPF_PROG_LOAD: prog load with btf function and line info.
> +  * BPF_BTF_GET_FD_BY_ID: get a btf fd
> +  * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
> +    and other btf related info are returned.
> +
> +The workflow typically looks like:
> +::
> +
> +  Application:
> +      BPF_BTF_LOAD
> +          |
> +          v
> +      BPF_MAP_CREATE & BPF_PROG_LOAD
> +          |
> +          V
> +      ......
> +
> +  Introspection tool:
> +      ......
> +          |
> +          V
> +      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
> +          |
> +          V
> +      BPF_BTF_GET_FD_BY_ID (get btf_fd)
> +          |
> +          V
> +      BPF_OBJ_GET_INFO_BY_FD (get btf)
> +          |
> +          V
> +      pretty print types, dump func signatures and line info, etc.
> +
> +
> +3.1 BPF_BTF_LOAD
> +================
> +
> +Load a blob of BTF data into kernel. A blob of data
> +described in Section 2 can be directly loaded into the kernel.
> +A ``btf_fd`` returns to userspace.
> +
> +3.2 BPF_MAP_CREATE
> +==================
> +
> +A map can be created with ``btf_fd`` and specified key/value type id.::
> +
> +    __u32   btf_fd;         /* fd pointing to a BTF type data */
> +    __u32   btf_key_type_id;        /* BTF type_id of the key */
> +    __u32   btf_value_type_id;      /* BTF type_id of the value */
> +
> +In libbtf, if the map is specified like below in the bpf program:

Should this say libbpf?

> +::
> +
> +    struct bpf_map_def SEC("maps") btf_map = {
> +        .type = BPF_MAP_TYPE_ARRAY,
> +        .key_size = sizeof(int),
> +        .value_size = sizeof(struct ipv_counts),
> +        .max_entries = 4,
> +    };
> +    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
> +
> +Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name,
> +key and value types for the map.
> +During ELF parsing, libbpf is able to extract key/value type_id's
> +and assigned them to BPF_MAP_CREATE attributes automatically.
> +
> +3.3 BPF_PROG_LOAD
> +=================
> +
> +During prog_load, func_info and line_info can be passed to kernel with
> +proper values for the following attributes:
> +::
> +
> +    __u32           insn_cnt;
> +    __aligned_u64   insns;
> +    ......
> + __u32 prog_btf_fd; /* fd pointing to BTF type data */ > + __u32 func_info_rec_size; /* userspace bpf_func_info size > */ > + __aligned_u64 func_info; /* func info */ > + __u32 func_info_cnt; /* number of bpf_func_info records */ > + __u32 line_info_rec_size; /* userspace bpf_line_info size > */ > + __aligned_u64 line_info; /* line info */ > + __u32 line_info_cnt; /* number of bpf_line_info records */ > + > +The func_info and line_info are an array of below, respectively.:: > + > + struct bpf_func_info { > + __u32 insn_off; /* [0, insn_cnt - 1] */ > + __u32 type_id; /* pointing to a BTF_KIND_FUNC type */ > + }; > + struct bpf_line_info { > + __u32 insn_off; /* [0, insn_cnt - 1] */ > + __u32 file_name_off; /* offset to string table for the filename */ > + __u32 line_off; /* offset to string table for the source line */ > + __u32 line_col; /* line number and column number */ > + }; > + > +func_info_rec_size is the size of each func_info record, and > line_info_rec_size > +is the size of each line_info record. Passing the record size to kernel make > +it possible to extend the record itself in the future. > + > +Below are requirements for func_info: > + * func_info[0].insn_off must be 0. > + * the func_info insn_off is in strictly increasing order and matches > + bpf func boundaries. > + > +Below are requirements for line_info: > + * the first insn in each func must points to a line_info record. > + * the line_info insn_off is in strictly increasing order. > + > +For line_info, the line number and column number are defined as below: > +:: > + > + #define BPF_LINE_INFO_LINE_NUM(line_col) ((line_col) >> 10) > + #define BPF_LINE_INFO_LINE_COL(line_col) ((line_col) & 0x3ff) > + > +3.4 BPF_BTF_GET_FD_BY_ID > +======================== > + > + Given a btf id, a btf fd is returned. > + > +3.5 BPF_OBJ_GET_INFO_BY_FD > +========================== > + > +Users can get btf blob, bpf_map_info and bpf_prog_info. > +bpf_map_info returns btf_id, key/value type id. 

What exactly is btf_id in this case? The type_id of the
BPF_ANNOTATE_KV_PAIR struct?

> +bpf_prog_info returns btf_id, func_info and line info
> +for translated bpf byte codes, and jited_line_info.

In this case presumably btf_id is the type_id of the BTF_KIND_FUNC;
perhaps that should be stated explicitly too.

> +
> +4. ELF File Format Interface
> +****************************
> +
> +4.1 .BTF section
> +================
> +

Really you should state what this section is _supposed_ to contain
before starting to talk about what existing implementations generate.

> +pahole currently generates .BTF section with the same format
> +as described in Section 2. pahole doesn't generate
> +BTF_KIND_FUNC yet.
> +
> +llvm generates two sections .BTF and .BTF.ext.
> +The .BTF section has the same specification as in Section 2.
> +The .BTF.ext section encodes func_info and line_info which
> +needs loader manipulation before loading into the kernel.
> +
> +4.2 .BTF.ext section
> +====================
> +
> +The specification for .BTF.ext section is defined at
> +``tools/lib/bpf/btf.h`` and ``tools/lib/bpf/btf.c``.
> +
> +The current header of .BTF.ext section::
> +
> +    struct btf_ext_header {
> +        __u16   magic;
> +        __u8    version;
> +        __u8    flags;
> +        __u32   hdr_len;
> +
> +        /* All offsets are in bytes relative to the end of this header */
> +        __u32   func_info_off;
> +        __u32   func_info_len;
> +        __u32   line_info_off;
> +        __u32   line_info_len;
> +    };
> +
> +It is very similar to .BTF section. Instead of type/string section,
> +it contains func_info and line_info section.

Perhaps there should be a link back to §3.3 here, as that has the
definitions of structs bpf_func_info and bpf_line_info.

-Ed

> +
> +The func_info is organized as below.::
> +
> +    func_info_rec_size
> +    btf_ext_info_sec for section #1 /* func_info for section #1 */
> +    btf_ext_info_sec for section #2 /* func_info for section #2 */
> +    ...
> + > +``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure > +when .BTF.ext is generated. btf_ext_info_sec, defined below, is > +the func_info for each specific ELF section.:: > + > + struct btf_ext_info_sec { > + __u32 sec_name_off; /* offset to section name */ > + __u32 num_info; > + /* Followed by num_info * record_size number of bytes */ > + __u8 data[0]; > + }; > + > +Here, num_info must be greater than 0. > + > +The line_info is organized as below.:: > + > + line_info_rec_size > + btf_ext_info_sec for section #1 /* line_info for section #1 */ > + btf_ext_info_sec for section #2 /* line_info for section #2 */ > + ... > + > +``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure > +when .BTF.ext is generated. > + > +The interpretation of ``bpf_func_info->insn_off`` and > +``bpf_line_info->insn_off`` is different between kernel API and ELF API. > +For kernel API, the ``insn_off`` is the instruction offset in the unit > +of ``struct bpf_insn``. For ELF API, the ``insn_off`` is the byte offset > +from the beginning of section (``btf_ext_info_sec->sec_name_off``). > + > +5. Using BTF > +************ > + > +5.1 bpftool map pretty print > +============================ > + > +With BTF, the map key/value can be printed based on fields rather than > +simply raw bytes. This is especially > +valuable for large structure or if you data structure > +has bitfields. 
For example, for the following map,:: > + > + enum A { A1, A2, A3, A4, A5 }; > + typedef enum A ___A; > + struct tmp_t { > + char a1:4; > + int a2:4; > + int :4; > + __u32 a3:4; > + int b; > + ___A b1:4; > + enum A b2:4; > + }; > + struct bpf_map_def SEC("maps") tmpmap = { > + .type = BPF_MAP_TYPE_ARRAY, > + .key_size = sizeof(__u32), > + .value_size = sizeof(struct tmp_t), > + .max_entries = 1, > + }; > + BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t); > + > +bpftool is able to pretty print like below: > +:: > + > + [{ > + "key": 0, > + "value": { > + "a1": 0x2, > + "a2": 0x4, > + "a3": 0x6, > + "b": 7, > + "b1": 0x8, > + "b2": 0xa > + } > + } > + ] > + > +5.2 bpftool prog dump > +===================== > + > +The following is an example to show func_info and line_info > +can help prog dump with better ksym name, function prototype > +and line information.:: > + > + $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv > + [...] > + int test_long_fname_2(struct dummy_tracepoint_args * arg): > + bpf_prog_44a040bf25481309_test_long_fname_2: > + ; static int test_long_fname_2(struct dummy_tracepoint_args *arg) > + 0: push %rbp > + 1: mov %rsp,%rbp > + 4: sub $0x30,%rsp > + b: sub $0x28,%rbp > + f: mov %rbx,0x0(%rbp) > + 13: mov %r13,0x8(%rbp) > + 17: mov %r14,0x10(%rbp) > + 1b: mov %r15,0x18(%rbp) > + 1f: xor %eax,%eax > + 21: mov %rax,0x20(%rbp) > + 25: xor %esi,%esi > + ; int key = 0; > + 27: mov %esi,-0x4(%rbp) > + ; if (!arg->sock) > + 2a: mov 0x8(%rdi),%rdi > + ; if (!arg->sock) > + 2e: cmp $0x0,%rdi > + 32: je 0x0000000000000070 > + 34: mov %rbp,%rsi > + ; counts = bpf_map_lookup_elem(&btf_map, &key); > + [...] > + > +5.3 verifier log > +================ > + > +The following is an example how line_info can help verifier failure debug.:: > + > + /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c > + * is modified as below. 
> + */ > + data = (void *)(long)xdp->data; > + data_end = (void *)(long)xdp->data_end; > + /* > + if (data + 4 > data_end) > + return XDP_DROP; > + */ > + *(u32 *)data = dst->dst; > + > + $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline > type xdp > + ; data = (void *)(long)xdp->data; > + 224: (79) r2 = *(u64 *)(r10 -112) > + 225: (61) r2 = *(u32 *)(r2 +0) > + ; *(u32 *)data = dst->dst; > + 226: (63) *(u32 *)(r2 +0) = r1 > + invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0) > + R2 offset is outside of the packet > + > +6. BTF Generation > +***************** > + > +You need latest pahole > + > + https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ > + > +or llvm (8.0 or later). The pahole acts as a dwarf2btf converter. It doesn't > support .BTF.ext > +and btf BTF_KIND_FUNC type yet. For example,:: > + > + -bash-4.4$ cat t.c > + struct t { > + int a:2; > + int b:3; > + int c:2; > + } g; > + -bash-4.4$ gcc -c -O2 -g t.c > + -bash-4.4$ pahole -JV t.o > + File t.o: > + [1] STRUCT t kind_flag=1 size=4 vlen=3 > + a type_id=2 bitfield_size=2 bits_offset=0 > + b type_id=2 bitfield_size=3 bits_offset=2 > + c type_id=2 bitfield_size=2 bits_offset=5 > + [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED > + > +The llvm is able to generate .BTF and .BTF.ext directly with -g for bpf > target only. > +The assembly code (-S) is able to show the BTF encoding in assembly format.:: > + > + -bash-4.4$ cat t2.c > + typedef int __int32; > + struct t2 { > + int a2; > + int (*f2)(char q1, __int32 q2, ...); > + int (*f3)(); > + } g2; > + int main() { return 0; } > + int test() { return 0; } > + -bash-4.4$ clang -c -g -O2 -target bpf t2.c > + -bash-4.4$ readelf -S t2.o > + ...... 
> + [ 8] .BTF PROGBITS 0000000000000000 00000247 > + 000000000000016e 0000000000000000 0 0 1 > + [ 9] .BTF.ext PROGBITS 0000000000000000 000003b5 > + 0000000000000060 0000000000000000 0 0 1 > + [10] .rel.BTF.ext REL 0000000000000000 000007e0 > + 0000000000000040 0000000000000010 16 9 8 > + ...... > + -bash-4.4$ clang -S -g -O2 -target bpf t2.c > + -bash-4.4$ cat t2.s > + ...... > + .section .BTF,"",@progbits > + .short 60319 # 0xeb9f > + .byte 1 > + .byte 0 > + .long 24 > + .long 0 > + .long 220 > + .long 220 > + .long 122 > + .long 0 # BTF_KIND_FUNC_PROTO(id = 1) > + .long 218103808 # 0xd000000 > + .long 2 > + .long 83 # BTF_KIND_INT(id = 2) > + .long 16777216 # 0x1000000 > + .long 4 > + .long 16777248 # 0x1000020 > + ...... > + .byte 0 # string offset=0 > + .ascii ".text" # string offset=1 > + .byte 0 > + .ascii "/home/yhs/tmp-pahole/t2.c" # string offset=7 > + .byte 0 > + .ascii "int main() { return 0; }" # string offset=33 > + .byte 0 > + .ascii "int test() { return 0; }" # string offset=58 > + .byte 0 > + .ascii "int" # string offset=83 > + ...... > + .section .BTF.ext,"",@progbits > + .short 60319 # 0xeb9f > + .byte 1 > + .byte 0 > + .long 24 > + .long 0 > + .long 28 > + .long 28 > + .long 44 > + .long 8 # FuncInfo > + .long 1 # FuncInfo section string > offset=1 > + .long 2 > + .long .Lfunc_begin0 > + .long 3 > + .long .Lfunc_begin1 > + .long 5 > + .long 16 # LineInfo > + .long 1 # LineInfo section string > offset=1 > + .long 2 > + .long .Ltmp0 > + .long 7 > + .long 33 > + .long 7182 # Line 7 Col 14 > + .long .Ltmp3 > + .long 7 > + .long 58 > + .long 8206 # Line 8 Col 14 > + > +7. Testing > +********** > + > +Kernel bpf selftest `test_btf.c` provides extensive set of BTF related tests. > diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst > index 00a8450a602f..4e77932959cc 100644 > --- a/Documentation/bpf/index.rst > +++ b/Documentation/bpf/index.rst > @@ -15,6 +15,13 @@ that goes into great technical depth about the BPF > Architecture. 
> The primary info for the bpf syscall is available in the `man-pages`_ > for `bpf(2)`_. > > +BPF Type Format (BTF) > +===================== > + > +.. toctree:: > + :maxdepth: 1 > + > + btf > > > Frequently asked questions (FAQ)