Ping.
On 2/20/25 14:24, David Faust wrote:
>
> Gentle ping for this series.
> https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675241.html
>
> On 2/6/25 11:54, David Faust wrote:
>> [v1: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666911.html
>> Changes from v1:
>> - Fix a bug in v1 related to generating DWARF for type tags applied to
>> struct or union types, especially if the type had multiple type tags
>> or was also part of a typedef.
>> - Simplified the dwarf2ctf translation of types having both cv-qualifiers
>> and type tags applied to them.
>> - Add a few new tests.
>> - Address review comments from v1. ]
>>
>> This patch series adds support for the btf_decl_tag and btf_type_tag attributes
>> to GCC. This entails:
>>
>> - Two new C-family attributes that associate (i.e. "tag") particular
>> declarations and types with arbitrary strings. As explained below, this is
>> intended to be used to, for example, characterize certain pointer types. A
>> single declaration or type may have multiple occurrences of these
>> attributes.
>>
>> - The conveyance of that information in the DWARF output in the form of a new
>> DIE: DW_TAG_GNU_annotation, and a new attribute: DW_AT_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of two new
>> kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. These BTF
>> kinds are already supported by LLVM and other tools in the BPF ecosystem.
>>
>> Both of these attributes are already supported by clang, and are beginning to
>> be used in various ways by BPF users and inside the Linux kernel.
>>
>> It is worth noting that while the Linux kernel and BPF/BTF are the motivating
>> use case of this feature, the format of the new DWARF extension is generic. This
>> work could be easily adapted to provide a general way for program authors to
>> annotate types and declarations with arbitrary information for any
>> post-compilation analysis needs, not just the Linux kernel BPF verifier. For
>> example, these annotations could be used to aid in ABI analysis.
>>
>> Purpose
>> =======
>>
>> 1) Addition of C-family language constructs (attributes) to specify free-text
>> tags on certain language elements, such as struct fields.
>>
>> The purpose of these annotations is to provide additional information about
>> types, variables, and function parameters of interest to the kernel. A
>> driving use case is to tag pointer types within the Linux kernel and BPF
>> programs with additional semantic information, such as '__user' or
>> '__rcu'.
>>
>> For example, consider the Linux kernel function do_execve with the
>> following declaration:
>>
>>   static int do_execve(struct filename *filename,
>>                        const char __user *const __user *__argv,
>>                        const char __user *const __user *__envp);
>>
>> Here, __user could be defined with these annotations to record semantic
>> information about the pointer parameters (e.g., they are user-provided) in
>> DWARF and BTF information. Other kernel facilities such as the BPF verifier
>> can read the tags and make use of the information.
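
A minimal sketch of how such a macro could be defined. The macro name TAG_USER and the helper count_args are hypothetical, chosen to avoid the reserved identifier __user, and the fallback branch assumes that a compiler without the attribute should simply ignore the tag:

```c
#include <stddef.h>

/* Compilers that lack __has_attribute are treated as lacking the tag. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* TAG_USER is a hypothetical stand-in for the kernel's __user macro. */
#if __has_attribute(btf_type_tag)
# define TAG_USER __attribute__((btf_type_tag("user")))
#else
# define TAG_USER
#endif

/* Count the entries of a NULL-terminated vector whose pointer levels
   carry the tag, mirroring the shape of the __argv parameter above.
   The tag changes only the emitted debug info, not the generated code.  */
size_t count_args(const char TAG_USER *const TAG_USER *argv)
{
    size_t n = 0;

    while (argv != NULL && argv[n] != NULL)
        n++;
    return n;
}
```

With a compiler that supports the attribute, the tag should appear in the DWARF and BTF for the parameter type; elsewhere the macro expands to nothing and the code builds unchanged.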
>>
>> 2) Conveying the tags in the generated DWARF debug info.
>>
>> The main motivation for emitting the tags in DWARF is that the Linux kernel
>> generates its BTF information via pahole, using DWARF as a source:
>>
>>     +--------+  BTF                  BTF   +----------+
>>     | pahole |-------> vmlinux.btf ------->| verifier |
>>     +--------+                             +----------+
>>         ^                                        ^
>>         |                                        |
>>   DWARF |                                    BTF |
>>         |                                        |
>>      vmlinux                              +-------------+
>>      module1.ko                           | BPF program |
>>      module2.ko                           +-------------+
>>        ...
>>
>> This is because:
>>
>> a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>> b) GCC can generate BTF for any target with -gbtf, but there is no
>> support for linking/deduplicating BTF in the linker.
>>
>> c) pahole injects additional BTF information based on specific knowledge
>> of kernel objects which is not available to the compiler.
>>
>> In the scenario above, the verifier needs access to the pointer tags of
>> both the kernel types/declarations (conveyed in the DWARF and translated
>> to BTF by pahole) and those of the BPF program (available directly in
>> BTF).
>>
>> Another motivation for having the tag information in DWARF, unrelated to
>> BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>> to benefit from these tags in order to differentiate between different
>> kinds of pointers in the kernel.
>>
>> 3) Conveying the tags in the generated BTF debug info.
>>
>> This is easy: the main purpose of having this info in BTF is for the
>> compiled BPF programs. The kernel verifier can then access the tags
>> of pointers used by the BPF programs.
>>
>> For more information about these tags and the motivation behind them, please
>> refer to the following Linux kernel discussions: [1], [2], [3].
>>
>> DWARF Representation
>> ====================
>>
>> Compared to prior iterations of this work, this patch series introduces a new
>> DWARF representation meant to address issues in the previously proposed
>> format. The format is detailed below.
>>
>> Note that the obvious solution of introducing a new DIE to be chained in type
>> chains similar to type modifiers like const and volatile is not feasible
>> because it would break DWARF readers.
>>
>> New DWARF extension: DW_TAG_GNU_annotation. These DIEs encode the annotation
>> information. They exist near the top level of the DIE tree as children of the
>> compilation unit DIE. The user-supplied annotations ("tags") are encoded via
>> DW_AT_name and DW_AT_const_value. DW_AT_name holds the name of the attribute
>> which is the source of the annotation (currently only "btf_type_tag" or
>> "btf_decl_tag"). DW_AT_const_value holds the arbitrary user string from the
>> attribute argument.
>>
>> DW_TAG_GNU_annotation
>>   DW_AT_name: "btf_decl_tag" or "btf_type_tag"
>>   DW_AT_const_value: <arbitrary user-provided string from attribute arg>
>>   DW_AT_GNU_annotation: see below.
>>
>> New DWARF extension: DW_AT_GNU_annotation. If present, the
>> DW_AT_GNU_annotation attribute is a reference to a DW_TAG_GNU_annotation DIE
>> holding annotations for the object.
>>
>> If a single declaration or type at the language level has multiple occurrences
>> of the btf_decl_tag or btf_type_tag attribute, then the DW_TAG_GNU_annotation
>> DIE referenced by that object will itself have DW_AT_GNU_annotation referring
>> to another annotation DIE. In this way the annotation DIEs are chained
>> together.
>>
>> Multiple distinct declarations or types may refer via DW_AT_GNU_annotation to
>> the same DW_TAG_GNU_annotation DIE, if they share the same tags.
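
The chaining and sharing described above can be sketched with a small in-memory model. The struct and function names here are hypothetical illustrations, not part of the patch or of any DWARF reader; each node stands for one DW_TAG_GNU_annotation DIE:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one DW_TAG_GNU_annotation DIE. */
struct annotation_die {
    const char *name;                  /* DW_AT_name: attribute kind */
    const char *value;                 /* DW_AT_const_value: user string */
    const struct annotation_die *next; /* DW_AT_GNU_annotation, or NULL */
};

/* Count the tags reachable from the DIE an object refers to. */
size_t annotation_count(const struct annotation_die *die)
{
    size_t n = 0;

    for (; die != NULL; die = die->next)
        n++;
    return n;
}

/* Return 1 if the chain holds the given tag, 0 otherwise. */
int annotation_has(const struct annotation_die *die,
                   const char *name, const char *value)
{
    for (; die != NULL; die = die->next)
        if (strcmp(die->name, name) == 0 && strcmp(die->value, value) == 0)
            return 1;
    return 0;
}
```

Because each node only points onward, two objects can refer to the same node when their remaining tags are identical, which is what allows annotation DIEs to be shared.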
>>
>> For more information on this format, please refer to recent talks at GNU Tools
>> Cauldron [4] and Linux Plumbers Conference [5]. Older iterations of this work
>> and related discussions may be found in [6,7,8].
>>
>> BTF Representation
>> ==================
>>
>> In BTF, BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records convey the annotations.
>> These records hold the annotation value in their name field, and refer to the
>> annotated object by BTF ID.
>>
>> BTF_KIND_DECL_TAG records are followed by an additional 32-bit 'component_idx',
>> which indicates to which component of an object the tag applies. This index
>> is -1 if the tag applies to a variable or function declaration itself;
>> otherwise it is a 0-based index indicating to which function argument or
>> struct or union member the tag applies.
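
As a rough sketch of the record layout just described: the struct and helper names below are hypothetical, while the kind numbers (17 and 18) and the position of the kind field (bits 24-28 of 'info') are taken from the Linux UAPI header linux/btf.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Kind numbers mirror BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. */
enum { KIND_DECL_TAG = 17, KIND_TYPE_TAG = 18 };

/* Common 12-byte record header shared by both tag kinds. */
struct tag_record {
    uint32_t name_off; /* string-table offset of the tag text */
    uint32_t info;     /* kind is stored in bits 24-28 */
    uint32_t type;     /* BTF ID of the annotated object */
};

/* Trailing data present only after a DECL_TAG record. */
struct decl_tag_extra {
    int32_t component_idx; /* -1, or 0-based argument/member index */
};

/* Extract the kind from the info word. */
unsigned record_kind(const struct tag_record *r)
{
    return (r->info >> 24) & 0x1f;
}

/* Total bytes occupied by the record, including any trailing data. */
size_t record_size(const struct tag_record *r)
{
    size_t size = sizeof(struct tag_record);

    if (record_kind(r) == KIND_DECL_TAG)
        size += sizeof(struct decl_tag_extra);
    return size;
}
```

A reader stepping through the type section would advance by record_size for these two kinds; other kinds carry their own trailing data and are outside the scope of this sketch.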
>>
>> Example: btf_decl_tag
>> =====================
>>
>> Consider the following declarations:
>>
>>   int *x __attribute__((btf_decl_tag ("rw"), btf_decl_tag ("devicemem")));
>>   struct {
>>     int size;
>>     char *ptr __attribute__((btf_decl_tag("rw")));
>>   } y;
>>
>> These declarations produce the following DWARF information:
>>
>> <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>> <1f> DW_AT_name : x
>> <24> DW_AT_type : <0x36>
>> <28> DW_AT_GNU_annotation: <0x4a>
>> ...
>> <1><36>: Abbrev Number: 1 (DW_TAG_pointer_type)
>> <37> DW_AT_byte_size : 8
>> <37> DW_AT_type : <0x3b>
>> <1><3b>: Abbrev Number: 4 (DW_TAG_base_type)
>> <3e> DW_AT_name : int
>> ...
>> <1><42>: Abbrev Number: 5 (DW_TAG_GNU_annotation)
>> <43> DW_AT_name : (indirect string, offset: 0): btf_decl_tag
>> <47> DW_AT_const_value : rw
>> <1><4a>: Abbrev Number: 6 (DW_TAG_GNU_annotation)
>> <4b> DW_AT_name : (indirect string, offset: 0): btf_decl_tag
>> <4f> DW_AT_const_value : (indirect string, offset: 0x1f): devicemem
>> <53> DW_AT_GNU_annotation: <0x42>
>> <1><57>: Abbrev Number: 7 (DW_TAG_structure_type)
>> ...
>> <2><60>: Abbrev Number: 8 (DW_TAG_member)
>> <61> DW_AT_name : (indirect string, offset: 0x1a): size
>> <68> DW_AT_type : <0x3b>
>> ...
>> <2><6d>: Abbrev Number: 9 (DW_TAG_member)
>> <6e> DW_AT_name : ptr
>> <75> DW_AT_type : <0x7f>
>> <7a> DW_AT_GNU_annotation: <0x42>
>> ...
>> <2><7e>: Abbrev Number: 0
>> <1><7f>: Abbrev Number: 1 (DW_TAG_pointer_type)
>> <80> DW_AT_byte_size : 8
>> <80> DW_AT_type : <0x84>
>> <1><84>: Abbrev Number: 10 (DW_TAG_base_type)
>> <85> DW_AT_byte_size : 1
>> <86> DW_AT_encoding : 6 (signed char)
>> <87> DW_AT_name : (indirect string, offset: 0x5e): char
>> <1><8b>: Abbrev Number: 11 (DW_TAG_variable)
>> <8c> DW_AT_name : y
>> <91> DW_AT_type : <0x57>
>> ...
>>
>> The variable DIE for 'x' refers by DW_AT_GNU_annotation to the DIE holding the
>> annotation for the "devicemem" tag, which in turn refers to the DIE holding
>> the annotation for "rw". The DW_TAG_member DIE for the member 'ptr' of the
>> struct refers to the annotation DIE for "rw" directly, which is thereby shared
>> between the two declarations.
>>
>> And BTF information:
>>
>> [1] STRUCT '(anon)' size=16 vlen=2
>> 'size' type_id=2 bits_offset=0
>> 'ptr' type_id=3 bits_offset=64
>> [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>> [3] PTR '(anon)' type_id=4
>> [4] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
>> [5] PTR '(anon)' type_id=2
>> [6] DECL_TAG 'devicemem' type_id=10 component_idx=-1
>> [7] DECL_TAG 'rw' type_id=10 component_idx=-1
>> [8] DECL_TAG 'rw' type_id=1 component_idx=1
>> [9] VAR 'y' type_id=1, linkage=global
>> [10] VAR 'x' type_id=5, linkage=global
>>
>> Note how the component_idx identifies to which member of the struct type the
>> decl tag is applied.
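
One way a consumer might resolve component_idx, shown as a sketch only. The function decl_tag_target and its parameters are hypothetical; a real consumer would look the names up in the BTF string table rather than receive them as C strings:

```c
#include <stddef.h>

/* Resolve the target of a decl tag.
   owner:      name of the object the DECL_TAG record refers to by BTF ID
   components: names of its members or arguments, in declaration order
   ncomp:      number of entries in components
   Returns owner for component_idx == -1, the matching component name
   for a valid 0-based index, and NULL for an out-of-range index.  */
const char *decl_tag_target(const char *owner,
                            const char *const *components, size_t ncomp,
                            int component_idx)
{
    if (component_idx == -1)
        return owner;
    if (component_idx < 0 || (size_t)component_idx >= ncomp)
        return NULL;
    return components[component_idx];
}
```

For the struct in the example, whose members are 'size' and 'ptr' in that order, component_idx=1 selects 'ptr', matching record [8] in the dump above.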
>>
>>
>> Example: btf_type_tag
>> =====================
>>
>> Consider the following code snippet:
>>
>>   int __attribute__((btf_type_tag("rcu"), btf_type_tag ("foo"))) x;
>>
>>   void
>>   do_thing (struct S * __attribute__((btf_type_tag ("rcu"))) rcu_s,
>>             void * __attribute__((btf_type_tag("foo"))) ptr)
>>   { ... }
>>
>> The relevant DWARF information produced is as follows:
>>
>> <1><2e>: Abbrev Number: 3 (DW_TAG_structure_type)
>> <2f> DW_AT_name : S
>> ...
>> <1><46>: Abbrev Number: 5 (DW_TAG_base_type)
>> <47> DW_AT_byte_size : 4
>> <48> DW_AT_encoding : 5 (signed)
>> <49> DW_AT_name : int
>> <1><4d>: Abbrev Number: 6 (DW_TAG_variable)
>> <4e> DW_AT_name : x
>> <53> DW_AT_type : <0x61>
>> ...
>> <1><61>: Abbrev Number: 7 (DW_TAG_base_type)
>> <62> DW_AT_byte_size : 4
>> <63> DW_AT_encoding : 5 (signed)
>> <64> DW_AT_name : int
>> <68> DW_AT_GNU_annotation: <0x75>
>> <1><6c>: Abbrev Number: 1 (DW_TAG_GNU_annotation)
>> <6d> DW_AT_name : (indirect string, offset: 0x13): btf_type_tag
>> <71> DW_AT_const_value : rcu
>> <1><75>: Abbrev Number: 8 (DW_TAG_GNU_annotation)
>> <76> DW_AT_name : (indirect string, offset: 0x13): btf_type_tag
>> <7a> DW_AT_const_value : foo
>> <7e> DW_AT_GNU_annotation: <0x6c>
>> <1><82>: Abbrev Number: 9 (DW_TAG_subprogram)
>> <83> DW_AT_name : (indirect string, offset: 0x20): do_thing
>> ...
>> <2><a1>: Abbrev Number: 10 (DW_TAG_formal_parameter)
>> <a2> DW_AT_name : (indirect string, offset: 0x5): rcu_s
>> <a9> DW_AT_type : <0xc0>
>> ...
>> <2><b0>: Abbrev Number: 11 (DW_TAG_formal_parameter)
>> <b1> DW_AT_name : ptr
>> <b8> DW_AT_type : <0xca>
>> ...
>> <2><bf>: Abbrev Number: 0
>> <1><c0>: Abbrev Number: 12 (DW_TAG_pointer_type)
>> <c1> DW_AT_byte_size : 8
>> <c2> DW_AT_type : <0x2e>
>> <c6> DW_AT_GNU_annotation: <0x6c>
>> <1><ca>: Abbrev Number: 13 (DW_TAG_pointer_type)
>> <cb> DW_AT_byte_size : 8
>> <cc> DW_AT_GNU_annotation: <0xd0>
>> <1><d0>: Abbrev Number: 1 (DW_TAG_GNU_annotation)
>> <d1> DW_AT_name : (indirect string, offset: 0x13): btf_type_tag
>> <d5> DW_AT_const_value : foo
>>
>> Note how in this case, two annotation DIEs for "foo" are produced, because
>> it is used in two distinct sets of type tags which do not allow it to be
>> shared. The DIE for "rcu", however, is shared between uses.
>>
>> And BTF information:
>>
>> [1] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
>> 'rcu_s' type_id=2
>> 'ptr' type_id=6
>> [2] TYPE_TAG 'rcu' type_id=3
>> [3] PTR '(anon)' type_id=4
>> [4] STRUCT 'S' size=4 vlen=1
>> ...
>> [6] TYPE_TAG 'foo' type_id=7
>> [7] PTR '(anon)' type_id=0
>> [8] TYPE_TAG 'foo' type_id=9
>> [9] TYPE_TAG 'rcu' type_id=10
>> [10] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>> [11] VAR 'x' type_id=8, linkage=global
>> [12] FUNC 'do_thing' type_id=1 linkage=global
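
To illustrate how the TYPE_TAG records in this dump chain together, here is a sketch that walks from a starting BTF ID through consecutive TYPE_TAG records and collects their names. The table representation (struct btf_rec, indexed directly by BTF ID) and the function name are hypothetical simplifications, not the on-disk BTF encoding:

```c
#include <stddef.h>

/* Hypothetical, simplified record kinds for the walk. */
enum rec_kind { REC_OTHER = 0, REC_TYPE_TAG };

/* One decoded record; the array index is the BTF ID. */
struct btf_rec {
    enum rec_kind kind;
    const char *name;  /* tag text for REC_TYPE_TAG */
    unsigned type_id;  /* BTF ID this record refers to */
};

/* Starting at 'id', follow TYPE_TAG records and store their names in
   'out' (at most 'max' of them). Returns the number of tags found.
   The walk stops at the first record that is not a TYPE_TAG.  */
size_t collect_type_tags(const struct btf_rec *types, size_t ntypes,
                         unsigned id, const char **out, size_t max)
{
    size_t n = 0;

    while (id < ntypes && types[id].kind == REC_TYPE_TAG && n < max) {
        out[n++] = types[id].name;
        id = types[id].type_id;
    }
    return n;
}
```

Starting from the type of VAR 'x' (type_id=8), the walk visits [8] 'foo' and [9] 'rcu' before stopping at the INT record [10].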
>>
>> References
>> ==========
>>
>> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
>> [2] https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
>> [3] https://lore.kernel.org/bpf/20211112012604.1504583-1-...@fb.com/
>> [4] https://gcc.gnu.org/wiki/cauldron2024#cauldron2024talks.what_is_new_in_the_bpf_support_in_the_gnu_toolchain
>> [5] https://lpc.events/event/18/contributions/1924/
>> [6] https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592685.html
>> [7] https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596355.html
>> [8] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624156.html
>>
>>
>> David Faust (5):
>> c-family: add btf_type_tag and btf_decl_tag attributes
>> dwarf: create annotation DIEs for btf tags
>> ctf: translate annotation DIEs to internal ctf
>> btf: generate and output DECL_TAG and TYPE_TAG records
>> doc: document btf_type_tag and btf_decl_tag attributes
>>
>> gcc/btfout.cc | 171 +++++++++--
>> gcc/c-family/c-attribs.cc | 25 +-
>> gcc/ctfc.cc | 70 ++++-
>> gcc/ctfc.h | 41 ++-
>> gcc/doc/extend.texi | 68 +++++
>> gcc/dwarf2ctf.cc | 152 +++++++++-
>> gcc/dwarf2out.cc | 275 +++++++++++++++++-
>> .../gcc.dg/debug/btf/btf-decl-tag-1.c | 14 +
>> .../gcc.dg/debug/btf/btf-decl-tag-2.c | 22 ++
>> .../gcc.dg/debug/btf/btf-decl-tag-3.c | 22 ++
>> .../gcc.dg/debug/btf/btf-decl-tag-4.c | 34 +++
>> .../gcc.dg/debug/btf/btf-type-tag-1.c | 27 ++
>> .../gcc.dg/debug/btf/btf-type-tag-2.c | 15 +
>> .../gcc.dg/debug/btf/btf-type-tag-3.c | 21 ++
>> .../gcc.dg/debug/btf/btf-type-tag-4.c | 25 ++
>> .../gcc.dg/debug/btf/btf-type-tag-5.c | 35 +++
>> .../gcc.dg/debug/btf/btf-type-tag-6.c | 15 +
>> .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c | 23 ++
>> .../gcc.dg/debug/ctf/ctf-decl-tag-1.c | 31 ++
>> .../gcc.dg/debug/ctf/ctf-type-tag-1.c | 19 ++
>> .../debug/dwarf2/dwarf-btf-decl-tag-1.c | 11 +
>> .../debug/dwarf2/dwarf-btf-decl-tag-2.c | 25 ++
>> .../debug/dwarf2/dwarf-btf-decl-tag-3.c | 21 ++
>> .../debug/dwarf2/dwarf-btf-type-tag-1.c | 10 +
>> .../debug/dwarf2/dwarf-btf-type-tag-2.c | 31 ++
>> .../debug/dwarf2/dwarf-btf-type-tag-3.c | 15 +
>> include/btf.h | 14 +
>> include/ctf.h | 4 +
>> include/dwarf2.def | 4 +
>> 29 files changed, 1194 insertions(+), 46 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-3.c
>>
>