Ping.

On 2/20/25 14:24, David Faust wrote:
> 
> Gentle ping for this series.
> https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675241.html
> 
> On 2/6/25 11:54, David Faust wrote:
>> [v1: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666911.html
>>  Changes from v1:
>>  - Fix a bug in v1 related to generating DWARF for type tags applied to
>>    struct or union types, especially if the type had multiple type tags
>>    or was also part of a typedef.
>>  - Simplified the dwarf2ctf translation of types having both cv-qualifiers
>>    and type tags applied to them.
>>  - Add a few new tests.
>>  - Address review comments from v1.  ]
>>
>> This patch series adds support for the btf_decl_tag and btf_type_tag
>> attributes to GCC. This entails:
>>
>> - Two new C-family attributes that associate ("tag") particular
>>   declarations and types with arbitrary strings. As explained below, this is
>>   intended, for example, to characterize certain pointer types.  A single
>>   declaration or type may have multiple occurrences of these attributes.
>>
>> - The conveyance of that information in the DWARF output in the form of a new
>>   DIE: DW_TAG_GNU_annotation, and a new attribute: DW_AT_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of two new
>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. These BTF
>>   kinds are already supported by LLVM and other tools in the BPF ecosystem.
>>
>> Both of these attributes are already supported by clang, and beginning to be
>> used in various ways by BPF users and inside the Linux kernel.
>>
>> It is worth noting that while the Linux kernel and BPF/BTF are the motivating
>> use case for this feature, the format of the new DWARF extension is generic.
>> This work could easily be adapted to provide a general way for program
>> authors to annotate types and declarations with arbitrary information for any
>> post-compilation analysis needs, not just the Linux kernel BPF verifier.  For
>> example, these annotations could be used to aid in ABI analysis.
>>
>> Purpose
>> =======
>>
>> 1)  Addition of C-family language constructs (attributes) to specify
>>     free-text tags on certain language elements, such as struct fields.
>>
>>     The purpose of these annotations is to provide additional information
>>     about types, variables, and function parameters of interest to the
>>     kernel. A driving use case is to tag pointer types within the Linux
>>     kernel and BPF programs with additional semantic information, such as
>>     '__user' or '__rcu'.
>>
>>     For example, consider the Linux kernel function do_execve with the
>>     following declaration:
>>
>>       static int do_execve(struct filename *filename,
>>          const char __user *const __user *__argv,
>>          const char __user *const __user *__envp);
>>
>>     Here, __user could be defined with these annotations to record semantic
>>     information about the pointer parameters (e.g., they are user-provided)
>>     in DWARF and BTF information. Other kernel facilities such as the BPF
>>     verifier can read the tags and make use of the information.
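
A minimal sketch of how such a macro could be defined with the new type tag
attribute. The names TYPE_TAG, example_user and count_args are hypothetical
and chosen only for illustration; the __has_attribute guard is an assumption
that lets the snippet build on compilers that lack the attribute, where the
macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Compilers without __has_attribute are treated as lacking the attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand to the type tag attribute where available, else to nothing. */
#if __has_attribute(btf_type_tag)
# define TYPE_TAG(s) __attribute__((btf_type_tag(s)))
#else
# define TYPE_TAG(s)
#endif

/* Hypothetical stand-in for the kernel's __user macro. */
#define example_user TYPE_TAG("user")

/* Same parameter shape as the __argv parameter of do_execve: both pointer
   levels are tagged as user-provided.  The tags are intended for debug
   info consumers; this function behaves the same with or without them. */
size_t count_args(const char example_user *const example_user *argv)
{
    size_t n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Callers pass ordinary pointers; only the debug information is meant to differ
when the attribute is supported.
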
>>
>> 2)  Conveying the tags in the generated DWARF debug info.
>>
>>     The main motivation for emitting the tags in DWARF is that the Linux
>>     kernel generates its BTF information via pahole, using DWARF as a source:
>>
>>         +--------+  BTF                  BTF   +----------+
>>         | pahole |-------> vmlinux.btf ------->| verifier |
>>         +--------+                             +----------+
>>             ^                                        ^
>>             |                                        |
>>       DWARF |                                    BTF |
>>             |                                        |
>>          vmlinux                              +-------------+
>>          module1.ko                           | BPF program |
>>          module2.ko                           +-------------+
>>            ...
>>
>>     This is because:
>>
>>     a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>>     b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>         support for linking/deduplicating BTF in the linker.
>>
>>     c)  pahole injects additional BTF information based on specific knowledge
>>         of kernel objects which is not available to the compiler.
>>
>>     In the scenario above, the verifier needs access to the pointer tags of
>>     both the kernel types/declarations (conveyed in the DWARF and translated
>>     to BTF by pahole) and those of the BPF program (available directly in
>>     BTF).
>>
>>     Another motivation for having the tag information in DWARF, unrelated to
>>     BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>     to benefit from these tags in order to differentiate between different
>>     kinds of pointers in the kernel.
>>
>> 3)  Conveying the tags in the generated BTF debug info.
>>
>>     This is easy: the main purpose of having this info in BTF is for the
>>     compiled BPF programs. The kernel verifier can then access the tags
>>     of pointers used by the BPF programs.
>>
>> For more information about these tags and the motivation behind them, please
>> refer to the following Linux kernel discussions: [1], [2], [3].
>>
>> DWARF Representation
>> ====================
>>
>> Compared to prior iterations of this work, this patch series introduces a new
>> DWARF representation meant to address issues in the previously proposed
>> format.  The format is detailed below.
>>
>> Note that the obvious solution of introducing a new DIE to be chained in type
>> chains similar to type modifiers like const and volatile is not feasible
>> because it would break DWARF readers.
>>
>> New DWARF extension: DW_TAG_GNU_annotation.  These DIEs encode the annotation
>> information.  They exist near the top level of the DIE tree as children of
>> the compilation unit DIE.  The user-supplied annotations ("tags") are encoded
>> via DW_AT_name and DW_AT_const_value.  DW_AT_name holds the name of the
>> attribute which is the source of the annotation (currently only
>> "btf_type_tag" or "btf_decl_tag").  DW_AT_const_value holds the arbitrary
>> user string from the attribute argument.
>>
>>   DW_TAG_GNU_annotation
>>     DW_AT_name: "btf_decl_tag" or "btf_type_tag"
>>     DW_AT_const_value: <arbitrary user-provided string from attribute arg>
>>     DW_AT_GNU_annotation: see below.
>>
>> New DWARF extension: DW_AT_GNU_annotation.  If present, the
>> DW_AT_GNU_annotation attribute is a reference to a DW_TAG_GNU_annotation DIE
>> holding annotations for the object.
>>
>> If a single declaration or type at the language level has multiple
>> occurrences of the btf_decl_tag or btf_type_tag attribute, then the
>> DW_TAG_GNU_annotation DIE referenced by that object will itself have
>> DW_AT_GNU_annotation referring to another annotation DIE.  In this way the
>> annotation DIEs are chained together.
>>
>> Multiple distinct declarations or types may refer via DW_AT_GNU_annotation to
>> the same DW_TAG_GNU_annotation DIE, if they share the same tags.
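
To make the chaining concrete, here is a minimal sketch of how a DWARF
consumer might model and walk such a chain. The struct and function names are
hypothetical and do not correspond to any existing reader API; the model
assumes the reader has already resolved each DW_AT_GNU_annotation reference
to a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of one DW_TAG_GNU_annotation DIE. */
struct annotation {
    const char *name;               /* DW_AT_name, e.g. "btf_decl_tag" */
    const char *value;              /* DW_AT_const_value: the user string */
    const struct annotation *next;  /* DW_AT_GNU_annotation, NULL at the end */
};

/* Count the tags reachable from the annotation DIE an object refers to. */
size_t annotation_count(const struct annotation *a)
{
    size_t n = 0;

    for (; a != NULL; a = a->next)
        n++;
    return n;
}

/* Return 1 if the chain starting at A holds a NAME tag with string VALUE. */
int annotation_has(const struct annotation *a, const char *name,
                   const char *value)
{
    assert(name != NULL && value != NULL);
    for (; a != NULL; a = a->next)
        if (strcmp(a->name, name) == 0 && strcmp(a->value, value) == 0)
            return 1;
    return 0;
}
```

Because several objects may point at the same annotation DIE, a consumer
following this model would treat the nodes as shared and read-only.
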
>>
>> For more information on this format, please refer to recent talks at GNU
>> Tools Cauldron [4] and Linux Plumbers Conference [5]. Older iterations of this
>> work and related discussions may be found in [6,7,8].
>>
>> BTF Representation
>> ==================
>>
>> In BTF, BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records convey the
>> annotations.  These records hold the annotation value in their name field,
>> and refer to the annotated object by BTF ID.
>>
>> BTF_KIND_DECL_TAG records are followed by an additional 32-bit
>> 'component_idx', which indicates to which component of an object the tag
>> applies.  This index is -1 if the tag applies to a variable or function
>> declaration itself; otherwise it is a 0-based index indicating to which
>> function argument or struct or union member the tag applies.
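
A rough sketch of the record layout described above. The struct and helper
names are hypothetical; the field order, the kind numbers (17 and 18) and the
position of the kind in bits 24-28 of 'info' are taken from the Linux UAPI
header linux/btf.h rather than from this patch series, so treat them as an
assumption to be checked against that header.

```c
#include <assert.h>
#include <stdint.h>

/* Kind numbers as in linux/btf.h (assumed, not defined by this series). */
enum { KIND_DECL_TAG = 17, KIND_TYPE_TAG = 18 };

/* Simplified mirror of the common BTF record header. */
struct btf_record {
    uint32_t name_off;  /* string table offset; holds the tag string here */
    uint32_t info;      /* bits 24-28 hold the kind */
    uint32_t type;      /* BTF ID of the annotated object */
};

/* Extra data that follows a DECL_TAG record. */
struct decl_tag_extra {
    int32_t component_idx;  /* -1: declaration itself; else 0-based index */
};

/* Build an 'info' word carrying only the kind. */
uint32_t make_info(unsigned kind)
{
    assert(kind <= 0x1f);
    return (uint32_t)kind << 24;
}

/* Extract the kind from a record header. */
unsigned record_kind(const struct btf_record *r)
{
    return (r->info >> 24) & 0x1f;
}

/* Return 1 if the tag applies to the variable or function as a whole. */
int tag_on_whole_decl(const struct decl_tag_extra *e)
{
    return e->component_idx == -1;
}
```
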
>>
>> Example: btf_decl_tag
>> =====================
>>
>> Consider the following declarations:
>>
>>   int  *x __attribute__((btf_decl_tag ("rw"), btf_decl_tag ("devicemem")));
>>   struct {
>>     int size;
>>     char *ptr __attribute__((btf_decl_tag("rw")));
>>   } y;
>>
>> These declarations produce the following DWARF information:
>>
>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : x
>>     <24>   DW_AT_type        : <0x36>
>>     <28>   DW_AT_GNU_annotation: <0x4a>
>>     ...
>>  <1><36>: Abbrev Number: 1 (DW_TAG_pointer_type)
>>     <37>   DW_AT_byte_size   : 8
>>     <37>   DW_AT_type        : <0x3b>
>>  <1><3b>: Abbrev Number: 4 (DW_TAG_base_type)
>>     <3e>   DW_AT_name        : int
>>     ...
>>  <1><42>: Abbrev Number: 5 (DW_TAG_GNU_annotation)
>>     <43>   DW_AT_name        : (indirect string, offset: 0): btf_decl_tag
>>     <47>   DW_AT_const_value : rw
>>  <1><4a>: Abbrev Number: 6 (DW_TAG_GNU_annotation)
>>     <4b>   DW_AT_name        : (indirect string, offset: 0): btf_decl_tag
>>     <4f>   DW_AT_const_value : (indirect string, offset: 0x1f): devicemem
>>     <53>   DW_AT_GNU_annotation: <0x42>
>>  <1><57>: Abbrev Number: 7 (DW_TAG_structure_type)
>>     ...
>>  <2><60>: Abbrev Number: 8 (DW_TAG_member)
>>     <61>   DW_AT_name        : (indirect string, offset: 0x1a): size
>>     <68>   DW_AT_type        : <0x3b>
>>     ...
>>  <2><6d>: Abbrev Number: 9 (DW_TAG_member)
>>     <6e>   DW_AT_name        : ptr
>>     <75>   DW_AT_type        : <0x7f>
>>     <7a>   DW_AT_GNU_annotation: <0x42>
>>     ...
>>  <2><7e>: Abbrev Number: 0
>>  <1><7f>: Abbrev Number: 1 (DW_TAG_pointer_type)
>>     <80>   DW_AT_byte_size   : 8
>>     <80>   DW_AT_type        : <0x84>
>>  <1><84>: Abbrev Number: 10 (DW_TAG_base_type)
>>     <85>   DW_AT_byte_size   : 1
>>     <86>   DW_AT_encoding    : 6     (signed char)
>>     <87>   DW_AT_name        : (indirect string, offset: 0x5e): char
>>  <1><8b>: Abbrev Number: 11 (DW_TAG_variable)
>>     <8c>   DW_AT_name        : y
>>     <91>   DW_AT_type        : <0x57>
>>     ...
>>
>> The variable DIE for 'x' refers by DW_AT_GNU_annotation to the DIE holding
>> the annotation for the "devicemem" tag, which in turn refers to the DIE
>> holding the annotation for "rw".  The DW_TAG_member DIE for the member 'ptr'
>> of the struct refers to the annotation DIE for "rw" directly, which is
>> thereby shared between the two declarations.
>>
>> And BTF information:
>>
>>   [1] STRUCT '(anon)' size=16 vlen=2
>>       'size' type_id=2 bits_offset=0
>>       'ptr' type_id=3 bits_offset=64
>>   [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>   [3] PTR '(anon)' type_id=4
>>   [4] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
>>   [5] PTR '(anon)' type_id=2
>>   [6] DECL_TAG 'devicemem' type_id=10 component_idx=-1
>>   [7] DECL_TAG 'rw' type_id=10 component_idx=-1
>>   [8] DECL_TAG 'rw' type_id=1 component_idx=1
>>   [9] VAR 'y' type_id=1, linkage=global
>>   [10] VAR 'x' type_id=5, linkage=global
>>
>> Note how the component_idx identifies to which member of the struct type the
>> decl tag is applied.
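
As an illustration of that lookup, the sketch below resolves a decl tag's
component_idx against a struct's member list, as a BTF consumer might do for
record [8] above (component_idx=1 selects 'ptr'). All type and function names
are hypothetical simplifications, not the real BTF data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a BTF STRUCT record. */
struct struct_desc {
    const char *const *members;  /* member names in declaration order */
    size_t vlen;                 /* number of members */
};

/* Hypothetical, simplified view of a DECL_TAG record. */
struct decl_tag {
    const char *value;   /* tag string, e.g. "rw" */
    int component_idx;   /* -1 means the declaration itself */
};

/* Return the name of the tagged member, or NULL when the tag applies to
   the declaration as a whole or the index is out of range. */
const char *tagged_member(const struct struct_desc *s,
                          const struct decl_tag *t)
{
    assert(s != NULL && t != NULL);
    if (t->component_idx < 0 || (size_t)t->component_idx >= s->vlen)
        return NULL;
    return s->members[t->component_idx];
}
```
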
>>
>>
>> Example: btf_type_tag
>> =====================
>>
>> Consider the following code snippet:
>>
>>   int __attribute__((btf_type_tag("rcu"), btf_type_tag ("foo"))) x;
>>
>>   void
>>   do_thing (struct S * __attribute__((btf_type_tag ("rcu"))) rcu_s,
>>             void * __attribute__((btf_type_tag("foo"))) ptr)
>>   { ... }
>>
>> The relevant DWARF information produced is as follows:
>>
>>  <1><2e>: Abbrev Number: 3 (DW_TAG_structure_type)
>>     <2f>   DW_AT_name        : S
>>     ...
>>  <1><46>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <47>   DW_AT_byte_size   : 4
>>     <48>   DW_AT_encoding    : 5     (signed)
>>     <49>   DW_AT_name        : int
>>  <1><4d>: Abbrev Number: 6 (DW_TAG_variable)
>>     <4e>   DW_AT_name        : x
>>     <53>   DW_AT_type        : <0x61>
>>     ...
>>  <1><61>: Abbrev Number: 7 (DW_TAG_base_type)
>>     <62>   DW_AT_byte_size   : 4
>>     <63>   DW_AT_encoding    : 5     (signed)
>>     <64>   DW_AT_name        : int
>>     <68>   DW_AT_GNU_annotation: <0x75>
>>  <1><6c>: Abbrev Number: 1 (DW_TAG_GNU_annotation)
>>     <6d>   DW_AT_name        : (indirect string, offset: 0x13): btf_type_tag
>>     <71>   DW_AT_const_value : rcu
>>  <1><75>: Abbrev Number: 8 (DW_TAG_GNU_annotation)
>>     <76>   DW_AT_name        : (indirect string, offset: 0x13): btf_type_tag
>>     <7a>   DW_AT_const_value : foo
>>     <7e>   DW_AT_GNU_annotation: <0x6c>
>>  <1><82>: Abbrev Number: 9 (DW_TAG_subprogram)
>>     <83>   DW_AT_name        : (indirect string, offset: 0x20): do_thing
>>     ...
>>  <2><a1>: Abbrev Number: 10 (DW_TAG_formal_parameter)
>>     <a2>   DW_AT_name        : (indirect string, offset: 0x5): rcu_s
>>     <a9>   DW_AT_type        : <0xc0>
>>     ...
>>  <2><b0>: Abbrev Number: 11 (DW_TAG_formal_parameter)
>>     <b1>   DW_AT_name        : ptr
>>     <b8>   DW_AT_type        : <0xca>
>>     ...
>>  <2><bf>: Abbrev Number: 0
>>  <1><c0>: Abbrev Number: 12 (DW_TAG_pointer_type)
>>     <c1>   DW_AT_byte_size   : 8
>>     <c2>   DW_AT_type        : <0x2e>
>>     <c6>   DW_AT_GNU_annotation: <0x6c>
>>  <1><ca>: Abbrev Number: 13 (DW_TAG_pointer_type)
>>     <cb>   DW_AT_byte_size   : 8
>>     <cc>   DW_AT_GNU_annotation: <0xd0>
>>  <1><d0>: Abbrev Number: 1 (DW_TAG_GNU_annotation)
>>     <d1>   DW_AT_name        : (indirect string, offset: 0x13): btf_type_tag
>>     <d5>   DW_AT_const_value : foo
>>
>> Note how in this case, two annotation DIEs for "foo" are produced, because
>> it is used in two distinct sets of type tags which do not allow it to be
>> shared. The DIE for "rcu", however, is shared between uses.
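
One way to read this sharing rule is as interning of (value, next) pairs: a
DIE can be reused only when both its string and the rest of its chain match.
The sketch below models that reading with a small fixed pool. It is not
GCC's implementation; the names node, intern and MAX_NODES are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 16

/* Hypothetical model of an annotation DIE: a string plus the next link. */
struct node {
    const char *value;
    const struct node *next;
};

static struct node pool[MAX_NODES];
static size_t pool_used;

/* Return an existing node with the same value and the same tail, or create
   one.  Nodes whose tails differ are distinct even if the values match. */
const struct node *intern(const char *value, const struct node *next)
{
    assert(value != NULL);
    for (size_t i = 0; i < pool_used; i++)
        if (pool[i].next == next && strcmp(pool[i].value, value) == 0)
            return &pool[i];

    assert(pool_used < MAX_NODES);
    pool[pool_used].value = value;
    pool[pool_used].next = next;
    return &pool[pool_used++];
}
```

Interning "rcu", then "foo" linked to it, then "rcu" again, then a standalone
"foo" leaves three nodes in the pool, matching the three annotation DIEs in
the dump above.
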
>>
>> And BTF information:
>>
>>   [1] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
>>       'rcu_s' type_id=2
>>       'ptr' type_id=6
>>   [2] TYPE_TAG 'rcu' type_id=3
>>   [3] PTR '(anon)' type_id=4
>>   [4] STRUCT 'S' size=4 vlen=1
>>       ...
>>   [6] TYPE_TAG 'foo' type_id=7
>>   [7] PTR '(anon)' type_id=0
>>   [8] TYPE_TAG 'foo' type_id=9
>>   [9] TYPE_TAG 'rcu' type_id=10
>>   [10] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>   [11] VAR 'x' type_id=8, linkage=global
>>   [12] FUNC 'do_thing' type_id=1 linkage=global
>>
>> References
>> ==========
>>
>> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
>> [2] https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
>> [3] https://lore.kernel.org/bpf/20211112012604.1504583-1-...@fb.com/
>> [4] https://gcc.gnu.org/wiki/cauldron2024#cauldron2024talks.what_is_new_in_the_bpf_support_in_the_gnu_toolchain
>> [5] https://lpc.events/event/18/contributions/1924/
>> [6] https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592685.html
>> [7] https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596355.html
>> [8] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624156.html
>>
>>
>> David Faust (5):
>>   c-family: add btf_type_tag and btf_decl_tag attributes
>>   dwarf: create annotation DIEs for btf tags
>>   ctf: translate annotation DIEs to internal ctf
>>   btf: generate and output DECL_TAG and TYPE_TAG records
>>   doc: document btf_type_tag and btf_decl_tag attributes
>>
>>  gcc/btfout.cc                                 | 171 +++++++++--
>>  gcc/c-family/c-attribs.cc                     |  25 +-
>>  gcc/ctfc.cc                                   |  70 ++++-
>>  gcc/ctfc.h                                    |  41 ++-
>>  gcc/doc/extend.texi                           |  68 +++++
>>  gcc/dwarf2ctf.cc                              | 152 +++++++++-
>>  gcc/dwarf2out.cc                              | 275 +++++++++++++++++-
>>  .../gcc.dg/debug/btf/btf-decl-tag-1.c         |  14 +
>>  .../gcc.dg/debug/btf/btf-decl-tag-2.c         |  22 ++
>>  .../gcc.dg/debug/btf/btf-decl-tag-3.c         |  22 ++
>>  .../gcc.dg/debug/btf/btf-decl-tag-4.c         |  34 +++
>>  .../gcc.dg/debug/btf/btf-type-tag-1.c         |  27 ++
>>  .../gcc.dg/debug/btf/btf-type-tag-2.c         |  15 +
>>  .../gcc.dg/debug/btf/btf-type-tag-3.c         |  21 ++
>>  .../gcc.dg/debug/btf/btf-type-tag-4.c         |  25 ++
>>  .../gcc.dg/debug/btf/btf-type-tag-5.c         |  35 +++
>>  .../gcc.dg/debug/btf/btf-type-tag-6.c         |  15 +
>>  .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c     |  23 ++
>>  .../gcc.dg/debug/ctf/ctf-decl-tag-1.c         |  31 ++
>>  .../gcc.dg/debug/ctf/ctf-type-tag-1.c         |  19 ++
>>  .../debug/dwarf2/dwarf-btf-decl-tag-1.c       |  11 +
>>  .../debug/dwarf2/dwarf-btf-decl-tag-2.c       |  25 ++
>>  .../debug/dwarf2/dwarf-btf-decl-tag-3.c       |  21 ++
>>  .../debug/dwarf2/dwarf-btf-type-tag-1.c       |  10 +
>>  .../debug/dwarf2/dwarf-btf-type-tag-2.c       |  31 ++
>>  .../debug/dwarf2/dwarf-btf-type-tag-3.c       |  15 +
>>  include/btf.h                                 |  14 +
>>  include/ctf.h                                 |   4 +
>>  include/dwarf2.def                            |   4 +
>>  29 files changed, 1194 insertions(+), 46 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-decl-tag-3.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-btf-type-tag-3.c
>>
> 
