On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hello,
>
> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> The same attribute is already supported in clang, and is used by various
> components of the BPF ecosystem.
>
> The purpose of the attribute is to associate ("tag") declarations
> with arbitrary string annotations, which are emitted into
> debugging information (DWARF and/or BTF) to facilitate post-compilation
> analysis (the motivating use case being the Linux kernel BPF verifier).
> Multiple tags are allowed on the same declaration.
>
> These strings are not interpreted by the compiler, and the attribute
> itself has no effect on generated code, other than to produce additional
> DWARF DIEs and/or BTF records conveying the annotations.
>
> This entails:
>
> - A new C-language-level attribute which associates ("tags")
>   particular declarations with arbitrary strings.
>
> - The conveyance of that information in DWARF in the form of a new DIE,
>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>   the same purpose.  These DIEs are already supported by BPF tooling,
>   such as pahole.
>
> - The conveyance of that information in BTF debug info in the form of
>   BTF_KIND_DECL_TAG records.  These records are already supported by
>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>   eBPF verifier.
>
>
> Background
> ==========
>
> The purpose of these tags is to convey additional semantic information
> to post-compilation consumers, in particular the Linux kernel eBPF
> verifier.  The verifier can make use of that information while analyzing
> a BPF program to aid in determining whether to allow the program to run
> or to reject it.  More background on these tags can be found in the
> early support for them in the kernel here [1] and [2].
>
> The "btf_decl_tag" attribute is half the story; the other half is a
> sibling attribute "btf_type_tag" which serves the same purpose but
> applies to types.  Support for btf_type_tag will come in a separate
> patch series, since it is impacted by GCC bug 110439 which needs to be
> addressed first.
>
> I submitted an initial version of this work (including btf_type_tag)
> last spring [3], however at the time there were some open questions
> about the behavior of the btf_type_tag attribute and issues with its
> implementation.  Since then we have clarified these details and agreed
> on solutions with the BPF community and LLVM BPF folks.
>
> The main motivation for emitting the tags in DWARF is that the Linux
> kernel generates its BTF information via pahole, using DWARF as a source:
>
>     +--------+  BTF                   BTF   +----------+
>     | pahole |-------> vmlinux.btf ------->| verifier |
>     +--------+                              +----------+
>         ^                                        ^
>         |                                        |
>   DWARF |                                    BTF |
>         |                                        |
>      vmlinux                              +-------------+
>      module1.ko                           | BPF program |
>      module2.ko                           +-------------+
>        ...
>
> This is because:
>
>   a)  pahole adds additional kernel-specific information into the
>       produced BTF based on additional analysis of kernel objects.
>
>   b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>
>   c)  GCC can generate BTF for whatever target with -gbtf, but there is no
>       support for linking/deduplicating BTF in the linker.
>
> In the scenario above, the verifier needs access to the pointer tags of
> both the kernel types/declarations (conveyed in the DWARF and translated
> to BTF by pahole) and those of the BPF program (available directly in BTF).
>
>
> DWARF Representation
> ====================
>
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose.
> The DIE has the following format:
>
>   DW_TAG_GNU_annotation (0x6000)
>     DW_AT_name: "btf_decl_tag"
>     DW_AT_const_value: <string argument>
>
> These DIEs are placed in the DWARF tree as children of the DIE for the
> appropriate declaration, and one such DIE is created for each occurrence
> of the btf_decl_tag attribute on a declaration.
>
> For example:
>
>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
>
> This declaration produces the following DWARF:
>
>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>     <1f>   DW_AT_name        : c
>     <24>   DW_AT_type        : <0x49>
>     ...
>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <44>   DW_AT_const_value : __c
>  <2><48>: Abbrev Number: 0
>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
>     ...
>
> The DIEs for btf_decl_tag are placed as children of the DIE for
> variable "c".

It looks like a bit of overkill, and inefficient as well.  Why aren't
the tags referenced via the existing DW_AT_description?  If you want
new TAGs, why require them as children of each DIE rather than
referencing (and sharing!) them via a DIE reference from a new attribute?

That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.

But well ...

Richard.

>
> BTF Representation
> ==================
>
> In BTF, BTF_KIND_DECL_TAG records convey the annotations.  These records refer
> to the annotated object by BTF type ID, as well as a component index which is
> used for btf_decl_tags placed on struct/union members or function arguments.
>
> For example, the BTF for the above declaration is:
>
>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>   [2] CONST '(anon)' type_id=1
>   [3] PTR '(anon)' type_id=2
>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
>   [6] VAR 'c' type_id=3, linkage=global
>   ...
>
> The BTF format is documented here [4].
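
To make the component index in the quoted BTF description concrete, here is a minimal sketch in C.  It is an illustration only, not code from the patch series: DECL_TAG, the sketch_* names, struct packet and consume are all hypothetical, and the layout and constants are local mirrors of what the kernel BTF documentation [4] describes for BTF_KIND_DECL_TAG (kind value 17, kind stored in bits 24-28 of the info word, component_idx of -1 when the tag applies to the declaration itself).

```c
#include <stdint.h>

/* DECL_TAG is a hypothetical helper: it expands to the attribute only
   where the compiler supports it, so the file still builds elsewhere.  */
#if defined(__has_attribute)
# if __has_attribute(btf_decl_tag)
#  define DECL_TAG(s) __attribute__((btf_decl_tag (s)))
# endif
#endif
#ifndef DECL_TAG
# define DECL_TAG(s)
#endif

/* Tag on a variable: the DECL_TAG record would carry component_idx = -1.  */
int counter DECL_TAG ("__c");

struct packet
{
  int len;
  /* Tag on a member: the record would refer to the struct, with the
     member's index (here 1) as component_idx.  */
  char *data DECL_TAG ("devicemem");
};

/* Tag on an argument: the record would refer to the function, with the
   argument's index (here 1) as component_idx.  */
int
consume (struct packet *p, int flags DECL_TAG ("user"))
{
  return p->len + flags;
}

/* Local mirror of the kind value given in the BTF documentation.  */
#define SKETCH_BTF_KIND_DECL_TAG 17

/* Info word for a decl tag record: kind in bits 24-28, vlen and
   kind_flag both zero.  */
static inline uint32_t
sketch_decl_tag_info (void)
{
  return (uint32_t) SKETCH_BTF_KIND_DECL_TAG << 24;
}

/* Extract the kind from an info word.  */
static inline uint32_t
sketch_info_kind (uint32_t info)
{
  return (info >> 24) & 0x1f;
}

/* Nonzero if a record with this component_idx tags the declaration
   itself rather than one of its members or arguments.  */
static inline int
sketch_tags_whole_decl (int32_t component_idx)
{
  return component_idx == -1;
}
```

The attribute has no effect on generated code, so consume behaves the same whether or not the compiler recognizes btf_decl_tag; only the emitted debug info differs.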
>
>
> References
> ==========
>
> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-...@fb.com/
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
>
>
> David Faust (9):
>   c-family: add btf_decl_tag attribute
>   include: add BTF decl tag defines
>   dwarf: create annotation DIEs for decl tags
>   dwarf: expose get_die_parent
>   ctf: add support to pass through BTF tags
>   dwarf2ctf: convert annotation DIEs to CTF types
>   btf: create and output BTF_KIND_DECL_TAG types
>   testsuite: add tests for BTF decl tags
>   doc: document btf_decl_tag attribute
>
>  gcc/btfout.cc                                 | 81 ++++++++++++++++++-
>  gcc/c-family/c-attribs.cc                     | 23 ++++++
>  gcc/ctf-int.h                                 | 28 +++++++
>  gcc/ctfc.cc                                   | 10 ++-
>  gcc/ctfc.h                                    | 17 +++-
>  gcc/doc/extend.texi                           | 47 +++++++++++
>  gcc/dwarf2ctf.cc                              | 73 ++++++++++++++++-
>  gcc/dwarf2out.cc                              | 37 ++++++++-
>  gcc/dwarf2out.h                               |  1 +
>  .../gcc.dg/debug/btf/btf-decltag-func.c       | 21 +++++
>  .../gcc.dg/debug/btf/btf-decltag-sou.c        | 33 ++++++++
>  .../gcc.dg/debug/btf/btf-decltag-var.c        | 19 +++++
>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c   |  9 +++
>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c   | 18 +++++
>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c   | 17 ++++
>  include/btf.h                                 | 14 +++-
>  include/dwarf2.def                            |  4 +
>  17 files changed, 437 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/ctf-int.h
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
>
> --
> 2.40.1
>