Hi people! Last year we submitted a first patch series introducing support for the CTF debugging format in GCC [1]. We got a lot of feedback that prompted us to change the approach used to generate the debug info, and this patch series is the result of that.
This implementation works, but there are several points that need
discussion and agreement with the upstream community, as they impact
the way debugging options work.  We are also proposing a way to add
additional debugging formats (such as BTF) in the future.  See below
for more details.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html

About CTF
=========

CTF is a debugging format designed to express C types in a very
compact way.  The key is compactness and simplicity.  For more
information see:

- CTF specification
  http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf
- Compact C-Type support in the GNU toolchain (talk + slides)
  https://linuxplumbersconf.org/event/4/contributions/396/
- On type de-duplication in CTF (talk + slides)
  https://linuxplumbersconf.org/event/7/contributions/725/

CTF in the GNU Toolchain
========================

During the last year we have been working on adding support for CTF to
several components of the GNU toolchain:

- binutils support is already upstream.  It supports linking objects
  with CTF information with full type de-duplication.
- GDB support is to be sent upstream very shortly.  It makes the
  debugger capable of using the CTF information whenever available.
  This is useful in cases where DWARF has been stripped out but CTF is
  kept.
- GCC support is being discussed and submitted in this series.

From debug hooks to debug formats
=================================

Our first attempt at adding CTF to GCC used the obvious approach of
adding a new set of debug hooks as defined in gcc/debug.h.

During our first interaction with the upstream community we were told
to _not_ use debug hooks, because these are to be obsoleted at some
point.  It was suggested that we instead hook our handlers (which
processed type TREE nodes, producing CTF types from them) somewhere
else.  So we did.

However, at the time we were also facing the need to support BTF, which
is another type-related debug format needed by the BPF GCC backend.
Hooking here and there doesn't sound like such a good idea when it
comes to supporting several debug formats.  Therefore we thought about
how to make GCC support diverse debugging formats in a better way.
This led to a proposal we tried to discuss at the GNU Tools Track in
LPC2020:

- Update of the BPF support in the GNU Toolchain
  https://linuxplumbersconf.org/event/7/contributions/724/

Basically, the current situation in terms of diversity of debugging
formats in GCC can be summarized as follows:

    tree     --+                  +--> dwarf2out
    rtl      --+                  +--> dbxout
               +--> debug_hooks --+--> vmsdbgout
    backends --+                  +--> xcoffout
    lto      --+                  +--> godump

i.e. each debug format materializes in a set of debug hooks, as in
gcc/debug.h.  The installed hooks are then invoked from many different
areas of the compiler, including the front-end, middle-end, back-end
and also lto.  Most of the hooks get TREE objects, from which they are
supposed to extract/infer whatever information they need to express.

This approach has several problems, some of which were raised by you
people when we initially submitted the CTF support:

- The handlers depend on the TREE nodes, so if new TREE nodes are
  added to cover new languages, or functionality in existing
  languages, all the debug hooks may need to be updated to reflect it.
- This also happens when the contents of existing TREE node types
  change or get expanded.
- The semantics encoded in TREE nodes usually are not in the best form
  to be used by debug formats.  This implies that the several sets of
  debug hooks need to do very similar transformations, which again
  will have to be adjusted/corrected if the TREE nodes change.
- And more...

In contrast, this is how LLVM supports several debug formats:

                                    +--> DWARF
    IR --> class DebugHandlerBase --+--> CodeView
                                    +--> BTF

i.e. LLVM gets debugging information as part of the IR, and then has
debug info backends in the form of instances of DebugHandlerBase,
which process that subset of the IR to produce whatever debug output
is needed.

To overcome the problems above, we thought about introducing a new set
of debug hooks, resulting in something like this:

                 +--> godump
                 +--> xcoffout
    debug_hooks -+--> vmsdbgout
                 +--> dbxout                        +--> DWARF
                 +--> dwarf2out --> n_debug_hooks --+--> BTF
                                       (walk)       +--> CTF
                                                    ... more ...

See how these "new debug hooks" are intended to be called by the old
DWARF debug hooks.  In this way:

- The internal DWARF representation becomes the canonical (and only)
  IR for debugging information in the compiler.  This is similar to
  what LLVM uses to implement support for DWARF, BTF and the Microsoft
  debug format.
- Debug formats (like CTF, BTF, stabs, etc) are implemented by
  providing a very simple API, which is invoked while traversing the
  DWARF DIE trees available in dwarf2out.
- The semantics expressed in the DWARF DIEs, which have been already
  extracted from the TREE nodes, are free of many internal details and
  more suitable to be easily translated into whatever abstractions the
  debug formats require.

To avoid misunderstandings, we came to refer to these "new debug
hooks" simply as "debug formats".

In this patch series we are using this latter approach in order to
support CTF, and we can say we are happy about using the internal
DWARF DIEs as a source instead of TREE nodes: it led to a more natural
implementation, much easier to understand.  This sort of confirms in
practice that the approach is sound.

The debug format API
====================

As you can see in the patch series, we hooked CTF in
dwarf2out_early_finish like this:

      /* Emit CTF debug info.  */
      if (ctf_debug_info_level > CTFINFO_LEVEL_NONE && lang_GNU_C ())
        {
          ctf_debug_init ();
          debug_format_do_cu (comp_unit_die ());
          for (limbo_die_node *node = limbo_die_list; node; node = node->next)
            debug_format_do_cu (node->die);
          ctf_debug_finalize (filename);
        }

In turn, debug_format_do_cu traverses the tree of DIEs passed to it,
calling ctf_do_die on them.

This constitutes the debug format API:

  FOO_debug_init ()
    Initialize the debug format FOO.
  FOO_debug_finalize (FILENAME)
    Possibly write out, clean up and finalize debug format FOO.
  FOO_do_die (DIE)
    Process the given DIE.

Note how the emission of DWARF is interrupted after that point, if no
DWARF was requested by the user.

dwarf2out - dwarf2ctf
=====================

The functions ctf_debug_init, ctf_do_die and ctf_debug_finalize, which
implement the API described above, are all in gcc/dwarf2ctf.c.

Obviously, these routines need access to the DWARF DIE data
structures, and to several functions which are defined in
dwarf2out.[ch], many (most?) of which are private to that file:
dw_die_ref, get_AT, etc.

Therefore, in this implementation we opted to write dwarf2ctf.c in
such a way that it can just be #included in dwarf2out.c.

A question remains: would it be better to abstract these types and
functions in an API in dwarf2out.h?

Command line options for debug formats
======================================

This implementation adds the following command-line options to select
the emission of CTF:

     -gt[123]

These options mimic the -g[123...] options for DWARF.

This involved adding new entries for debug_info_type:

     CTF_DEBUG            - Write CTF debug info.
     CTF_AND_DWARF2_DEBUG - Write both CTF and DWARF info.

In doing this, we just followed the trend initiated by vmsdbgout.c,
which added VMS_DEBUG and VMS_AND_DWARF2_DEBUG.

This approach is not very good, because debug_info_type was designed
to cover different debug hook implementations; debug formats, in
contrast, are a different thing.

This translates to problems and fragile behavior:

- Everywhere write_symbols is checked we have to expand the logic to
  take the CTF values into account.  You can see that is the case in
  this patch series.  This is very fragile and doesn't scale well: we
  are most probably missing some checks.
- The CTF debug format needs a certain DWARF debug level (2) in order
  to work, since otherwise not enough type DIEs get generated.  This
  will probably happen with some other formats as well.
- Therefore, -gt implicitly sets the DWARF debug level to 2.  But if
  the user uses -gt -g1, the CTF information will be incomplete
  because -g1 resets the DWARF debug level to 1.  -gtoggle also
  presents difficulties.
- Backends select what debug hooks to use by defining constants like
  DWARF2_DEBUGGING_INFO.  Since the new debug formats are based on the
  DWARF debug hooks, that is the constant to be defined by backends
  wanting to support DWARF plus any of the debug formats.

  However, some backends may want to use one of the debug formats by
  default, i.e. for -g.  This is the case of the BPF backend, which
  needs to generate BTF instead of DWARF.  Currently, there is no way
  to specify this.

We could add a new optional backend hook/constant to select the
desired default debug format, like:

  #define DWARF2_DEBUGGING_INFO /* Selects the dwarf debug hooks */

  /* Selects the default debug format to emit with -g.  */
  #define CTF_DEBUGGING_FORMAT
  #define BTF_DEBUGGING_FORMAT
  #define DWARF_DEBUGGING_FORMAT /* The default */

Regardless of what debug format is defined as the default, the other
formats are also available with -gdwarf, -gctf, -gbtf, etc.

-gt or -gctf
============

This patch series uses -gt to trigger the generation of CTF debug
data, but if we agree on the approach outlined in the last section for
supporting debug formats in the backends, most likely we will want to
use -gctf instead of -gt.

Work in progress: BTF as a debug format
=======================================

We are already working on adding support for the BTF debug format to
GCC.  This is needed by the BPF backend, which should generate BTF
instead of DWARF.  This is absolutely needed in order to compile BPF
programs that work in the Linux kernel, as explained in the "Update of
the BPF support in the GNU Toolchain" talk mentioned above.

Since BTF is very similar to CTF, we are just adding support for BTF
to the CTF implementation.  In this way, ctfout.[ch] and dwarf2ctf.c
provide two debug formats.

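As a rough illustration only (this is not code from the patch series),
the following self-contained toy sketch shows how two debug formats
could share one DIE traversal through the three-entry API described in
"The debug format API".  Every name in it is hypothetical: toy_die
stands in for dw_die_ref, toy_do_cu for debug_format_do_cu, and the
rule for which DIEs each toy format keeps is invented just so that the
two formats behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DWARF DIE (dw_die_ref in dwarf2out.c).  */
enum toy_tag { TOY_COMPILE_UNIT, TOY_BASE_TYPE, TOY_STRUCT_TYPE, TOY_VARIABLE };

struct toy_die
{
  enum toy_tag tag;
  const struct toy_die *child;    /* First child, or NULL.  */
  const struct toy_die *sibling;  /* Next sibling, or NULL.  */
};

/* A debug format is three callbacks, mirroring FOO_debug_init,
   FOO_do_die and FOO_debug_finalize.  */
struct toy_debug_format
{
  void (*init) (void);
  void (*do_die) (const struct toy_die *die);
  void (*finalize) (const char *filename);
};

/* Pre-order walk over DIE and its children; plays the role of
   debug_format_do_cu.  */
void
toy_do_cu (const struct toy_debug_format *fmt, const struct toy_die *die)
{
  fmt->do_die (die);
  for (const struct toy_die *c = die->child; c != NULL; c = c->sibling)
    toy_do_cu (fmt, c);
}

/* Drive one format over one compilation unit.  */
void
toy_emit (const struct toy_debug_format *fmt, const struct toy_die *cu,
          const char *filename)
{
  assert (fmt != NULL && cu != NULL);
  fmt->init ();
  toy_do_cu (fmt, cu);
  fmt->finalize (filename);
}

static int
toy_is_type (const struct toy_die *die)
{
  return die->tag == TOY_BASE_TYPE || die->tag == TOY_STRUCT_TYPE;
}

/* Toy "CTF": keeps type DIEs only (an invented rule).  */
static unsigned ctf_pending;
unsigned toy_ctf_emitted;
const char *toy_ctf_output;

static void toy_ctf_init (void) { ctf_pending = 0; }
static void toy_ctf_do_die (const struct toy_die *die)
{
  if (toy_is_type (die))
    ctf_pending++;
}
static void toy_ctf_finalize (const char *filename)
{
  toy_ctf_output = filename;  /* A real format would write out here.  */
  toy_ctf_emitted = ctf_pending;
}

/* Toy "BTF": keeps types and variables (also an invented rule).  */
static unsigned btf_pending;
unsigned toy_btf_emitted;
const char *toy_btf_output;

static void toy_btf_init (void) { btf_pending = 0; }
static void toy_btf_do_die (const struct toy_die *die)
{
  if (toy_is_type (die) || die->tag == TOY_VARIABLE)
    btf_pending++;
}
static void toy_btf_finalize (const char *filename)
{
  toy_btf_output = filename;
  toy_btf_emitted = btf_pending;
}

const struct toy_debug_format toy_ctf_format
  = { toy_ctf_init, toy_ctf_do_die, toy_ctf_finalize };
const struct toy_debug_format toy_btf_format
  = { toy_btf_init, toy_btf_do_die, toy_btf_finalize };
```

Calling toy_emit (&toy_ctf_format, cu, filename) and then
toy_emit (&toy_btf_format, cu, filename) walks the same DIE tree twice
without either format knowing about the other, which is the property
the debug format API is after.
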
Indu Bhagat (4):
  Add new function lang_GNU_GIMPLE
  CTF debug format
  CTF testsuite
  CTF documentation

 gcc/Makefile.in                               |    3 +
 gcc/common.opt                                |    9 +
 gcc/ctfout.c                                  | 1579 +++++++++++++++++
 gcc/ctfout.h                                  |  322 ++++
 gcc/doc/invoke.texi                           |   16 +
 gcc/dwarf2cfi.c                               |    3 +-
 gcc/dwarf2ctf.c                               |  816 +++++++++
 gcc/dwarf2out.c                               |   32 +-
 gcc/final.c                                   |    5 +-
 gcc/flag-types.h                              |   19 +-
 gcc/gengtype.c                                |    2 +-
 gcc/langhooks.c                               |    9 +
 gcc/langhooks.h                               |    1 +
 gcc/opts.c                                    |   65 +-
 gcc/targhooks.c                               |    3 +-
 gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c        |    6 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c        |   10 +
 .../gcc.dg/debug/ctf/ctf-anonymous-struct-1.c |   23 +
 .../gcc.dg/debug/ctf/ctf-anonymous-union-1.c  |   26 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c  |   31 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-2.c  |   38 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-3.c  |   17 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-4.c  |   13 +
 .../gcc.dg/debug/ctf/ctf-attr-mode-1.c        |   22 +
 .../gcc.dg/debug/ctf/ctf-attr-used-1.c        |   22 +
 .../gcc.dg/debug/ctf/ctf-bitfields-1.c        |   30 +
 .../gcc.dg/debug/ctf/ctf-bitfields-2.c        |   39 +
 .../gcc.dg/debug/ctf/ctf-bitfields-3.c        |   16 +
 .../gcc.dg/debug/ctf/ctf-bitfields-4.c        |   19 +
 .../gcc.dg/debug/ctf/ctf-complex-1.c          |   22 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-1.c        |   65 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-2.c        |   30 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-3.c        |   25 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-4.c        |   23 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c   |   21 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-2.c   |   27 +
 .../gcc.dg/debug/ctf/ctf-file-scope-1.c       |   25 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c  |   16 +
 .../gcc.dg/debug/ctf/ctf-forward-1.c          |   40 +
 .../gcc.dg/debug/ctf/ctf-forward-2.c          |   16 +
 .../gcc.dg/debug/ctf/ctf-func-index-1.c       |   25 +
 .../debug/ctf/ctf-function-pointers-1.c       |   24 +
 .../debug/ctf/ctf-function-pointers-2.c       |   22 +
 .../debug/ctf/ctf-function-pointers-3.c       |   21 +
 .../gcc.dg/debug/ctf/ctf-functions-1.c        |   34 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c    |   17 +
 .../gcc.dg/debug/ctf/ctf-objt-index-1.c       |   30 +
 .../gcc.dg/debug/ctf/ctf-pointers-1.c         |   26 +
 .../gcc.dg/debug/ctf/ctf-pointers-2.c         |   25 +
 .../gcc.dg/debug/ctf/ctf-preamble-1.c         |   11 +
 .../gcc.dg/debug/ctf/ctf-skip-types-1.c       |   33 +
 .../gcc.dg/debug/ctf/ctf-skip-types-2.c       |   17 +
 .../gcc.dg/debug/ctf/ctf-skip-types-3.c       |   20 +
 .../gcc.dg/debug/ctf/ctf-skip-types-4.c       |   19 +
 .../gcc.dg/debug/ctf/ctf-skip-types-5.c       |   19 +
 .../gcc.dg/debug/ctf/ctf-skip-types-6.c       |   18 +
 .../gcc.dg/debug/ctf/ctf-str-table-1.c        |   26 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c |   25 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c |   32 +
 .../gcc.dg/debug/ctf/ctf-struct-array-1.c     |   65 +
 .../gcc.dg/debug/ctf/ctf-struct-pointer-1.c   |   21 +
 .../gcc.dg/debug/ctf/ctf-struct-pointer-2.c   |   22 +
 .../gcc.dg/debug/ctf/ctf-typedef-1.c          |   68 +
 .../gcc.dg/debug/ctf/ctf-typedef-2.c          |   20 +
 .../gcc.dg/debug/ctf/ctf-typedef-3.c          |   24 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-1.c   |   14 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-2.c   |   17 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-3.c   |   32 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c  |   14 +
 .../gcc.dg/debug/ctf/ctf-variables-1.c        |   25 +
 .../gcc.dg/debug/ctf/ctf-variables-2.c        |   16 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf.exp        |   41 +
 gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c     |    7 +
 gcc/toplev.c                                  |   21 +-
 include/ctf.h                                 |  513 ++++++
 libiberty/simple-object.c                     |    3 +
 76 files changed, 4862 insertions(+), 11 deletions(-)
 create mode 100644 gcc/ctfout.c
 create mode 100644 gcc/ctfout.h
 create mode 100644 gcc/dwarf2ctf.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-union-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-attr-mode-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-attr-used-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-complex-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-file-scope-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-func-index-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-functions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-objt-index-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-preamble-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-5.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-6.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-str-table-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-pointer-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-pointer-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf.exp
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c
 create mode 100644 include/ctf.h

--
2.25.0.2.g232378479e