On Thu, May 23, 2019 at 10:31 PM Indu Bhagat <indu.bha...@oracle.com> wrote:
>
> On 05/22/2019 02:04 AM, Richard Biener wrote:
> > > The CTF debug information is kept in a CTF container distinct from the
> > > frontend structures. HashMaps are used to avoid generation of duplicate
> > > CTF and to book-keep the generated CTF.
> >
> > OK. So I wonder how difficult it is to emit CTF by walking dwarf2out's own
> > data structures? That is, in my view CTF should be emitted by
> > dwarf2out_early_finish () (which is also the point LTO type/decl debug
> > is generated from). It would be nice to avoid extra bookkeeping data
> > structures for CTF since those of DWARF should include all necessary
> > information already.
>
> The CTF format has some characteristics which make it necessary to
> "pre-process" the generated CTF data before asm'ing it out into a section.
> A few cases where "pre-processing" CTF is required before asm'ing out:
> 1. CTF types need to be emitted in "some" order:
>    CTF types can have references to other CTF types. This consequently
>    implies that the referenced type must appear BEFORE the referring type.
> 2. The CTF preamble holds offsets to the various subsections - function
>    info, variables, data types and CTF string table. To calculate the
>    offsets, the compiler needs to know the size in bytes of these
>    sub-sections. The CTF representation for some types like structures,
>    functions and enums has a variable number of bytes trailing it
>    (depending on the definition of the type).
> 3. CTF variable entries need to be placed in the order of the names.
>
> Because of some of these "features" of the CTF format, the compiler does need
> to do a transition from runtime CTF generated data --> CTF binary data format
> for clean and readable code.
>
> So, I think the needs are different enough to vouch for an implementation
> segregated from the dwarf* codebase.
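[Editorial sketch, not part of the original message.] The three pre-processing steps listed above can be modelled roughly as follows. This is a minimal Python illustration, not GCC or libctf code: the function names `order_types` and `section_offsets`, the type names and the section sizes are all hypothetical, and the section order simply follows the order in which the subsections are named above. The sketch rejects reference cycles; how a real encoder handles self-referential types is outside what the thread describes.

```python
from itertools import accumulate


def order_types(refs):
    """Return type names so that every referenced type precedes its referrer.

    refs maps a type name to the names of the types it refers to
    (hypothetical input shape, chosen for this sketch).
    """
    ordered = []
    state = {}  # name -> "visiting" | "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            # This sketch simply refuses cycles.
            raise ValueError(f"reference cycle through {name!r}")
        state[name] = "visiting"
        for dep in refs.get(name, ()):
            visit(dep)
        state[name] = "done"
        ordered.append(name)

    for name in refs:
        visit(name)
    return ordered


def section_offsets(sizes):
    """Map each section name to its byte offset, given sizes in emission order."""
    names = list(sizes)
    starts = [0] + list(accumulate(sizes[n] for n in names))[:-1]
    return dict(zip(names, starts))


# Step 1: referenced types come before the types that refer to them.
refs = {"struct point": ["int"], "point_ptr": ["struct point"], "int": []}
type_order = order_types(refs)

# Step 3: variable entries are placed in name order.
var_order = sorted(["origin", "count"])

# Step 2: offsets can only be computed once every section size is known.
# The sizes below are made-up numbers for illustration.
offsets = section_offsets(
    {"funcinfo": 8, "variables": 16, "types": 40, "strings": 23}
)
```

The point of the sketch is that none of the three results can be produced incrementally while declarations are still arriving, which is the argument made above for an intermediate in-memory representation.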
>
> > Btw, do I read the CTF document posted to the binutils list (but not
> > cross-referenced here :/) correctly in that you only want CTF debug for
> > objects defined in the file and type information for the types referred
> > to from that? At
>
> Yes. CTF is emitted for types at file-scope and global-scope only. Types and
> vars at function-scope should be skipped.
>
> > dwarf2out_early_finish time it isn't fully known which symbols will end up
> > being emitted (and with LTO you only would know at link time).
>
> In a nutshell, I am processing all decls at early_global_decl () time except
> TYPE_DECL (similar to DWARF, based on the thinking that if they are required
> they will be reached via other DECLs).
> In addition, I process all decls at type_decl () time except function-scope,
> no-name decls and builtins.
>
> Currently, it does look like CTF for possibly to-be-omitted symbols will be
> generated... I assume even DWARF needs to handle this case. Can you point me
> to how DWARF does this?

It emits the debug information. DWARF outputs a representation of the
source, not only emitted objects. We prune some "unused" bits if the user
prefers us to do that, but we do not omit information on types or decls
that are used in the source but later optimized away.

> > It seems to me that linker support to garbage collect unused entries
> > would be the way to go forward (probably easy for the declarations
> > but not so for the types)?
>
> Hmm, garbage collecting unused types in the linker - let me get back to you
> on this. It does not look easy. Decls should be doable though.

For example, DWARF has something like type units that can be referred to via
hashes. GCC can output those into separate sections, and I can envision
outputting separate debug (CTF) sections for each declaration. The linker
could then merge sections for declarations that survived and pick up all
referenced type sections. Restrictions on ordering for CTF may make this a
bit difficult though, essentially forcing a separate intermediate "unlinked"
format and the linker regenerating the final one. OTOH CTF probably simply
concatenates data from different CUs?

Richard.
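
[Editorial sketch, not part of the original message.] The idea of keeping only the sections for surviving declarations and "picking up all referenced type sections" amounts to a reachability pass over type references. The Python below is a rough model under stated assumptions: `live_type_sections`, the per-declaration mapping `decl_type` and the reference map `type_refs` are hypothetical names invented for illustration, and no existing linker is claimed to work this way.

```python
def live_type_sections(surviving_decls, decl_type, type_refs):
    """Return the set of type sections reachable from surviving declarations.

    decl_type maps a declaration name to the type section it uses;
    type_refs maps a type section to the type sections it refers to.
    Both shapes are assumptions made for this sketch.
    """
    live = set()
    worklist = [decl_type[d] for d in surviving_decls]
    while worklist:
        t = worklist.pop()
        if t in live:
            continue
        live.add(t)
        worklist.extend(type_refs.get(t, ()))
    return live


decl_type = {
    "origin": "struct point",
    "count": "int",
    "unused_buf": "struct buf",
}
type_refs = {"struct point": ["int"], "struct buf": ["char"]}

# Only "origin" survived; "struct buf" and "char" are never reached.
kept = live_type_sections(["origin"], decl_type, type_refs)
```

Selecting the live set is the easy part. The surviving sections would still have to be re-ordered and re-indexed to satisfy the CTF ordering restrictions mentioned above, which is why an intermediate "unlinked" format comes up.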