Re: Type representation in CTF and DWARF

Richard Biener Fri, 25 Oct 2019 00:30:01 -0700

On Fri, Oct 25, 2019 at 1:52 AM Indu Bhagat <indu.bha...@oracle.com> wrote:
>
>
>
> On 10/11/2019 04:41 AM, Jakub Jelinek wrote:
> > On Fri, Oct 11, 2019 at 01:23:12PM +0200, Richard Biener wrote:
> >>> (coreutils-0.22)
> >>>        .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> >>> ls     30616           | 1136              | 21098          | 26240               | 0.62
> >>> pwd    10734           | 788               | 10433          | 13929               | 0.83
> >>> groups 10706           | 811               | 10249          | 13378               | 0.80
> >>>
> >>> (emacs-26.3)
> >>>              .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> >>> emacs-26.3.1 674657          | 6402              | 273963         | 273910              | 0.33
> >>>
> >>> I chose to account for 50% of .debug_str because at this point, it would be
> >>> unfair to not account for them. Actually, one could even argue that up to
> >>> 70% of the .debug_str are names of entities. CTF section sizes do include
> >>> the CTF string tables.
> >>>
> >>> Across coreutils, I see a geomean of 0.73 (ratio of
> >>> .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
> >>> "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
> >>> footprint than CTF (with 50% of .debug_str accounted for).
> >> I'm not convinced this "improvement" in size is worth maintaining another
> >> debug-info format, much less since it lacks desirable features right now
> >> and thus evaluation is tricky.
> >>
> >> At least you can improve dwarf size considerably with a low amount of work.
> >>
> >> I suspect another factor where dwarf is bigger compared to CTF is that
> >> dwarf is recording typedef names as well as qualified type variants.  But
> >> maybe CTF just has a more compact representation for the bits it actually
> >> implements.
> > Does CTF record automatic variables in functions, or just global variables?
> > If only the latter, it would be fair to also disable addition of local
> > variable DIEs, lexical blocks.  Does CTF record inline functions?  Again, if
> > not, it would be fair to not emit that either in .debug_info.
> > -gno-record-gcc-switches so that the compiler command line is not encoded in
> > the debug info (unless it is in CTF).
>
> CTF includes file-scope and global-scope entities. So, CTF for a function
> defined/declared at these scopes is available in .ctf section, even if it is
> inlined.
>
> To not generate DWARF for function-local entities, I made a tweak in the
> gen_decl_die API to have an early exit when TREE_CODE (DECL_CONTEXT (decl))
> is FUNCTION_DECL.
>
> @@ -26374,6 +26374,12 @@ gen_decl_die (tree decl, tree origin, struct vlr_context *ctx,
>     if (DECL_P (decl_or_origin) && DECL_IGNORED_P (decl_or_origin))
>       return NULL;
>
> +  /* Do not generate info for function local decl when -gdwarf-like-ctf is
> +     enabled.  */
> +  if (debug_dwarf_like_ctf && DECL_CONTEXT (decl)
> +      && (TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL))
> +    return NULL;
> +
>     switch (TREE_CODE (decl_or_origin))
>       {
>       case ERROR_MARK:


A better place is probably in gen_subprogram_die, returning early before

  /* Output Dwarf info for all of the stuff within the body of the function
     (if it has one - it may be just a declaration).

Note that we also emit DIEs for function declarations without actual
definitions (optionally even unused ones, if requested); I would guess CTF
doesn't, since there's no symbol table entry for those.  Plus, by default we
prune types that are not used.  So

struct S { int i; };
extern void foo (struct S *);
void bar()
{
  struct S s;
  foo (&s);
}

would have DIEs for S and foo in addition to that for bar.  To me it seems
those are not relevant for function entry point inspection (eventually both
S and foo have CTF info in the defining unit).  Correct?

Richard.

>
> For the numbers in the email today:
> 1. CFLAGS="-g -gdwarf-like-ctf -gno-record-gcc-switches -O2". dwz is used on
>     generated binaries.
> 2. At this time, I wanted to account for .debug_str entities appropriately
>     (not 50% as done previously). Using a small script to count chars for
>     accounting the "path-like" strings, specifically those strings that start
>     with a ".", I gathered the data in the column named D5.
>
> (coreutils-0.22)
>        .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
> ls     14100           | 994               | 16945          | 1328              | 26240               | 0.85
> pwd    6341            | 632               | 9311           | 596               | 13929               | 0.88
> groups 6410            | 714               | 9218           | 667               | 13378               | 0.85
> Average geomean across coreutils = 0.84
>
> (emacs-26.3)
>              .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
> emacs-26.3.1 373678          | 3794              | 219048         | 3842              | 273910              | 0.46
>
> > DWARF is a highly extensible format; what exactly is and is not emitted is
> > something that consumers can choose.
> > Yes, DWARF can be large, but mainly because it provides a lot of
> > information; the actual representation has been designed with size concerns
> > in mind, and newer versions of the standard keep improving that too.
> >
> >       Jakub
>
> Yes.
>
> I started out to provide some numbers around the size impact of CTF vs DWARF
> as it was a legitimate curiosity many of us have had. Comparing compactness or
> feature matrices is only one dimension of evaluating the utility of supporting
> CTF in the toolchain (including GCC; Binutils and GDB have already accepted
> initial CTF support). The other dimension is a user-friendly workflow which
> supports current users and eases further adoption and growth.
>
> Indu
>
