-Og is documented as:

  @option{-Og} should be the optimization level of choice for the
  standard edit-compile-debug cycle, offering a reasonable level of
  optimization while maintaining fast compilation and a good debugging
  experience.  It is a better choice than @option{-O0} for producing
  debuggable code because some compiler passes that collect debug
  information are disabled at @option{-O0}.
One of the things hampering that is that, as with the "normal" -O*
flags, the code produced by -Og -g must be the same as the code
produced without any debug IL at all.  There are many cases in which
that makes it impossible to stop useful values from being optimised
out of the debug info, either because the value simply isn't available
at runtime at the point that the debugger needs it, or because of
limitations in the debug representation.  (E.g. pointers to external
data are dropped from debug info because the relocations wouldn't be
honoured.)

I think it would be better to flip things around so that the debug IL
is always present when optimising at -Og, and then allow the debug IL
to influence codegen at -Og.  This still honours the -fcompare-debug
principle, and the compile speed of -Og without -g doesn't seem very
important.

This series therefore adds a mode in which debug stmts and debug insns
are present even without -g and are explicitly allowed to affect
codegen.  In particular, when this mode is active:

- uses in debug binds become first-class uses, acting like uses in
  executable code

- the use of DEBUG_EXPR_DECLs is banned.  If we want to refer to a
  temporary value in debug binds, we need to calculate the value with
  executable code instead

This needs a new term to distinguish stmts/insns that affect codegen
from those that don't.  I couldn't think of one that I was really
happy with, but possibilities included:

  tangible/shadow
  manifest/hidden
  foreground/background
  reactive/inert
  active/dormant (but "active insn" already means something else)
  peppy/sullen

The series uses tangible/shadow.  There's a new global
flag_tangible_debug that controls whether debug insns are "tangible"
insns (for the new mode) or "shadow" insns (for normal optimisation).
-Og enables the new mode while the other optimisation levels leave it
off.  (Despite the name, the new variable is just an internal flag;
there's no -ftangible-debug option.)
The first patch adds the infrastructure but doesn't improve the debug
experience much on its own.  As an example of one thing we can do with
the new mode, the second patch ensures that the gimple IL has debug
info for each is_gimple_reg variable throughout the variable's
lifetime.  This fixes a couple of the PRs in the -Og meta-bug and from
spot-testing seems to ensure that far fewer values are optimised out.

Also, the new mode is mostly orthogonal to the optimisation level
(although it would in effect disable optimisations like loop
vectorisation, until we have a way of representing debug info for
vectorised loops).  The third patch therefore adds an -O1g option that
optimises more heavily than -Og but provides a better debug experience
than -O1.  I think -O2g would make sense too, and would be a viable
option for people who want to deploy relatively heavily optimised
binaries without compromising the debug experience too much.

Other possible follow-ons for the new mode include:

- Make sure that tangible debug stmts never read memory or take an
  address.  (This is so that addressability and vops depend only on
  non-debug insns.)

- Fall back on expanding real code if expand_debug_expr fails.

- Force debug insns to be simple enough for dwarf2out (e.g. no
  external or TLS symbols).  This could be done by having a
  validation step for debug insns, like we already do for normal
  insns.

- Prevent the removal of dead stores if it would lead to wrong debug
  info.  (Maybe under control of an option?)

To get an idea of the runtime cost, I tried compiling tree-into-ssa.ii
at -O2 -g with various --enable-checking=yes builds of cc1plus:

                                     time taken
  cc1plus compiled with -O0:        100.00% (baseline)
  cc1plus compiled with old -Og:     30.94%
  cc1plus compiled with new -Og:     31.82%
  cc1plus compiled with -O1g:        28.22%
  cc1plus compiled with -O1:         26.72%
  cc1plus compiled with -O2:         25.15%

So there is a noticeable but small performance cost to the new mode.
To get an idea of the compile-time impact, I tried compiling
tree-into-ssa.ii at various optimisation levels, all using the same
--enable-checking=release bootstrap build:

                                        time taken
  tree-into-ssa.ii with -O0 -g:        100.0% (baseline)
  tree-into-ssa.ii with old -Og -g:    180.6%
  tree-into-ssa.ii with new -Og -g:    198.2%
  tree-into-ssa.ii with -O1g -g:       237.1%
  tree-into-ssa.ii with -O1 -g:        211.8%
  tree-into-ssa.ii with -O2 -g:        331.5%

So there's definitely a bit of a compile-time hit.  I haven't yet
looked at how easy it would be to fix.

What do you think?  Is it worth pursuing this further?

Of course, even if we do do this, it's still important that the debug
info for things like -O2 -g is as good as it can be.  I just think
some of the open bugs against -Og fundamentally can't be fixed
properly while -Og remains a cut-down version of -O1.

Thanks,
Richard