On Nov 7, 2007, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

>> Does it really matter?  Do we compromise standards compliance (and
>> so violently, at that) in any aspect of the compiler?
> What standards are you talking about?

Debug information standards such as DWARF-3.

> I'm not aware of any standard for debuggability of optimized code.

I'm talking about standards that specify how a compiler should encode
meta-information about how source-code concepts map to the code it
generated.  See, for example, Section 2.6 of the DWARF-3
specification.  It says very little about optimization, but it does
specify what a DW_AT_location, if present, means.  It doesn't say
anything like "if a variable is available at a certain location most
of the time, you can emit a DW_AT_location that refers to that
location".  It says:

  Debugging information must provide consumers a way to find the
  location of program variables, determine the bounds of dynamic
  arrays and strings, and possibly to find the base address of a
  subroutine's stack frame or the return address of a subroutine.

See, it's not about debuggers, it's about consumers.  And it's an
obligation, not really an option (that said, DW_AT_location *is*
optional).  The specification then describes two forms of location
description:

  1. Location expressions, which are a language-independent
     representation of addressing rules of arbitrary complexity built
     from DWARF expressions.  They are sufficient for describing the
     location of any object as long as its lifetime is either static
     or the same as the lexical block that owns it, and it does not
     move throughout its lifetime.

  2. Location lists, which are used to describe objects that have a
     limited lifetime or change their location throughout their
     lifetime.

Nowhere does it state that "if the compiler can't quite keep track of
the location of a variable, it can be sloppy and emit just whatever
is simpler or appears to make sense".  It does say:

  Address ranges may overlap.  When they do, they describe a
  situation in which an object exists simultaneously in more than one
  place.
  If all of the address ranges in a given location list do not
  collectively cover the entire range over which the object in
  question is defined, it is assumed that the object is not available
  for the portion of the range that is not covered.

So it does make room for *some* sloppiness, after all.  That's what I
refer to as "incompleteness" of debug information.  If we fail to
keep track of where an object is, it's sort of OK (although
undesirable) to emit debug information that omits the location of the
object in certain program regions where it might be live.

However, it is not standard-compliant to emit information stating
that the object is available at certain locations if it is NOT really
there, or if it is available elsewhere, in addition to or instead of
the locations we've emitted.  That's what I refer to as
"incorrectness" of debug information.

Incorrectness in the compiler output is always a bug.  No matter how
hard it is to implement, or how resource-intensive the solution is,
arguing that we've made a trade-off and decided to generate wrong
output for this case doesn't make it any less of a bug.

Incompleteness is a completely different issue.  This is where we
*can* afford to make trade-offs.  Just like we can decide to omit
certain optimizations, or not to carry them out to the greatest
possible extent, or to experiment with various different heuristics,
we could afford to emit incomplete debug information; it's "just" a
quality-of-implementation issue.  But not incorrect debug
information; that's just a bug.

> gcc's users are definitely calling for a faster compiler.  Are they
> calling for better debuggability of optimized code?

This is not just about debuggability, as I've tried to explain from
the very beginning of this discussion, a couple of months ago.  Debug
information is not just about debuggers any more.  There are good
reasons why the DWARF-3 standard says "consumers" rather than
"debuggers".
It's no longer just a matter of convenience, recompile at -O0 if you
want to debug it.  It's a matter of correctness: various monitoring
tools now rely on this meta-information, and rightfully so.

>> > We've fixed many many bugs and misoptimizations over the years
>> > due to NOTEs.  I'm concerned that adding DEBUG_INSN in RTL
>> > repeats a mistake we've made in the past.
>>
>> That's a valid concern.  However, per this reasoning, we might as
>> well push every operand in our IL out to separate representations,
>> because there have been so many bugs and misoptimizations over the
>> years, especially when the representation didn't make
>> transformations trivially correct.

> Please don't use strawman arguments.

It's not a strawman, really.  A reference to an object within a debug
stmt or insn is very much like any other operand, in that most
optimizer passes must keep them up to date.  If you argue for pushing
them outside the IL, why would any other operands be different?

> As I understand your proposal, it materializes variables which were
> otherwise omitted from the generated program.  It doesn't address
> the other issues with debugging optimized code, like bouncing
> around between program lines.  Is that correct?  What else does
> your proposal do?

All it does is try to carry, throughout compilation, information
about what value the user is entitled to expect a variable to hold at
each point in the program.  That way, even if the compiler doesn't
retain something that represents only that variable through to the
end of compilation, we still have information about where, or at
least what, its value is, if it is available anywhere, and we can
include this piece of data in the debug information.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist    [EMAIL PROTECTED], gnu.org}