Re: Designs for better debug info in GCC

Alexandre Oliva Wed, 19 Dec 2007 22:10:57 -0800

On Dec 19, 2007, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

> Alexandre Oliva <[EMAIL PROTECTED]> writes:
>> You snipped (skipped?) one aspect of the reasoning on why it is
>> appropriate.  Of course this doesn't prove it's the best possibility,
>> but I haven't seen evidence of why it isn't.


> You will find it easier to demonstrate the worth of your proposal if
> you act publicly as though your interlocutors are people of good
> will, even when it doesn't seem that way to you, and omit
> interjections like "(skipped?)".

Sorry, I didn't mean it in a demeaning tone.  I realize I should have
been more careful, given the heat of the debate, for which I
apologize.

It so happens that I'm used to having texts I write skimmed
through rather than read in detail, so, when someone makes a point
that appears to disregard something that I write about, I tend to
assume that the person missed the portion in which I discussed it.
That was what the 'skipped?' was about.  I know I tend to pack too
much information in small spaces when I write (and I'm not proud of
it, mind you :-), so having readers miss points I did try to address
is unfortunately quite common.

Again, I apologize for not realizing this could be interpreted in a
different way than the one I meant.  It was indeed inappropriate.

> To be sure we are on the same page, I think your argument here is that
> with this code:

> int f(int x, int y) {
>   int i = 0, j = 0;

>   probe1();
>   i = x;
>   j = y;
>   probe2();
>   if (x < y)
>     i += y;
>   else
>     j -= x;
>   probe3();
>   return g (i, j);
> }

> if I set a breakpoint just before the call to probe2(), and I print
> the values of 'i' and 'j', I should get the values of 'x' and 'y'.
> That is, you want to emit a DWARF variable note at that point that the
> value of 'i' can be found in the location corresponding to 'x'.

Yep.  That would be correct and complete.  It would also be
acceptable, but undesirable, to emit information to the effect that
the locations of 'i' and 'j' are unknown at those points; for this
would be correct, even if incomplete.
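
For illustration only, here is a minimal sketch of the difference
between the two outcomes.  It is not GCC or DWARF code: the struct, the
function name and the addresses are all made up for this example.

```c
#include <stddef.h>

/* One entry of a hypothetical, much simplified location list: for
   program counters in [lo, hi) the variable's value is found at WHERE.
   WHERE is NULL when the location is recorded as unknown.  */
struct loc_range {
  unsigned long lo, hi;
  const char *where;
};

/* Return where to find the variable at PC, or NULL when its location
   is unknown there (a debugger would report it as optimized out).  */
const char *
lookup_location (const struct loc_range *list, size_t n, unsigned long pc)
{
  for (size_t k = 0; k < n; k++)
    if (pc >= list[k].lo && pc < list[k].hi)
      return list[k].where;
  return NULL;
}

/* Made-up addresses: 0x10 is the call to probe2, 0x18 the compare
   that follows it.  */

/* Correct and complete: 'i' is found wherever 'x' lives.  */
const struct loc_range i_complete[] = {
  { 0x10, 0x18, "same location as x" },
};

/* Correct but incomplete: the location of 'i' is simply unknown.  */
const struct loc_range i_incomplete[] = {
  { 0x10, 0x18, NULL },
};
```

Both lists are correct in the sense used above, since neither ever
claims a wrong value for 'i'; only the first lets a debugger actually
print it at the call to probe2().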

> Of course there are no actual instructions between the calls to
> probe1() and probe2().  If I use gdb's "finish" command out of
> probe1(), what values should I see for 'i' and 'j' at that point?
> Arguably I am now before the assignment statements, and should see '0'
> and '0', the values that 'i' and 'j' have before they are changed.  Of
> course, this is the same location as the breakpoint before probe2(),
> and we can't see both '0'/'0' and 'x'/'y'.  So it seems to me that
> this situation is actually somewhat ambiguous.  I don't see an
> obviously correct answer.

Dan has dealt with this point, but, if it floats your boat, you can
disregard any hope of getting it right between probe1() and probe2(),
since there aren't instructions in between them, and focus on getting
it right at probe2() or while probe2() is active in a lower stack
frame.

> I think the general issue you are describing is how to handle an
> assignment which appears in user code but which has been eliminated
> during optimization.

Yes, this is a way to describe it.

I'm addressing this in a bit more detail in a revised version of the
spec, that I intend to publish in the GCC wiki RSN.

> It seems to me that such eliminated assignments are inherently
> ambiguous.  If the assignment is gone, then there is a point in the
> generated code where the variable logically has both the old and the
> new values.  I assume that the debugger can only display one value.
> Which one should it be?

I don't think this characterization is correct.  There are points that
are logically before the removed assignment, and there are points that
are logically after it.  If we actually emitted a nop for the removed
assignment, then we could single-step through it and observe the
change in the logical variable even though no observable change
occurred in the program state (other than the advance of the PC past
this nop).  Except that, in the implementation plan I have in mind,
the observable change would quite often be from "unknown value" to
"assigned value", because the location holding the previous value will
likely have already been overwritten when we reach the debug insn.

> Consider a series of assignments to a local variable, and suppose
> that all the assignments are deleted because they are unused.  Are
> there dependencies between the DEBUG notes which keep them in the
> right order?

There ought to be, for sure, such that the last one prevails.
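
As a toy model of why the order matters (a sketch only; the struct,
the stream and the function name are hypothetical and do not reflect
how the vta branch represents debug insns), consider f() from above
with its assignments optimized away and only the debug binds left:

```c
#include <stddef.h>
#include <string.h>

struct insn {
  int is_debug;        /* nonzero: a debug bind, emits no machine code */
  const char *var;     /* debug bind: the user variable being bound */
  const char *value;   /* debug bind: its value, or NULL if unknown */
};

/* Hypothetical stream for f() after the assignments were removed.  */
const struct insn f_stream[] = {
  { 1, "i", "0" },      /* 0: int i = 0 */
  { 1, "j", "0" },      /* 1: int j = 0 */
  { 0, NULL, NULL },    /* 2: call probe1 */
  { 1, "i", "x" },      /* 3: i = x (assignment removed) */
  { 1, "j", "y" },      /* 4: j = y (assignment removed) */
  { 0, NULL, NULL },    /* 5: call probe2 */
};

/* Return the binding of VAR in effect just before stream[pos]: the
   last debug bind for VAR among the earlier entries prevails.  */
const char *
binding_before (const struct insn *stream, size_t pos, const char *var)
{
  const char *cur = NULL;

  for (size_t k = 0; k < pos; k++)
    if (stream[k].is_debug && strcmp (stream[k].var, var) == 0)
      cur = stream[k].value;
  return cur ? cur : "<unknown>";
}
```

With this stream, binding_before (f_stream, 2, "i") yields "0" and
binding_before (f_stream, 5, "i") yields "x".  Swapping entries 0 and
3 would make the stale bind win at probe2, which is why the relative
order of binds for the same variable has to be preserved.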

> Presumably we do not have the goal of emitting correct debug
> information in between line notes

I do.  Stack traces, for one, are seldom taken at line note
boundaries, for stack frames other than the top active one.  If we
didn't have correct debug information at those points, monitors
wouldn't be able to do a correct job.  Going from that to backtraces
that cross signal handling frames makes it only slightly more complex,
from a theoretical standpoint.  I.e., I don't see that solving the
problem such that it addresses the apparently-simpler requirement
would take significantly less implementation effort than solving the
apparently-more-complex requirement.

> I wonder whether it would be feasible for the debug info generation to
> work from the assignments in the source code as generated by the
> frontend.  For each assignment, we would find the corresponding line
> note.  Then we would look at the right hand side, and try to identify
> where that value could be found at that point in the program.  This
> would be a variant of our current variable tracking pass.  I haven't
> thought about this enough to know whether it would really work.

I've been giving something along these lines some thought, but it's a
bit more elaborate, and I'm not ready to present even a draft of my
thoughts on this topic.  And I unfortunately may have to discuss it
with lawyers before I can do anything concrete about it.

> That will only work correctly if sched-deps.c introduces dependencies
> between debug insns and real insns.

Yep, it does; have a look at the vta branch.  In fact, sched is the
pass that has given me the most headaches to get bootstrap-debug to
pass.

> If you introduce those dependencies, I don't understand how you will
> avoid changing the scheduler's behaviour in the presence of debug
> insns.  How did you work around that problem?

Debug insns don't use any actual machine resources, and they sort of
always fit, so the scheduler can accept them as soon as they become
ready, without changing any other internal state.  I haven't
introduced explicit deps among debug insns, because I get the
impression that they're implied by the original instruction order and
the fact that, if two debug insns become simultaneously ready, there's
nothing that would reorder them (sorting is stable).
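
The idea can be sketched with a toy list scheduler.  This is an
illustration under simplifying assumptions, not the code in
sched-deps.c or on the vta branch: the struct, the priorities and the
insn names are invented, each insn has at most one dependence, and
that dependence must point at an earlier insn.

```c
#include <string.h>

struct insn {
  const char *name;
  int is_debug;   /* nonzero for a debug insn */
  int dep;        /* index of an EARLIER insn this one waits for, or -1 */
  int priority;   /* real insns only: higher issues first */
};

/* Hypothetical block: two debug insns hang off real insns.  */
const struct insn demo[] = {
  { "ld_a",  0, -1, 1 },
  { "dbg_x", 1,  0, 0 },   /* binds x once ld_a has issued */
  { "ld_b",  0, -1, 3 },
  { "add_c", 0,  2, 2 },
  { "dbg_y", 1,  3, 0 },   /* binds y once add_c has issued */
};
enum { DEMO_LEN = sizeof demo / sizeof demo[0] };

static int
is_ready (const struct insn *insns, const int *done, int i)
{
  return insns[i].dep < 0 || done[insns[i].dep];
}

/* Toy scheduler for at most 16 insns.  Appends the names of the REAL
   insns, in issue order, to OUT (each followed by a space).  When
   WITH_DEBUG is zero the debug insns are treated as absent.  */
void
schedule (const struct insn *insns, int n, int with_debug, char *out)
{
  int done[16] = { 0 };
  int remaining = n;

  out[0] = '\0';
  if (!with_debug)
    for (int i = 0; i < n; i++)
      if (insns[i].is_debug)
        {
          done[i] = 1;
          remaining--;
        }

  while (remaining > 0)
    {
      /* Debug insns use no machine resources: accept every ready one
         right away, in original order, touching no other state.  */
      for (int i = 0; i < n; i++)
        if (!done[i] && insns[i].is_debug && is_ready (insns, done, i))
          {
            done[i] = 1;
            remaining--;
          }

      /* One issue slot per step for real insns: highest priority
         wins, ties go to the earlier insn.  */
      int best = -1;
      for (int i = 0; i < n; i++)
        if (!done[i] && !insns[i].is_debug && is_ready (insns, done, i)
            && (best < 0 || insns[i].priority > insns[best].priority))
          best = i;
      if (best >= 0)
        {
          done[best] = 1;
          remaining--;
          strcat (out, insns[best].name);
          strcat (out, " ");
        }
    }
}
```

Calling schedule (demo, DEMO_LEN, 1, buf) and schedule (demo,
DEMO_LEN, 0, buf) with a large enough buffer gives "ld_b add_c ld_a "
both times: the debug insns depend on real insns but never the other
way around, and accepting them changes nothing the real insns can
observe.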

That said, I'm pretty sure I still have some scheduling issues to sort
out.  Trying to get bootstrap-debug to pass on ppc64 and ia64 has
exposed a number of scheduling issues, but IIRC almost all of them
were in the machine-specific scheduling code, that needed adjusting to
tolerate debug insns without internal state changes.  But I may still
be missing additional tweaks to the machine-independent scheduling
code.

> Personally, I would like to see that testsuite first.  That will give
> us an operational definition to aim for, rather than a theoretical
> discussion which I find to be ambiguous.

The two examples at the end of the design document are sort of meant
as a starting point for the testsuite.  As we discuss further
interesting examples, I'll probably add them, if not to the document,
to some collection of interesting debug info testcases.

I'm not ready to spend time figuring out the precise incantations to
automate these tests yet, but contributions along these lines would
obviously be welcome.  As for myself, I need to complete the design of
the GVN-like algorithm to turn RTL debug insns into var tracking
notes, which is currently underspecified.  Once that's done, we'll be
able to start testing things more seriously, and polishing the
heuristics that are going to be needed to decide between lvalue
location or rvalue for variables, partitioning lvalues that happen to
be in the same value equivalence classes into different user
variables, this sort of stuff.  I think this will take some
experimentation to get a reasonable idea of what is right, or at least
reasonable.

> And it will avoid the problem of turning the testsuite into a
> regression testsuite rather than an accuracy testsuite.

Sorry, I don't understand what you mean here.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}
