On Nov 7, 2007, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

> Alexandre Oliva <[EMAIL PROTECTED]> writes:
>> I've pondered both alternatives, and decided that the latter was the
>> only testable path.  If we had a reliable debug information tester, we
>> could proceed incrementally with the first alternative; it might be
>> viable, but I don't really see that it would make things any simpler.
> It seems to me that this is a reason to write a reliable debug
> information tester.

Yep.  This is in the roadmap.  But it's not something that can be done
with GCC alone.  It's more of a "system" test, one that will involve
debuggers or monitoring tools; gdb, frysk, systemtap or some such come
to mind.

> Your approach gives you a point solution--did anything change
> today--but it doesn't give us a maintenance solution--did anything
> change over time?

Actually, no, your assessment is incorrect.  What I'm providing gives
us the means to test, at any point in time, that enabling debug
information won't cause changes to the generated code.

So far, code in the trunk only performs these comparisons within the
GCC directory, and, nevertheless, patches that correct obvious
divergences have been lingering for months.  I have recently posted
patches that introduce means to test other host and target libraries.
I still haven't written testsuite code to let us verify that debug
information doesn't affect the generated code for existing tests, or
for additional tests introduced for this very purpose, but this is in
the roadmap.

Of course, none of this guarantees that debug information is accurate
or complete; it just helps ensure that -g won't change code
generation.  Testing more than this requires a tool that can interpret
not only the debug information but also the generated code, and verify
that they match.  The plan is to use the actual processors (or
simulators) to understand the generated code, and existing debug info
consumers such as debugging or monitoring tools to verify that the
debug info reflects the behavior observed by the processor.

> While I understand that you were given certain requirements, for the
> purposes of mainline gcc we need to weigh costs and benefits.  How
> many of our users are looking for precise debugging of optimized code,
> and how much are they willing to pay for that?  Will our users overall
> be better served by the 90% solution?
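The "-g must not change the generated code" check described above can be
sketched as a small shell test.  This is only an illustration of the
idea, not the actual mechanism used in the trunk; the file name, flags,
and comparison method are my own choices:

```shell
# Compile the same unit with and without -g, then compare disassembly.
# Comparing disassembly (rather than raw objects) ignores debug sections.
cat > t.c <<'EOF'
int f (int x) { return x * 2 + 1; }
EOF
gcc -O2 -c t.c -o t-plain.o
gcc -O2 -g -c t.c -o t-debug.o
objdump -d t-plain.o | sed '/file format/d' > plain.dis
objdump -d t-debug.o | sed '/file format/d' > debug.dis
# If -g changed code generation, diff exits nonzero and this fails.
diff plain.dis debug.dis && echo "code identical"
```

Any divergence between the two disassemblies is, by the argument above,
a bug: debug information is supposed to be pure meta-information.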
Does it really matter?  Do we compromise standards compliance (and so
violently, at that) in any other aspect of the compiler?

What do we tell the growing number of users who don't regard debug
information as something useless except for occasional debugging?
That GCC cares about standards compliance except for debug
information, and that they should write their own Free Software
compiler if they want a correct, standards-compliant one?

Do we accept taking shortcuts for optimizations or other code
generation issues when they cause incorrect code to be produced?  Why
should the mantra "must not sacrifice correctness" not apply to debug
information standards in GCC?

At this point, debug information is so bad that it's a shame that most
builds are done with -O2 -g: we're just wasting CPU cycles and disk
space, contributing to accelerating the thermodynamic end of the
universe (nevermind the Kyoto protocol ;-), for information that is
severely incomplete at best, and terribly broken at worst.

Yes, generating correct code may take some more memory and some more
CPU cycles.  Have we ever made a decision to use less memory or CPU
cycles when the result is incorrect code?  Why should standardized
meta-information about the generated code be any different?

>> 1. every single gimple assignment grows by one word,

I take this back; I'd been misled by richi's description.  It's really
a side hashtable (which gets me worried about the re-emitted rather
than modified gimple assignments in some locations), so it doesn't
waste memory for gimple assignments that don't refer to user
variables.  Unfortunately, this is not the case for rtx SETs in this
alternate approach.

> I don't know what the best approach is for improving debug
> information.

Your phrasing seems to indicate you're not concerned about fixing
debug information, but rather only about making it less broken.  With
different goals, we can come to very different solutions.
> But I think we've learned over time that explicit NOTEs
> in the RTL was not, in general, a good idea.  They complicate
> optimizations and they tend to get left behind when moving code.

Being left behind is actually a feature.  It's one of the reasons why
I chose this representation.  The debug annotation is not supposed to
move along with the SET: if it did, it would no longer model the
source code; it would instead be mangled, often beyond recognition, by
implementation details.

As for complicating optimizations, I have some sympathy for that.
Sure, generating code without preserving the information needed to map
source-level concepts to implementation-level concepts is easier.  But
generating broken code is not an option, it's a bug, so why should it
be an acceptable option just because the code we're talking about is
meta-information about the executable code?

> We've fixed many many bugs and misoptimizations over the years due to
> NOTEs.  I'm concerned that adding DEBUG_INSN in RTL repeats a mistake
> we've made in the past.

That's a valid concern.  However, by this reasoning we might as well
push every operand in our IL out to separate representations, because
there have been so many bugs and misoptimizations over the years,
especially when the representation didn't make transformations
trivially correct.

The beauty of the representation I've chosen, which models the
annotations as a weak USE of an expression that evaluates to the value
of the variable at the point of assignment, is that most compiler
passes *will* keep them accurate, where any other representation would
have to be dealt with explicitly.  Sure, some passes need to
compensate to make sure these weak USEs don't affect codegen or
optimizations, and a few need special tweaks to keep the notes
accurate, stopping the safeguards that would otherwise discard
information that had become inaccurate.  But these are few.  I
strongly believe this is the correct trade-off.
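A rough, purely illustrative sketch of what such a weak-USE annotation
might look like in an RTL-style dump may help; the insn numbers,
register numbers, and layout below are invented for exposition and are
not actual GCC dump output:

```
;; Illustrative sketch only -- not actual GCC dump syntax.
;; The real assignment to the pseudo holding user variable x:
(insn 10 9 11 (set (reg:SI 60 [ x ])
                   (plus:SI (reg:SI 58) (reg:SI 59))))
;; The weak USE: "at this point, x has the value of reg 60".
;; It must not affect codegen, and it deliberately stays put:
(debug_insn 11 10 12 (var_location:SI x (reg:SI 60)))
;; If a later pass moves or deletes insn 10, the annotation remains
;; where the source-level assignment was; safeguards then mark x's
;; location unknown rather than emit a lying location.
```

The point of the sketch is the asymmetry: the SET is fair game for any
transformation, while the debug_insn anchors the source-level event.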
-- 
Alexandre Oliva            http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member            http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist    [EMAIL PROTECTED], gnu.org}