On Dec 18, 2007, Alexandre Oliva <[EMAIL PROTECTED]> wrote:

> On Dec 17, 2007, Diego Novillo <[EMAIL PROTECTED]> wrote:

>> On 12/17/07 19:50, Alexandre Oliva wrote:

>>> Now, since you're so interested in it and you've already read the
>>> various perspectives on the issue that I listed in my yesterday's
>>> e-mail to you, would you help me improve this document, by letting me
>>> know what you believe to be missing from the selected postings on
>>> design strategies, rationales and goals:
>> No. I am not interested in organizing your thoughts for you.

> Wow, nice shot!

Rats, this below-the-waistline attack really got me annoyed. So annoyed that I spent the night writing up this consolidated design document.

So, what do you say now?
A plan to fix local variable debug information in GCC

by Alexandre Oliva <[EMAIL PROTECTED]>
2007-12-18 draft


== Introduction

The DWARF Debugging Information Format, version 3, determines the ways a compiler can communicate the location of user variables at run time to debug information consumers such as debuggers, program analysis tools, run-time monitors, etc.

One possibility is that the location of a variable is fixed throughout the execution of a function. This is generally good enough for unoptimized programs.

For optimized programs, however, the location of a variable can vary. The variable may be live for some parts of a function, even in multiple locations simultaneously. At other points, it may be completely unavailable, or its value may still be computable even though no location actually holds it. The encoding, in these cases, can be a location list: tuples with possibly-overlapping ranges of instructions, and location expressions that determine a location or a value for the variable in each range.

Historically, GCC started with the simpler, fixed-location model. In fact, back then, there were no debug information formats that could represent anything better than this. More recently, GCC gained code to keep track of varying locations and to emit debug information accordingly.

Unfortunately, many optimization passes discard information that would be necessary to emit correct and complete variable location lists. Coalescing, scalarizing, substituting, propagating and many other transformations prevent the late-running variable tracker from doing an accurate job. By the time it runs, many variables no longer show up in the retained annotations, although they're still conceptually available. The variable tracker can't tell when a user variable overlaps with another, and it can't tell when a variable is overwritten, if the assignment is optimized away.

These limitations are inherent to a model based on inspecting the actual code and trying to make inferences from it. In order to represent not only what remained in the code, but also what was optimized, combined or otherwise apparently removed, additional information needs to be kept around. This paper describes an approach to maintaining this information.

== Goals

* Ensure that, for every user variable for which we emit debug information, the information is correct, i.e., if it says the value of a variable at a certain instruction is at certain locations, or is a known constant, then the variable must not be at any other location at that point, and the locations or values must match reasonable expectations based on source code inspection.

* Defining "reasonable expectations" is tricky, for the code reordering typical of optimization makes room for numerous surprises. I don't have a precise definition for this yet, but saying that a variable holds a value it couldn't possibly hold (e.g., because it is only assigned that value in a code path that is known not to be taken) is a clear indication that something is amiss. The general guiding rule is: if we aren't sure the information is correct (or we're sure it isn't), we shouldn't pretend that it is.

* Try to ensure that, if the value of a variable is a known constant at a certain point in the program, this information is present in debug information.

* Try to ensure that, if the value of a variable is available or computable at any location at a certain point in the program, this information is present in debug information.
* Stop missing optimizations for the sake of preserving debug information.

* Avoid using, when compiling without generating debug information, additional memory and CPU cycles that would be needed only for debug information.

== Internal Representation

For historical reasons, GCC has two completely different, even if nearly isomorphic, internal representations: trees and RTL. This decision has required a lot of code to be duplicated for low-level manipulation and simplification of each of these representations.

Since tracking variables and their values must start early to ensure correctness, and be carried throughout the complete optimization process, it might seem tempting to introduce yet another representation for debug information, into which both isomorphic representations would decay. The drawbacks would be additional duplication of internal representation manipulation code, and the possibility of increased memory use from the need to represent information in yet another format.

Another concern is that even the simplest compiler transformations may need to be reflected in debug information. This might indicate a need to modify every point of transformation in every optimization pass so as to propagate information into the debug information representation. This is undesirable, because it would be very intrusive.

But then, keeping references to the correct values, expressions or variables as transformations are made is precisely what optimization passes already have to do to perform their jobs correctly. Finding a way to take advantage of this is a very non-intrusive way of keeping debug information accurate. In fact, most transformations wouldn't need any changes whatsoever: uses of variables in debug information can, in most optimization passes, be handled just like any other uses.

Once this is established, a possible representation becomes almost obvious: statements (in trees) or instructions (in RTL) that assert, to the variable tracker, that a user variable or member is represented by a given expression:

  # DEBUG var expr

By var, we mean a tree expression that denotes a user variable, for now. We envision trivially extending it to support components of variables in the future. By expr, we mean a tree or RTL expression that computes the value of the variable at the point in which the statement or instruction appears in the program.

A special value needs to be specified, for each representation, that denotes a location or value that cannot be determined or represented in debug information, for example the location of a variable that was completely optimized away. It might be useful to represent the expression as a list of expressions, and to distinguish lvalues from rvalues, but for now let's keep this simple.

== Generating debug information

Generating the initial annotations when entering SSA form is early enough in the translation process that the program still reflects the original source code very reliably.

Annotations are only generated for user variables that are GIMPLE registers, i.e., variables that represent scalar values and never have their address taken. Other kinds of variables don't have varying locations, so we don't need to worry about them.

After every assignment to such a variable, we emit a DEBUG statement that preserves, throughout compilation, the information that, at that point, the assigned variable was represented by that expression.
So, after turning an assignment such as the following into SSA form, we emit the debug statement shown below right after it:

  x_1 = whatever;
  # DEBUG x x_1

Likewise, at control flow merge points, for each PHI node we introduce in the SSA representation, we emit an annotation:

  # x_4 = PHI <x_1(3), x_2(4), x_3(7)>;
  # DEBUG x x_4

Then, we let the tree optimizers do their jobs. Whenever they rename, renumber, coalesce, combine or otherwise optimize a variable, they automatically update the debug statements that mention it as well. In the rare cases in which the presence of such a statement might prevent an optimization, we need to adjust the optimizer code such that the optimization is not prevented. This most often amounts to skipping or otherwise ignoring debug statements. In a few very rare cases, special code might be needed to adjust debug statements manually.

After transformation to RTL, the representation needs translation, but conceptually it's still the same: a mapping from variable to expression. Again, optimizers will most often adjust debug instructions automatically.

The exceptions can be handled at no cost: the test for whether an element of the instruction stream is an instruction, or some kind of note that never needs updating, is, in its optimized form, a range test. By placing the identifier for a debug instruction at one of the limits of this range, testing for either range requires identical code, except for the constants. Since most code that tests for INSN_P and handles instructions can and should match debug instructions as well, in order to keep them up to date, we extend INSN_P so as to match debug instructions, and modify the exceptions, which need to skip debug instructions, to use an alternate test with the same meaning as the original definition of INSN_P. These simple and non-intrusive changes are relatively common, but still, by far, the exception rather than the rule.

When optimizations are completed, including register allocation and scheduling, it is time to pick up the debug instructions and emit debug information out of them. Conceptually, the debug instructions represent points of assignment, at which a user variable ought to evaluate to the annotated expression, maintained throughout compilation. However, when the value of a variable is live at more than one location, it is important to note all of them, such that, if a debugging session attempts to modify the variable, all copies are modified.

The idea is to use some mechanism to determine equivalent expressions throughout a function (say, some variant of Global Value Numbering). At debug instructions, we assert that the value of the named variable is in the equivalence class represented by the expression. As we scan basic blocks forward and find that expressions in an equivalence class are modified, we remove them from the equivalence class, and thus from the list of available locations for the variable. When such expressions are further copied, we add the copies to the equivalence classes. At function calls and volatile asm statements, we remove non-function-private memory slots from equivalence classes. At function calls, we also remove call-clobbered registers from equivalence classes. When no live expression remains in the equivalence class that represents a variable, it is understood that its value is no longer available. At basic block confluences, we combine information from the end states of the incoming blocks and the debug statements added as a side effect of PHI nodes.

The end result is accurate debug information.
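As a rough illustration of this forward scan (the register names and the exact annotation syntax below are only for exposition, and r5 is assumed to be a call-clobbered register), the set of locations recorded for a variable i might evolve like this within a basic block:

  (debug (var_location i (reg r3)))             ;; i is available in { r3 }
  (set (reg r5) (reg r3))                       ;; copy: i is available in { r3, r5 }
  (set (reg r3) (plus (reg r3) (const_int 1)))  ;; r3 modified: i is available in { r5 }
  (call (mem (symbol_ref f)))                   ;; r5 is call-clobbered: { }

After the call, no live expression remains in i's equivalence class, so, unless a later debug instruction binds i again, its value is reported as unavailable from that point on.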
Except for transformations that require special handling to update debug annotations properly, debug information should also come out as complete as possible.

== Testability

Since debug annotations are added early and, in most cases, kept up to date by the same code that optimizers use to keep the executable code up to date, debug annotations are likely to remain accurate throughout compilation.

The risk of this approach is that the annotations get in the way of optimizations, causing the executable code to vary depending on whether or not debug information is to be generated. That risk could be removed by generating and maintaining debug annotations throughout compilation unconditionally and just throwing them away at the end, but this is undesirable, for it would slow down compilation without debug information and waste memory while at it.

Therefore, we've built testing mechanisms into the compiler to detect cases in which the presence of debug annotations would cause code changes. The bootstrap-debug Makefile target, by default, compiles the second bootstrap stage without debug information and the third bootstrap stage with it, and then compares all object files after stripping them, a process that discards all debug information. Furthermore, bootstrap4-debug, after bootstrap-debug and prepare-bootstrap4-debug-lib-g0, rebuilds all target libraries without debug information, and compares them with the stage3 target libraries, built with debug information. At the time of this writing, both tests pass on x86_64-linux-gnu and i686-linux-gnu, and ppc64-linux-gnu and ia64-linux-gnu are getting close.

Additional testing mechanisms should be built in, to exercise a wider range of internal GCC behaviors and extensions, for example by comparing the compiler output with and without debug information while compiling its entire testsuite. Even if the testing mechanisms fail to catch an error, the generation of debug annotations is controlled by a command-line option, such that any code changes caused by it can be easily avoided, at the expense of the quality of the debug information.

Testing for accuracy and completeness of debug information is best accomplished using a debugging environment. For example, writing programs of increasing complexity, adding function-call or asm probe points to stabilize the internal execution state, and then examining the state of the program at these probe points in a debugger, will tell us how accurate and how complete variable location information is (a small sketch of such a probe-point test appears at the end of this section).

Measuring accuracy is easy: if you ask for the value of a variable and get a value other than the expected one, there's a bug in the compiler. If you get "unavailable", this can still be regarded as accurate, for locations are always optional; however, it might be incomplete. Telling whether the variable was indeed optimized away, or whether its value is available or computable but the information is missing, is a harder problem, but it belongs to the completeness test rather than the accuracy test.

The completeness score of an unoptimized program might very often be unachievable for optimized programs, not because the compiler is doing a poor job at maintaining debug information, but rather because it is doing a good job at optimizing the program, to the point that it is no longer possible to determine the value of the inspected variable.
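To make the probe-point idea above concrete, here is a minimal sketch of such a test program (the function and its contents are made up for illustration; the empty asm statements merely force i to be live, in a register, at known points, so that a debugger stopped there can be asked for its value):

  /* Hypothetical probe-point test: break on each asm in a debugger and
     print i. With accurate debug information, the first probe reports
     x + 1 and the second reports x * 2, or possibly "value optimized
     out", but never a stale or wrong value.  */
  int
  probe_points (int x)
  {
    int i = x + 1;
    __asm__ __volatile__ ("" : : "r" (i));  /* probe: i == x + 1 here */
    i = x * 2;
    __asm__ __volatile__ ("" : : "r" (i));  /* probe: i == x * 2 here */
    return i;
  }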
== Concerns

=== Memory consumption

Keeping more information around requires more memory; information theory tells us that there's only so much information you can fit in a bit. In order to generate correct debug information, more information needs to be retained throughout compilation. The only way to arrange for debug information not to require any additional memory would be to waste that memory even when not generating debug information, which is undesirable. Therefore, the better debug information we want, the more memory overhead we're going to have to tolerate.

Of course, at times we can trade memory for efficiency, using more computationally expensive representations that are more compact.

At other times, we may trade memory for maintainability. For example, instead of emitting annotations as soon as we enter SSA form, we could emit them on demand, i.e., whenever we deleted, moved or significantly modified an SSA assignment for which we would have emitted a debug annotation. Additional memory would be needed to mark assignments that should have gained annotations but haven't, and care must be taken to ensure that transformations aren't made without leaving a correct debug statement in place. It is not clear that this would save significant memory, for a large fraction of the relevant assignments are modified or moved anyway, so it might very well be a maintainability loss and a performance penalty for no measurable memory gain.

Worst case, we may trade memory for debug information quality: if the memory use of this scheme is too high for some scenario, one can disable debug information annotations through a command-line option, or disable debug information altogether.

=== Intrusiveness

Given that nearly all compiler transformations would need to be reflected in debug information, any solution that doesn't take advantage of this fact is bound to require changes all over the place. Perhaps not so much for Tree-SSA passes, which are relatively well-behaved and use a narrow API to make transformations, but very clearly so for RTL passes, which very often modify instructions in place, and at times even reuse locations assigned to user variables as temporaries.

Even when we do use the strength of the optimizers to keep debug information up to date, there are exceptions in which detailed knowledge about the transformation taking place enables us to adjust the annotations properly, if possible, or to discard location information for the variable otherwise. It is just not realistic to hope that the information can be kept accurate throughout compilation without any effort from optimizers, or even through a trivial API for a debug information generator: a number of the exceptions that require detailed knowledge about the ongoing transformation would be indistinguishable, to such an API, from other common transformations that have very different effects on debug information. At this point, any expectation of lower intrusiveness through such an API vanishes.

By letting optimizers do their jobs on debug annotations, and handling exceptions only at the few locations where they are needed, trivially in most such cases, we keep intrusiveness at a minimum. Of course, we could get even lower intrusiveness by accepting errors in debug information, or by accepting different generated code depending on debug information command-line options, but these options shouldn't be considered seriously.

=== Complexity

The annotations are conceptually trivial, and they can be immediately handled by optimizers.
It is hard to imagine a simpler design that would still enable us to get right cases such as those in the examples below.

Worrying about the representation of debug annotations as statements or instructions, rather than notes, misses the fact that, most of the time, we do want them to be updated just like statements and instructions.

Worrying about representing debug annotations in-line, rather than in an on-the-side representation, is a valid concern, but it is addressed by the testability of the design, and the in-line representation is highly advantageous: not only does it let the optimizers keep debug information accurate, it also does away with the need for yet another internal representation and all the effort of keeping it accurate.

=== Optimizations

Correct and more complete debug information isn't supposed to disable optimizations. Keep in mind that enabling debug information isn't supposed to modify the executable code in any way whatsoever.

The goal is to ensure that whatever debug information the compiler generates actually matches the executable code, and that it is as complete as viable. The goal is not to disable optimizations so as to preserve variables or code, such that they can be represented in debug information and provide a debugging experience closer to that of unoptimized code. If debug information disables any optimization, that's a bug that needs fixing.

Now, while testing this design, a number of optimization opportunities that GCC missed were detected and fixed, others were merely detected, and at least one optimization shortcoming kept in place in order to get better debug information could be removed, for the new debug information infrastructure enables the optimization to be applied to its fullest extent.

== Examples

It is desirable to be able to represent constants and other optimized-away values, rather than stating that variables have values they can no longer have:

  int x1 (int x)
  {
    int i;
    i = 2;
    f(i);
    i = x;
    h();
    i = 7;
    g(i);
  }

Even if variable i is completely optimized away, a debugger can still print the correct values for i if we keep annotations such as:

  (debug (var_location i (const_int 2)))
  (set (reg arg0) (const_int 2))
  (call (mem (symbol_ref f)))
  (debug (var_location i unknown))
  (call (mem (symbol_ref h)))
  (debug (var_location i (const_int 7)))
  (set (reg arg0) (const_int 7))
  (call (mem (symbol_ref g)))

In this case, before the call to h, not only was the assignment to i dead, but the value of the incoming argument x had already been clobbered. If i had been assigned another constant instead, debug information could easily represent this.

Another example that covers PHI nodes and conditionals:

  int x2 (int x, int y, int z)
  {
    int c = z;
    whatever0(c);
    c = x;
    whatever1();
    if (some_condition)
      {
        whatever2();
        c = y;
        whatever3();
      }
    whatever4(c);
  }

With the SSA infrastructure, this program can be optimized to:

  int x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    whatever0(z_0(D));
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        whatever3();
      }

    # bb 3
    # c_1 = PHI <x_2(D)(1), y_3(D)(2)>;
    whatever4(c_1);
  }

Note how, without debug annotations, c is only initialized just before the call to whatever4. At all other points, the value of c would be unavailable to the debugger, possibly even wrong. If we were to annotate the SSA definitions forward-propagated into versions of c as applying to c, we'd end up with all of x_2, y_3 and z_0 applying to c throughout the entire function, in the absence of additional markers.
Now, with the annotations proposed in this paper, what is initially:

  int x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    c_4 = z_0(D);
    # DEBUG c c_4
    whatever0(c_4);
    c_5 = x_2(D);
    # DEBUG c c_5
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        c_6 = y_3(D);
        # DEBUG c c_6
        whatever3();
      }

    # bb 3
    # c_1 = PHI <c_5(1), c_6(2)>;
    # DEBUG c c_1
    whatever4(c_1);
  }

is optimized into:

  int x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    # DEBUG c z_0(D)
    whatever0(z_0(D));
    # DEBUG c x_2(D)
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        # DEBUG c y_3(D)
        whatever3();
      }

    # bb 3
    # c_1 = PHI <x_2(D)(1), y_3(D)(2)>;
    # DEBUG c c_1
    whatever4(c_1);
  }

and then, at every one of the inspection points, we get the correct value for variable c.

== Conclusion

This design enables a compiler to emit variable location debug information that complies with the DWARF version 3 standard, and that is likely to be as complete as theoretically possible, with an implementation that is conceptually simple, relatively easy to introduce, trivial to test and easy to maintain in the long run.

Not wasting memory or CPU cycles when compiling without debug information is a welcome bonus.
-- 
Alexandre Oliva                 http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member                 http://www.fsfla.org/
Red Hat Compiler Engineer             [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist                  [EMAIL PROTECTED], gnu.org}