It's been a very long time since I started working in the var-tracking-assignments branch. It is finally approaching a state in which I'm comfortable enough to propose that it be integrated.
Alas, it's not quite finished yet and, unless it is merged, it might very well never be. New differences in final RTL between -g and -g0 keep popping up: I just found one more that went in as recently as last week (lots of -O3 -g torture testsuite failures in -fcompare-debug test runs). After every merge, I spent an inordinate amount of time addressing such -fcompare-debug regressions.

I'm not complaining. I actually enjoy doing that; it is fun. But it never ends, and it took time away from implementing the features that, for a long time, were missing.

I think it's close enough to ready now that I feel it is no longer unfair to ask others to share the burden of keeping GCC from emitting different code when given -g compared with -g0, a property we should have always ensured.

== What is VTA?

This project aims at getting GCC to emit debug information for local variables that is always correct, and as complete as possible.

By correct, I mean that, if GCC says a variable is at a certain location at a certain point in the program, that location must hold the value of the variable at that point.

By complete, I mean that, if the value of the variable is available somewhere, or can be computed from values available somewhere, then debug information for the variable should tell the debug information consumer how to obtain or compute it.

The key to keeping the mapping between SL (source-level) variables and IR objects from being corrupted or lost was to introduce explicit IR mappings that, on the SL side, remained stable fixed points and, on the IR side, were expressions that got naturally adjusted as part of the optimization process, without any changes to the optimization passes.

Alas, no changes to the passes would be too good to be true.
It was indeed true for several of them, but many that dealt with special boundary cases such as single or no occurrences of references to a value, or that counted references to make decisions, had to be taught how to disregard references that appeared in these new binding IR elements. Others had to be taught to disregard these elements when checking for the absence of intervening code between a pair of statements or instructions.

In nearly all cases, the changes were trivial, and the need for them was shown in -fcompare-debug or bootstrap-debug testing.

In a few cases, changes had to be more elaborate, for disregarding debug uses during analysis ended up requiring them to be explicitly adjusted afterwards. For example, substituting a set into its single non-debug use required adding code to substitute into the debug uses as well. In most of these cases, adjusting them would merely avoid loss of debug information. In a few, failing to do so could actually cause incorrect debug information to be output, but there are safety nets in place that avoid this at the SSA level, causing debug information to be dropped instead.

Overall, the number of changes to the average pass was ridiculously small, compared both with the amount of code in the pass and with the amount of code that would have to be added for the pass to update debug info mappings as it performs its pass-specific transformations. It might be possible to cover some of these updates by generic code, but it's precisely in the non-standard transformations that they'd require additional code. Simply letting the passes apply their work to the debug stuff proved to be quite a successful approach, as I hope anyone who bothers to look at the patches will verify.

After the binding points are carried and updated throughout optimizations and IR conversions, we arrive at the var-tracking pass, where we used to turn register and memory attributes into var_location annotations. It is here that VTA does more of its magic.
Using something that vaguely resembles global value numbering, but without the benefits of SSA, we propagate the bindings and analyze loads, stores, copies and computations, so that we can determine where all copies of the value of each variable are. That way, if one location is modified, we can still use another to refer to the value in debug information.

At control flow confluences, we merge the known locations, known values, computing expressions, etc, as expected. This is where some work is still required: although we merge stuff in registers perfectly, we still don't deal with stack slots properly. Sometimes they work, but mostly by chance. It is the lack of this feature that makes VTA debug information not uniformly superior to current debug information at this point.

This feature is next on my to-do list, and it shouldn't take long, but I wanted to post the bulk of the changes before the GCC Summit, so that you get a chance to discuss it there. Unfortunately, I won't be there; by the time the budget for my attendance became available, I was already committed to participating in and organizing several other events later this month.

Anyhow, since VTA is still missing at least one essential feature, it shouldn't be enabled by default even if it goes into the trunk. It would be nice, however, to have it in, so that people can start testing it out, verifying that it imposes essentially zero overhead when debug information (or VTA itself) is not enabled, and that, when VTA is enabled, the increases in memory use and compile time are tolerable.

Compile time overhead in the var tracking pass was pretty bad as recently as a couple of months ago, but I managed to bring it down to something that varies between negligible and not too bad, except for some hopefully pathological and fixable cases I have yet to look into. HTML_401F in libjava appears to be *the* worst-case scenario. Once that is taken care of, performance- and memory-related bug reports will be useful.

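To make the merge at control flow confluences described above a bit more concrete, here is a toy model in Python. It is only a sketch under assumptions that do not come from the VTA code: each incoming path is modeled as a dictionary from a variable name to the set of locations known to hold its value, the location strings are made up, and the merge is assumed to keep only the locations that hold the value on every incoming path. The function name is hypothetical, and the real pass also tracks values and computing expressions, which this model ignores.

```python
def merge_at_confluence(incoming):
    """Merge per-path states of the form {variable: set of locations}.

    Toy model: a location survives the merge only if it is known to hold
    the variable's value on every incoming path.
    """
    if not incoming:
        return {}
    merged = {}
    # Only variables tracked on every incoming path can survive the merge.
    common_vars = set(incoming[0]).intersection(*incoming[1:])
    for var in common_vars:
        locs = set.intersection(*(set(state[var]) for state in incoming))
        if locs:
            # At least one location is valid no matter which path was taken.
            merged[var] = locs
    return merged


# Two paths reaching the same point (location names are illustrative only).
left = {"i": {"reg:r3", "mem:sp+8"}, "n": {"reg:r4"}}
right = {"i": {"reg:r3"}, "n": {"reg:r5"}}
merged = merge_at_confluence([left, right])
print(merged)
```

In this model `i` keeps `reg:r3`, the one location both paths agree on, while `n` is dropped because the two paths keep it in different registers. Reporting no location in that case, rather than a possibly wrong one, matches the "always correct" goal stated earlier.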
Furthermore, if VTA goes in, but disabled by default, people can start testing it on platforms I can't easily test on, and letting me know about any problems introduced by VTA: hopefully none when it's disabled, possibly some when it's enabled (say, I haven't tested it on any machine with delayed branch slots yet), quite likely some when -fcompare-debug or bootstrap-debug are in use.

I know there are some recent -fcompare-debug regressions on IA64 with VTA enabled that I haven't gotten around to fixing yet, and several others in C++ and even in C with -O3 -g (mentioned above) that show up without VTA. I'd very much appreciate any other such reports, and I'm committed to addressing them as quickly as possible, with the caveat that I won't be around for most of the second half of June (one more reason to keep VTA disabled by default at first).

== Submission plan

I've talked to a number of people about how to submit the patch. There was consensus that posting it as a single huge patch wouldn't fly. OTOH, turning it into a series of dozens of small patches that would have to be tested so that they could be applied incrementally would be an inordinate amount of work.

An approach that everyone I talked to found acceptable was to first clear the VTA-independent stuff out of the way (which I started early this week, and which is now nearly complete), then break up the actual VTA changes into conceptual components, which would ease review, but which, save for exceptions, would still be applied as a unit.

I broke it up into the following patches, which I'm going to submit soon to gcc-patches:

cmdline (7K) - new command line flags to turn VTA on or off, as well as a few debugging options that helped me debug it

ssa (55K) - introduce debug bind stmts at the tree and tuples level

ssa-to-rtl (24K) - convert debug bind stmts to debug insns

rtl (48K) - introduce debug insns at the RTL level

tracking (176K) - turn debug insns into var_location notes

ssa-compare-debug (22K) - fix -fcompare-debug errors that showed up in the presence of debug bind stmts

rtl-compare-debug (53K) - fix -fcompare-debug errors that showed up in the presence of debug insns

sched (63K) - fix schedulers (except for sel-sched, which is only partially fixed, which means VTA is not ready for -O3 on IA64) to deal properly with debug insns

ports (9K) - minor adjustments to ports, mostly to schedulers, to avoid -fcompare-debug regressions

testsuite-guality (16K) - (still small) debug info quality testsuite

buildopts (4K) - new BUILD_CONFIG options that can test VTA more thoroughly

I realize the division is quite uneven, but I hope this will do. Most of the changes in the compare-debug patches are not interdependent and could be broken up into smaller patches, and even go in after the rest. The same is probably true for the last four as well, but the first 5 pretty much have to go in as a unit.

I haven't fished the ChangeLog entries from the VTA branch. The patches I'm going to post don't have ChangeLog entries at all. I suppose the purpose is clear (add VTA), but rather than just taking the incremental changes to the VTA branch, I'd write a consolidated ChangeLog entry. But if I did this, I wouldn't be able to post these patches tonight (oops, it's morning already ;-), and then you probably wouldn't get to see them before the Summit.
So, please bear with the lack of ChangeLogs and, if you feel a need to understand some particular change without asking me, all the patches along with their rationales were posted to gcc-patches before, but perhaps ChangeLog.vta might be enough to clear it up:

http://gcc.gnu.org/svn/gcc/branches/var-tracking-assignments-branch/gcc/ChangeLog.vta

For those of you attending the Summit, have a great one.

--
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer