On 12/18/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
> Then, we let tree optimizers do their jobs.  Whenever they rename,
> renumber, coalesce, combine or otherwise optimize a variable, they
> will automatically update debug statements that mention them as well.

Speaking only about the tree level; in this entire email I make no representations about the RTL level ;)
This is much harder than you give it credit for, unless you plan on throwing out all the info at elimination points. Consider PRE alone, which creates new statements that are combinations of old ones and eliminates tons of variables in favor of them. If your debug statement strategy is "move debug statements when we insert code that is equivalent", it won't work, because our equivalence is based on value equivalence, not location equivalence. We only guarantee that the new statement has the same value as whatever it is a copy of at that point, not that it has the same location. So you will lose info every time PRE makes an insertion, unless you make serious modifications to PRE. This is not to mention the data you lose if you just throw it away at elimination points.

Let's take another problem. How do I say "debug info for some variable is now dead, we have no idea what it is right now"? How do I figure out which debug statements need to be modified when you introduce new memory operations? When you pass something by address, you get vops. The vops are not variables, and have no relation to the original variable (they can be partitions containing more variables). If I have

DEBUG(x, x_3)
x_3 = x;   // Read from global
y = x_3;
....

and I insert a new call:

DEBUG(x, x_3)
x_3 = x
foo()      // May modify x and *&x
y = x_3

now you have two problems: it is no longer true at the point of y = x_3 that DEBUG(x, x_3) holds, and in fact x_3 may no longer have any relation to x at all. You have three choices:

1. Destroy the DEBUG(x, x_3), losing valuable and correct info.
2. Add a new DEBUG(x, unknown).
3. Figure out which debug statements are reached by your call.

#3 is a dataflow problem, and not something you want to do every time you insert a call. If your answer is #1 or #2, then what you are really doing is computing roughly the same dataflow problem var-location does, except on trees and with a different meet operation. var-location generates incorrect info not because it represents something fundamentally different from what you do (it doesn't); it falls down because it uses union as the meet operation. It says "oh, I don't know which of these locations is right, it must be both of them". If you changed the meet operation to "oh, I don't know which of these locations is right, it must be none of them", and did a little more work, you would infer the same info as yours *at the tree level*.

Nothing you have proposed is fundamentally going to give you better info. All you have done is annotated the IR in some places to make explicit some bits of the dataflow problem that you could infer anyway. It is provable that you can infer them with a simple lattice and an associated meet operation, *unless you are going to start guessing* (which you have said you don't want to do because it can generate incorrect info). There is absolutely no reason what you are trying to do needs to modify the tree IR at all to achieve exactly the same accuracy of debug info as your design proposes at the tree level. You could simply compute the global dataflow problem. The RTL level is harder, of course.
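
To make the PRE point above concrete, here is a small C example of my own (nothing from the proposal; the temporary name "t" mentioned in the comments is hypothetical):

/* Why a PRE insertion is only *value*-equivalent to the expression it
   makes redundant.  Plain C, with comments sketching what tree-PRE
   would typically do.  */
int
example (int a, int b, int p)
{
  int x, y;

  if (p)
    x = a + b;   /* a + b is computed on this path only, so ...        */
  else
    x = 0;       /* ... PRE inserts something like t = a + b here to   */
                 /* make the later occurrence fully redundant.         */

  y = a + b;     /* Rewritten to use the inserted value.  That value   */
                 /* equals a + b wherever it reaches, but it is not a  */
                 /* location of the user variable x (x is 0 on the     */
                 /* else path), so a DEBUG binding for x cannot simply */
                 /* be moved onto the insertion.                       */
  return x + y;
}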
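
And to make the meet-operation point concrete, here is a minimal, self-contained sketch of the kind of lattice I mean. This is toy code under my own assumptions, not anything from var-location or from the proposal:

/* A tiny per-variable location lattice: TOP = not yet seen, a single
   known location (an SSA name), BOTTOM = unknown.  */
#include <stdio.h>
#include <string.h>

enum loc_kind { LOC_TOP, LOC_KNOWN, LOC_BOTTOM };

struct loc
{
  enum loc_kind kind;
  const char *name;   /* e.g. an SSA name such as "x_3" */
};

/* The meet argued for above: if predecessors disagree about where a
   user variable lives, the answer is "none of them", not the union
   of the candidates.  */
static struct loc
meet (struct loc a, struct loc b)
{
  if (a.kind == LOC_TOP)
    return b;
  if (b.kind == LOC_TOP)
    return a;
  if (a.kind == LOC_KNOWN && b.kind == LOC_KNOWN
      && strcmp (a.name, b.name) == 0)
    return a;
  return (struct loc) { LOC_BOTTOM, NULL };
}

int
main (void)
{
  struct loc p1 = { LOC_KNOWN, "x_3" };
  struct loc p2 = { LOC_KNOWN, "x_5" };
  struct loc m = meet (p1, p2);

  printf ("meet = %s\n", m.kind == LOC_BOTTOM ? "<unknown>" : m.name);
  /* A union-style meet would instead keep both x_3 and x_5 here,
     which is exactly where the incorrect locations come from.  */
  return 0;
}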