On Dec 18, 2007, "Daniel Berlin" <[EMAIL PROTECTED]> wrote:
> Consider PRE alone,
> If your debug statement strategy is "move debug statements when we
> insert code that is equivalent"

Move? Debug statements don't move, in general. I'm not sure what you have in mind, but I sense some disconnect here.

> because our equivalence is based on value equivalence, not location
> equivalence. We only guarantee it has the same value as the
> whatever it is a copy of at that point, not that it has the same
> location.

This sounds perfect to me. I'm concerned about values. Locations are an implementation detail.

The thing to keep in mind is that what was originally a single user variable may end up mangloptimized into multiple stack slots, registers, with multiple simultaneously-live versions. Trying to pretend that any of these represent the user variable sounds like a recipe for madness to me. So I focus on values instead, and then on trying to recover locations based on binding and sharing of values.

> How do i say debug info for some variable is now dead, we have no idea
> what it is right now?

For annotations, look for VAR_DEBUG_VALUE_NOVALUE in tree.h and VAR_LOC_UNKNOWN_P in rtl.h, in the VTA branch. For DWARF location lists, you just refrain from emitting locations for a given range.

> How do I figure out which debug statements need to be modified when
> you introduce new memory operations?

None. By definition, debug annotations are only about variables that are not addressable. Those that are addressable are fixed at a single location, so there's no reason to track them in a fancy way.

> If i insert a new call
> DEBUG(x, x_3): 1
> x_3 = x
> foo() // May modify x and *&x)
> y = x_3
> Now you have two problems.

You're talking about a real problem, but your example is misguided. Let me give you a real problem scenario.

  (set (reg <T>) (<whatever>))
  (var_location x (reg <T>))
  (set (mem <addr>) (reg <T>))
  (set (reg <T>) (<somethingelse>))
  (call (mem (symbol_ref foo)))

So, at the var_location debug_insn, we know that x is in reg <T>. That's stored at *addr, so now we might be able to use it as an additional location for x. And then, when reg <T> is modified, we remove T from the equivalence class, and then the only location holding the value of x is *addr. Then comes a function call, which might modify *addr. So, do we decide that x is no longer available after the call, or do we hope *addr still represents it?

The thing to remember is that the annotations are only about gimple regs. This means calls don't modify them, ever. But we still have to decide whether *addr represents x or not.

My thoughts are leaning towards looking at the memory address or other memory attributes to tell whether it's an addressable stack slot or not. If it's addressable, remove it from the equivalence class at the call, so the equivalence class becomes empty, and the variable is regarded as dead. If it's not addressable (a pseudo assigned to memory), then we can keep it, even if x is actually dead past the call.

What we'll see is that, if x is not dead after the call, the compiler will arrange to preserve its value in one such local non-addressable stack slot, and it will probably extend the equivalence class again after the call, as the pseudo is restored. Or the pseudo will be temporarily assigned to a call-saved register, which, for being call-saved, won't be removed from equivalence classes at call instructions.

Whereas, if x is dead and its value was just copied to some random memory location, then we may as well flag it as dead at the call site, where the memory location may be modified.

So, it all works out nicely, because we know we're only dealing with gimple regs.

Volatile asms make this slightly trickier, because they're totally unpredictable. I'm thinking it's safe to simply remove addressable memory locations from equivalence classes at them, just for safety, but I don't have it completely figured out.

> #3 is a dataflow problem, and not something you want to do every time
> you insert a call.
I'm not sure what you mean by "inserting calls". We don't do that. Calls are present in the source code (even when implied by stuff like TLS, OpenMP or builtins such as memcpy), and they're either kept around, eliminated or inlined.

(digression intended to be funny: this "inserting a call" discussion reminds me of those impossible initial conditions in electromagnetism textbook exercises, such as uniform magnetic fields in which charged particles suddenly appear ;-)

> If your answer is #1 or #2, then what you are really doing is
> computing roughly the same dataflow problem var-location does, except
> on trees and with a different meet-operation.

I am actually computing the same dataflow problem as var-tracking. That's the whole point. But I'm giving it more information, to enable it to track more variables. And it needs to deal with multiple concurrent locations for the same variable, and multiple variables in the same locations, which are "slight" complications. But you're right, in the end it's the same problem.

But I'm not computing that in trees. I'm just collecting and maintaining data points for var-tracking, all the way from the tree level.

> var-location generates incorrect info not because it represents
> something fundamentally different than you are (it doesn't), it falls
> down because it uses union as the meet operation.
> It says "oh, i don't know which of these locations is right, it must
> be both of them".

However, it can't deal with parallel locations, so this is at odds with your statement.

I haven't got 'round to studying the exact dataflow algorithm var-tracking uses, I just figured I needed to do something along these lines. Maybe it does need tweaking, if I end up using it. I'm not sure yet it's going to make sense to use it for the more detailed tracking of copying that I'm going to have to do.
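To make the "meet operation" terminology concrete, here is a minimal sketch of the two ways of merging location information at a control-flow join. It is only an illustration, not code from var-tracking or the VTA branch: the `State` type, the function names and the location strings are all made up for this example.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical model: a dataflow state maps each variable name to the set
# of locations believed to hold its value at a given program point.
State = Dict[str, FrozenSet[str]]


def meet_union(preds: Iterable[State]) -> State:
    """Union meet: keep a location if ANY predecessor reports it."""
    out: Dict[str, FrozenSet[str]] = {}
    for state in preds:
        for var, locs in state.items():
            out[var] = out.get(var, frozenset()) | locs
    return out


def meet_intersection(preds: Iterable[State]) -> State:
    """Intersection meet: keep a location only if ALL predecessors agree."""
    states = list(preds)
    if not states:
        return {}
    out: Dict[str, FrozenSet[str]] = {}
    for var in set().union(*(s.keys() for s in states)):
        # A predecessor that knows nothing about var contributes the
        # empty set, so the result for var becomes empty (unknown).
        per_pred = [s.get(var, frozenset()) for s in states]
        out[var] = frozenset.intersection(*per_pred)
    return out


# Two predecessors of a join point disagree about where x lives.
left: State = {"x": frozenset({"reg T", "mem addr"})}
right: State = {"x": frozenset({"mem addr"})}

merged_union = meet_union([left, right])
merged_intersection = meet_intersection([left, right])
```

With union, x is reported in both `reg T` and `mem addr` after the join, even though one path never had it in `reg T`. With intersection, only `mem addr` survives, and a variable with no common location ends up with an empty set, i.e. no location is emitted for that range.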
> If you changed the meet operation to "oh, i don't know which of these
> locations is right, it must be none of them", and did a little more
> work you would inference the same info as yours *at the tree level*

Intersection sounds like the right approach to me. I assumed var-tracking did this, except for unknowns. It's a bit trickier than this because var-tracking has to deal with a lot of incomplete information. But at least for VTA values, we are going to have a complete picture, so we can be stricter when it comes to gimple reg variables.

Now, whether the fact that we could infer the very same values at the tree level is relevant, I don't know. The tree level is neither source level nor the final executable code, so unless we can establish useful mappings from the tree level to both source level and final executable code, this information is of little use, no matter how true it is.

> Nothing you have proposed is fundamentally going to give you better info.

Except for what tree transformations currently discard, such as the points of the program in which variables are bound to values. This is indeed one of the elements that the annotations are trying to preserve, that the compiler has not cared about preserving. (The other being expressions that end up not computed at run time, but that could still be computed by a debugger based on state available elsewhere.)

> All you have done is annotated the IR in some places to make explicit
> some bits in the dataflow problem that you could inference anyway.

Now, this is not true. I could infer values, yes, but I couldn't infer the variables they relate to, nor the point of binding. And debug information is not just about the values, it's about mapping variables to values and locations. So, we can't infer all the information we need.
> There is absolutely no reason what you are trying to do needs to
> modify the tree IR at all to achieve exactly the same accuracy of
> debug info as your design proposes at the tree level.

So far these claims have been unconvincing. I still get the feeling that you're missing some aspects of the problem, but I invite you to show me how the information available in the current IR could be used to generate accurate debug information for the two examples in the design document. Even if we leave the RTL aspect of it aside for a moment.

I certainly wouldn't mind having to generate annotations only when we move from Trees to RTL, but I can't imagine how we'd reintroduce bindings at points that are not marked in the tree level, for variables that are (partially or entirely) gone from the tree IR.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}