[Lldb-commits] [PATCH] D119963: [LLDB] Dump valid ranges of variables

Zequan Wu via Phabricator via lldb-commits Wed, 23 Feb 2022 11:25:30 -0800

zequanwu added inline comments.


================
Comment at: lldb/test/Shell/SymbolFile/DWARF/x86/debug_loc.s:28-29
 # CHECK:     Variable{{.*}}, name = "x0", {{.*}}, scope = parameter, location =
 # CHECK-NEXT:  [0x0000000000000000, 0x0000000000000001): DW_OP_reg5 RDI
 # CHECK-NEXT:  [0x0000000000000001, 0x0000000000000006): DW_OP_reg0 RAX
 # CHECK:     Variable{{.*}}, name = "x1", {{.*}}, scope = parameter
----------------
labath wrote:
> zequanwu wrote:
> > labath wrote:
> > > zequanwu wrote:
> > > > labath wrote:
> > > > > zequanwu wrote:
> > > > > > labath wrote:
> > > > > > > zequanwu wrote:
> > > > > > > > `image dump symfile` already prints valid ranges for variables 
> > > > > > > > along with where the value is at each range.
> > > > > > > Are you sure it does?
> > > > > > > 
> > > > > > > I was under the impression that there are two distinct range 
> > > > > > > concepts being combined here. One is the range list member of the 
> > > > > > > Variable object (as given by `GetScopeRange` -- that's the one 
> > > > > > > you're printing now), and the other is the list of ranges hidden 
> > > > > > > in the DWARFExpression object, which come from the 
> > > > > > > debug_loc(lists) section (that's the one we've been printing so 
> > > > > > > far). And that the root cause of the confusion is the very 
> > > > > > > existence of these two concepts.
> > > > > > > 
> > > > > > > If I got it wrong, then do let me know, cause it would make 
> > > > > > > things a lot simpler if there is only one validity concept to 
> > > > > > > think about.
> > > > > > Dwarf plugin is supposed to construct the `m_scope_range` member of 
> > > > > > an Variable, but it doesn't. `scope_ranges` is empty at 
> > > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L3468.
> > > > > >  
> > > > > > `image dump symfile` dumps the dwarf location list in `m_location` 
> > > > > > in `Variable`. 
> > > > > > The dwarf location list has more information than `m_scope_range` 
> > > > > > as it contains info about where the value is during each range. 
> > > > > > (e.g. which register the variable lives in). 
> > > > > > 
> > > > > > So, I think we need to use similar logic to construct 
> > > > > > `m_scope_range` when creating `Variable` in dwarf plugin like this 
> > > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Expression/DWARFExpression.cpp#L145.
> > > > > Ok, I see where you're coming from. You're essentially saying that 
> > > > > the fact that the dwarf plugin does not fill this out is a bug.
> > > > > 
> > > > > I don't think that's the case. My interpretation was (and [[ 
> > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Symbol/Variable.cpp#L313
> > > > >  | this comment]] confirms it) that an empty range here means the 
> > > > > entire enclosing block. (Also, DWARF was for a long time the only 
> > > > > symbol file plugin, so what it does is kinda "correct by definition").
> > > > > 
> > > > > I don't think we want to change that interpretation, as forcing a 
> > > > > copy of the range in the location list would be wasteful (it would be 
> > > > > different if this was an interface that one could query, and that the 
> > > > > dwarf plugin could implement by consulting the location list). 
> > > > > However, since the dwarf class does not actually make use of this 
> > > > > functionality (it was [[ https://reviews.llvm.org/D17449 | added ]] 
> > > > > to support DW_AT_start_scope, then broken at some point, and 
> > > > > eventually [[ https://reviews.llvm.org/D62302 | removed ]]), we do 
> > > > > have some freedom in defining the interactions of the two fields (if 
> > > > > you still want to pursue this, that is).
> > > > > 
> > > > > So how about this: if the user passes the extra flag, then we print 
> > > > > both the range field (if it exists), and the *full* location list (in 
> > > > > that order, ideally). That way the output will be either `range = [a, 
> > > > > b), [c, d), location = DW_OP_reg47` or `location = [a,b) -> 
> > > > > DW_OP_reg4, [c,d) -> DW_OP_reg7`. If the dwarf plugin starts using 
> > > > > the range field again then the output will contain both fields, which 
> > > > > will be slightly confusing, but at least not misleading (and we can 
> > > > > also change the format then).
> > > > Oh, I think I misunderstood `m_scope_range`. It's the range list where 
> > > > the variable is valid regardless whether its value is accessible or not 
> > > > (valid range). As for `m_location` in `Variable`, it's describing the 
> > > > ranges where the value is (value range). They are not the same. 
> > > > 
> > > > So, currently how NativePDB creates local Variable's range is not 
> > > > correct. That only works when it's not optimized build such that the 
> > > > valid range is the same as the value range. It still need to create 
> > > > dwarf location lists to correctly represent the value range, but as 
> > > > mentioned [[ https://reviews.llvm.org/D119508#3319113 | here ]], we 
> > > > need to choose a generic "variable location provider" interface for 
> > > > that.
> > > > 
> > > > Oh, I think I misunderstood m_scope_range. It's the range list where 
> > > > the variable is valid regardless whether its value is accessible or not 
> > > > (valid range). As for m_location in Variable, it's describing the 
> > > > ranges where the value is (value range).
> > > 
> > > Yes, that was my initial assumption as well, and I think that is the only 
> > > interpretation in which it makes sense to have two sources of range 
> > > information for a variable. However, I've done some research since then, 
> > > and I haven't found any compiler or debugger which would model the 
> > > program sufficiently precisely to be able to make that distinction.
> > > 
> > > There are definite limits as to how far you can go with pdb using these 
> > > abstractions, but given they (the m_scope_range) exist, I think you could 
> > > make use of them (as you've done now), if they are sufficient for your 
> > > current use case. That said, I would definitely encourage you to create a 
> > > better abstraction for providing the location information for a variable.
> > I think you meant to replace `DWARFExpression` with a more generic 
> > interface which has the same functionalities as `DWARFExpression`. That 
> > seems a lot work, especially on `DWARFExpression::Evaluate`. 
> Well.. not exactly "replace". You know how they say there's no software 
> engineering problem that can't be solved by adding a layer of indirection. 
> So, I thought we could create a new `VariableLocationProvider` (for lack of a 
> better name) interface, and one of the implementations of that interface 
> would be backed by a DWARFExpression class. Theoretically you may not need to 
> touch the DWARFExpression class at all -- just wrap it so that it conforms to 
> the new interface.
> 
> This is still pretty hand-wavy, so I don't know how much work would it be, 
> but it does not seem like it should be _that_ hard..
Thanks, I see what you mean. I thought it would be each symbol file plugin 
needs to convert their debug formats to a universal one and the actual 
evaluation will happens in that one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119963/new/

https://reviews.llvm.org/D119963

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D119963: [LLDB] Dump valid ranges of variables

Reply via email to