zequanwu added inline comments.
================
Comment at: lldb/test/Shell/SymbolFile/DWARF/x86/debug_loc.s:28-29
# CHECK: Variable{{.*}}, name = "x0", {{.*}}, scope = parameter, location =
# CHECK-NEXT: [0x0000000000000000, 0x0000000000000001): DW_OP_reg5 RDI
# CHECK-NEXT: [0x0000000000000001, 0x0000000000000006): DW_OP_reg0 RAX
# CHECK: Variable{{.*}}, name = "x1", {{.*}}, scope = parameter
----------------
labath wrote:
> zequanwu wrote:
> > labath wrote:
> > > zequanwu wrote:
> > > > labath wrote:
> > > > > zequanwu wrote:
> > > > > > labath wrote:
> > > > > > > zequanwu wrote:
> > > > > > > > `image dump symfile` already prints valid ranges for variables
> > > > > > > > along with where the value is at each range.
> > > > > > > Are you sure it does?
> > > > > > >
> > > > > > > I was under the impression that there are two distinct range
> > > > > > > concepts being combined here. One is the range list member of the
> > > > > > > Variable object (as given by `GetScopeRange` -- that's the one
> > > > > > > you're printing now), and the other is the list of ranges hidden
> > > > > > > in the DWARFExpression object, which come from the
> > > > > > > debug_loc(lists) section (that's the one we've been printing so
> > > > > > > far). And that the root cause of the confusion is the very
> > > > > > > existence of these two concepts.
> > > > > > >
> > > > > > > If I got it wrong, then do let me know, cause it would make
> > > > > > > things a lot simpler if there is only one validity concept to
> > > > > > > think about.
> > > > > > Dwarf plugin is supposed to construct the `m_scope_range` member of
> > > > > > an Variable, but it doesn't. `scope_ranges` is empty at
> > > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L3468.
> > > > > >
> > > > > > `image dump symfile` dumps the dwarf location list in `m_location`
> > > > > > in `Variable`.
> > > > > > The dwarf location list has more information than `m_scope_range`
> > > > > > as it contains info about where the value is during each range.
> > > > > > (e.g. which register the variable lives in).
> > > > > >
> > > > > > So, I think we need to use similar logic to construct
> > > > > > `m_scope_range` when creating `Variable` in dwarf plugin like this
> > > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Expression/DWARFExpression.cpp#L145.
> > > > > Ok, I see where you're coming from. You're essentially saying that
> > > > > the fact that the dwarf plugin does not fill this out is a bug.
> > > > >
> > > > > I don't think that's the case. My interpretation was (and [[
> > > > > https://github.com/llvm/llvm-project/blob/main/lldb/source/Symbol/Variable.cpp#L313
> > > > > | this comment]] confirms it) that an empty range here means the
> > > > > entire enclosing block. (Also, DWARF was for a long time the only
> > > > > symbol file plugin, so what it does is kinda "correct by definition").
> > > > >
> > > > > I don't think we want to change that interpretation, as forcing a
> > > > > copy of the range in the location list would be wasteful (it would be
> > > > > different if this was an interface that one could query, and that the
> > > > > dwarf plugin could implement by consulting the location list).
> > > > > However, since the dwarf class does not actually make use of this
> > > > > functionality (it was [[ https://reviews.llvm.org/D17449 | added ]]
> > > > > to support DW_AT_start_scope, then broken at some point, and
> > > > > eventually [[ https://reviews.llvm.org/D62302 | removed ]]), we do
> > > > > have some freedom in defining the interactions of the two fields (if
> > > > > you still want to pursue this, that is).
> > > > >
> > > > > So how about this: if the user passes the extra flag, then we print
> > > > > both the range field (if it exists), and the *full* location list (in
> > > > > that order, ideally). That way the output will be either `range = [a,
> > > > > b), [c, d), location = DW_OP_reg47` or `location = [a,b) ->
> > > > > DW_OP_reg4, [c,d) -> DW_OP_reg7`. If the dwarf plugin starts using
> > > > > the range field again then the output will contain both fields, which
> > > > > will be slightly confusing, but at least not misleading (and we can
> > > > > also change the format then).
> > > > Oh, I think I misunderstood `m_scope_range`. It's the range list where
> > > > the variable is valid regardless whether its value is accessible or not
> > > > (valid range). As for `m_location` in `Variable`, it's describing the
> > > > ranges where the value is (value range). They are not the same.
> > > >
> > > > So, currently how NativePDB creates local Variable's range is not
> > > > correct. That only works when it's not optimized build such that the
> > > > valid range is the same as the value range. It still need to create
> > > > dwarf location lists to correctly represent the value range, but as
> > > > mentioned [[ https://reviews.llvm.org/D119508#3319113 | here ]], we
> > > > need to choose a generic "variable location provider" interface for
> > > > that.
> > > >
> > > > Oh, I think I misunderstood m_scope_range. It's the range list where
> > > > the variable is valid regardless whether its value is accessible or not
> > > > (valid range). As for m_location in Variable, it's describing the
> > > > ranges where the value is (value range).
> > >
> > > Yes, that was my initial assumption as well, and I think that is the only
> > > interpretation in which it makes sense to have two sources of range
> > > information for a variable. However, I've done some research since then,
> > > and I haven't found any compiler or debugger which would model the
> > > program sufficiently precisely to be able to make that distinction.
> > >
> > > There are definite limits as to how far you can go with pdb using these
> > > abstractions, but given they (the m_scope_range) exist, I think you could
> > > make use of them (as you've done now), if they are sufficient for your
> > > current use case. That said, I would definitely encourage you to create a
> > > better abstraction for providing the location information for a variable.
> > I think you meant to replace `DWARFExpression` with a more generic
> > interface which has the same functionalities as `DWARFExpression`. That
> > seems a lot work, especially on `DWARFExpression::Evaluate`.
> Well.. not exactly "replace". You know how they say there's no software
> engineering problem that can't be solved by adding a layer of indirection.
> So, I thought we could create a new `VariableLocationProvider` (for lack of a
> better name) interface, and one of the implementations of that interface
> would be backed by a DWARFExpression class. Theoretically you may not need to
> touch the DWARFExpression class at all -- just wrap it so that it conforms to
> the new interface.
>
> This is still pretty hand-wavy, so I don't know how much work would it be,
> but it does not seem like it should be _that_ hard..
Thanks, I see what you mean. I thought it would be each symbol file plugin
needs to convert their debug formats to a universal one and the actual
evaluation will happens in that one.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119963/new/
https://reviews.llvm.org/D119963
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits