clayborg added a comment. In D78801#2009917 <https://reviews.llvm.org/D78801#2009917>, @labath wrote:
> In D78801#2009048 <https://reviews.llvm.org/D78801#2009048>, @clayborg wrote: > > > In D78801#2007795 <https://reviews.llvm.org/D78801#2007795>, @labath wrote: > > > > > While that idea has occurred to me too, I am not convinced it is a good > > > one: > > > > > > - it replaces one odd dependency with another one. Why should a Process > > > need to know how to evaluate a DWARF expression? Or even that DWARF > > > exists for that matter? This seems totally unrelated to what other > > > Process functions are doing currently... > > > > > > But it is what people do in reality. DW_OP_low_user and DW_OP_high_user are > > ranges that are made available to people to customize their DWARF opcodes. > > If you don't handle it, you are hosed and can't show a variable location. > > And to make things worse, two different compilers could both use the same > > value in that range. So they made DWARF expressions customizable with no > > real attempt to make them function for different architectures. that is > > unless you standardize it and make a real opcode that gets accepted into > > DWARF. The kind of DWARF location opcode that is being used here could > > easily be generalized into a DW_OP_get_stack_variable with a bunch of args, > > but at some point you have to talk to someone that is in communication with > > the runtime of the thing you are debugging to get the answer. So I do > > believe asking the process for this is not out of scope. > > > I think the "at some point" part is very important here. I get how dwarf > expressions are meant to be extended, and that doing that is tricky, but I > don't think that automatically means that we should delegate that to a > different process. There are various ways that could be implemented and the > delegation could be performed at different levels. For example the > DWARFExpression class could be made into a class hierarchy and we could have > a subclass of it for each architecture with funny operations. Then the > subclass would have enough knowledge about wasm to properly parse the > expression and evaluate it (possibly by making additional queries to someone > else) -- this is sort of what the current patch does, without the "subclass" > part. > > The problem I have with a function like `EvaluateCustomDWARFExpressionOpcode` > is that it is completely unlike anything else that our process class needs to > deal with. The Process deals with threads (and how to control them), memory > (read/write) and, to a limited degree, with modules. It knows nothing about > "stack frames" or "variables" or "dwarf expressions" -- these are concepts > built on top of that. This becomes even more true if we start to talk about > the gdb-remote protocol instead of the lldb Process abstraction. > > > > > > >> - I am not sure it even completely removes wasm knowledge from e.g. > >> DWARFExpression -- the class would presumably still need to know how to > >> parse this opcode. > > > > It is true and this is another hole in the "people can extend DWARF easily" > > scenario. We need to know what opcode arguments are and that would need to > > be hard coded for now. But it wouldn't have to rely on anything except > > virtual function on the generic lldb_private::Process/Thread APIs. In this > > case as soon as we get an unknown opcode we would need to pass the > > DataExtractor and the offset into it so the process could extract the > > arguments. > > Not only that, we might need to pass in the entire DWARF stack, in case the > opcode depends on some of the stack arguments. Yes > > >> Not clean, but better than making DWARFExpression depend on process plug-ins >> IMHO. > > The dependence on a process plugin could be dealt with by making > `GetWasmGlobal/Local/etc` a virtual function on the Process class. Also not > clean, but it's not clear to me whether it's cleaner than having > `EvaluateCustomDWARFExpressionOpcode` virtual function. I would really like to see a solution that does not include "Wasm" in any process or thread virtual functions. I am fine with something like: enum IndexedVariableType { Global, Local, Stack }; Process::GetIndexedVariable(IndexedVariableType type, size_t index, ...) I would rather see a DWARF function added to process over a WASM specific one. > Anyway, just the fact that we can't come up with a "clean" solution doesn't > mean that we should accept an "unclean" one. This wouldn't be the first > feature that ends up sitting in a fork somewhere because it does not > integrate cleanly with llvm (probably the largest example of that is > swift-lldb). I think there is a clean way to do memory reading and writing with segment IDs and also to access variables from the process runtime. Actually, that brings up an idea: a process runtime plug-in for languages that have a runtime that can be communicated with. Java has a runtime, Wasm has a runtime, most Javascript engines have runtimes. Maybe this wouldn't be too hard to abstract? Then we can add the GetGlobal(index), GetLocal(index), GetStack(index) to this plug-in? > And I believe the current problems are just the tip of the iceberg. I can't > imagine what hoops we'll need to jump through once we start evaluating > expressions... > >> >> >>> - the interface could get very complicated if we wanted to implement typed >>> stacks present in DWARF5 -- presumably the API would need to return the >>> type of the result, in addition to its value. >> >> DWARF5 just further clarifies what each value on the opcode stack is (file >> address, load address, the value itself, etc). Right now DWARF expression >> just infer what a value is based on the opcodes. So I don't see a huge >> problem here as anything we do will need to work with DWARF5. > > That doesn't sounds right. DWARF5 introduces opcodes like `DW_OP_deref_type` > which takes a _die offset_ as an argument so that you can specify what is the > type of the dereference result value you are accessing. Fortunately that > offset must refer to a DW_TAG_base_type, which means the most interesting > aspects are byte size and signedness (and byte size is sort of implied by the > result), but that still leaves the door open to more languages with more > complicated "base" types. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78801/new/ https://reviews.llvm.org/D78801 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits