[Lldb-commits] [PATCH] D78801: [LLDB] Add class ProcessWasm for WebAssembly debugging

Greg Clayton via Phabricator via lldb-commits Wed, 29 Apr 2020 16:46:16 -0700

clayborg added a comment.

In D78801#2009917 <https://reviews.llvm.org/D78801#2009917>, @labath wrote:


> In D78801#2009048 <https://reviews.llvm.org/D78801#2009048>, @clayborg wrote:
>
> > In D78801#2007795 <https://reviews.llvm.org/D78801#2007795>, @labath wrote:
> >
> > > While that idea has occurred to me too, I am not convinced it is a good 
> > > one:
> > >
> > > - it replaces one odd dependency with another one. Why should a Process 
> > > need to know how to evaluate a DWARF expression? Or even that DWARF 
> > > exists for that matter? This seems totally unrelated to what other 
> > > Process functions are doing currently...
> >
> >
> > But it is what people do in reality. DW_OP_low_user and DW_OP_high_user are 
> > ranges that are made available to people to customize their DWARF opcodes. 
> > If you don't handle it, you are hosed and can't show a variable location. 
> > And to make things worse, two different compilers could both use the same 
> > value in that range. So they made DWARF expressions customizable with no 
> > real attempt to make them function for different architectures. that is 
> > unless you standardize it and make a real opcode that gets accepted into 
> > DWARF. The kind of DWARF location opcode that is being used here could 
> > easily be generalized into a DW_OP_get_stack_variable with a bunch of args, 
> > but at some point you have to talk to someone that is in communication with 
> > the runtime of the thing you are debugging to get the answer. So I do 
> > believe asking the process for this is not out of scope.
>
>
> I think the "at some point" part is very important here. I get how dwarf 
> expressions are meant to be extended, and that doing that is tricky, but I 
> don't think that automatically means that we should delegate that to a 
> different process. There are various ways that could be implemented and the 
> delegation could be performed at different levels. For example the 
> DWARFExpression class could be made into a class hierarchy and we could have 
> a subclass of it for each architecture with funny operations. Then the 
> subclass would have enough knowledge about wasm to properly parse the 
> expression and evaluate it (possibly by making additional queries to someone 
> else) -- this is sort of what the current patch does, without the "subclass" 
> part.
>
> The problem I have with a function like `EvaluateCustomDWARFExpressionOpcode` 
> is that it is completely unlike anything else that our process class needs to 
> deal with. The Process deals with threads (and how to control them), memory 
> (read/write) and, to a limited degree, with modules. It knows nothing about 
> "stack frames" or "variables" or "dwarf expressions" -- these are concepts 
> built on top of that. This becomes even more true if we start to talk about 
> the gdb-remote protocol instead of the lldb Process abstraction.
>
> > 
> > 
> >> - I am not sure it even completely removes wasm knowledge from e.g. 
> >> DWARFExpression -- the class would presumably still need to know how to 
> >> parse this opcode.
> > 
> > It is true and this is another hole in the "people can extend DWARF easily" 
> > scenario. We need to know what opcode arguments are and that would need to 
> > be hard coded for now. But it wouldn't have to rely on anything except 
> > virtual function on the generic lldb_private::Process/Thread APIs. In this 
> > case as soon as we get an unknown opcode we would need to pass the 
> > DataExtractor and the offset into it so the process could extract the 
> > arguments.
>
> Not only that, we might need to pass in the entire DWARF stack, in case the 
> opcode depends on some of the stack arguments.


Yes

> 
> 
>> Not clean, but better than making DWARFExpression depend on process plug-ins 
>> IMHO.
> 
> The dependence on a process plugin could be dealt with by making 
> `GetWasmGlobal/Local/etc` a virtual function on the Process class. Also not 
> clean, but it's not clear to me whether it's cleaner than having 
> `EvaluateCustomDWARFExpressionOpcode` virtual function.

I would really like to see a solution that does not include "Wasm" in any 
process or thread virtual functions. I am fine with something like:

  enum IndexedVariableType {
   Global,
   Local,
   Stack
  };
  
  Process::GetIndexedVariable(IndexedVariableType type, size_t index, ...)

I would rather see a DWARF function added to process over a WASM specific one.

> Anyway, just the fact that we can't come up with a "clean" solution doesn't 
> mean that we should accept an "unclean" one. This wouldn't be the first 
> feature that ends up sitting in a fork somewhere because it does not 
> integrate cleanly with llvm (probably the largest example of that is 
> swift-lldb).

I think there is a clean way to do memory reading and writing with segment IDs 
and also to access variables from the process runtime.

Actually, that brings up an idea: a process runtime plug-in for languages that 
have a runtime that can be communicated with. Java has a runtime, Wasm has a 
runtime, most Javascript engines have runtimes. Maybe this wouldn't be too hard 
to abstract? Then we can add the GetGlobal(index), GetLocal(index), 
GetStack(index) to this plug-in?

> And I believe the current problems are just the tip of the iceberg. I can't 
> imagine what hoops we'll need to jump through once we start evaluating 
> expressions...
> 
>> 
>> 
>>> - the interface could get very complicated if we wanted to implement typed 
>>> stacks present in DWARF5 -- presumably the API would need to return the 
>>> type of the result, in addition to its value.
>> 
>> DWARF5 just further clarifies what each value on the opcode stack is (file 
>> address, load address, the value itself, etc). Right now DWARF expression 
>> just infer what a value is based on the opcodes. So I don't see a huge 
>> problem here as anything we do will need to work with DWARF5.
> 
> That doesn't sounds right. DWARF5 introduces opcodes like `DW_OP_deref_type` 
> which takes a _die offset_ as an argument so that you can specify what is the 
> type of the dereference result value you are accessing. Fortunately that 
> offset must refer to a DW_TAG_base_type, which means the most interesting 
> aspects are byte size and signedness (and byte size is sort of implied by the 
> result), but that still leaves the door open to more  languages with more 
> complicated "base" types.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78801/new/

https://reviews.llvm.org/D78801



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D78801: [LLDB] Add class ProcessWasm for WebAssembly debugging

Reply via email to