[Lldb-commits] [PATCH] D78801: [LLDB] Add class ProcessWasm for WebAssembly debugging

Paolo Severini via Phabricator via lldb-commits Fri, 01 May 2020 00:09:13 -0700

paolosev added a comment.

> The thing I am resisting is to put all of this stuff in the the 
> ProcessGDBRemote class. Instead of trying to generalize it so that it can 
> handle everything (and generalize to the wrong thing), I very much like the 
> idea of introducing a `WasmProcess` plugin class that handles all wasm stuff. 
> If that class happens to use parts of gdb-remote, then so be it, but it means 
> that it's not a problem for that class to use a dialect of the gdb-remote 
> protocol, which includes as many wasm-specific packets as it needs. Then this 
> class can create it's own implementation of the "Unwind" interface, which 
> will use some WasmProcess-specific apis to undwind, but that will also be ok, 
> since both classes will be wasm-specific.


I think that this would solve a lot of problems. `WasmProcess` could inherit 
from GDBRemoteProcess and could send itself Wasm-specific commands like 
qWasmMem just by calling `GetGDBRemote().SendPacketAndWaitForResponse()` Then 
there would be no changes to ProcessGDBRemote at all.
For walking the call stack I would then keep the current design with the 
Unwind-derived class that sends a command to get the call stack from the stub. 
It’s true that normally it is the client that calculates call stacks, but it 
does so even because it cannot really ask them to the stub, but here we have a 
runtime that can provide this information.

> If dwarf had a notion of address spaces then there'd probably be no need for 
> DW_OP_WASM_location. If the dwarf committee hasn't been able to come up with 
> a unified way to represent address spaces, then I'm not sure we will do 
> better. And we'll still be left with the job of translating dwarf (and other) 
> entries into this other concept.

Yes, it is possible to come up with some representation of a unified address 
space for Wasm locals, globals and stack items (and also code and memory). But 
it's also true that locals, globals and stack items don’t really have a memory 
location. The reason for the introduction of `DW_OP_WASM_location` is indeed 
because there was nothing in DWARF that could well represent these entities. 
While there are certainly other architectures with multiple memory spaces, the 
execution model of WebAssembly is quite peculiar in this aspect, I think.

> The question is whether something similar can be done for the other two 
> cases. I believe it might be possible for the DWARFExpression. We just need 
> to have a way to separate the creation of the dwarf expression data from the 
> process of evaluating it. Right now, both of these things happens in 
> SymbolFileDWARF -- it creates a DWARFExpression object which both holds the 
> data and knows how to evaluate it. But I don't believe SymbolFileDWARF needs 
> the second part. If we could make it so that something else is responsible 
> for creating the evaluator for dwarf expressions, that something could create 
> a WasmDWARFExpression which would know about DW_OP_WASM_location and 
> WasmProcess, and could evaluate it.

Separating the DWARFExpression data and the logic to evaluate it would 
certainly make sense. I think that this must be a general problem: since DWARF 
provides the way to define custom expression location, different architectures 
might define specific DWARF codes that they need to handle, so it would be 
great if we had a `DWARFEvaluator`  that was also pluggable. I don’t know how 
complex would be to refactor DWARFExpression in this way but I can investigate.

> The `GetValueAsData` problem is trickier, particularly as I'm not even sure 
> the current implementation is correct. Are you sure that you really need the 
> _current_ module there? What happens if I use "target variable" to display a 
> variable from a different module? What if I then dereference that variable? 
> If the dereference should happen in the context of the other module, then I 
> guess the "module" should be a property of the value, not of the current 
> execution context. And it sounds like some address space-y thing would help. 
> But that might require teaching a lot of places in lldb about address spaces, 
> in particular that a dereferencing a value in one address space, should 
> produce a value in the same address space (at least for "near" pointer or 
> something).
>  If we should really use the current frame as the context, then I guess we'd 
> need some sort of a interfaces to ask a stack frame to get the value of a 
> "Value".

The fact that the code address space is separated from the memory address space 
is really what makes things complicated. However, we know for sure that every 
time that all memory reads made while evaluating DWARF expressions or variables 
always target the module memory space, never the code space.

I must confess that had not really tested `target variable` so far; I did it 
today and I found that it already //almost// works. What happens is that the 
location of the global variable is calculated with DWARFExpression::Evaluate 
(in my tests I only see `DW_OP_addr`, but maybe there could be other ways?) and 
there it calls Value::ConvertToLoadAddress() passing the correct module, and 
this produces an address in the form `module_id|offset`, which is then used in 
Value::GetValueAsData(), which currently sends qWasmMem requests.

We don’t really need the frame_index in Value:: GetValueAsData; the stub only 
uses frame_index to calculate the corresponding module_id, and we already pass 
the current Module* to this function.
So it seems possible to make this work for WebAssembly, but the tricky part is 
how to make Wasm-specific reads in a class like Value that should not have any 
knowledge about Wasm.

>> Actually, that brings up an idea: a process runtime plug-in for languages 
>> that have a runtime that can be communicated with. Java has a runtime, Wasm 
>> has a runtime, most Javascript engines have runtimes. Maybe this wouldn't be 
>> too hard to abstract? Then we can add the GetGlobal(index), GetLocal(index), 
>> GetStack(index) to this plug-in?
> 
> I think that is a very interesting idea to explore. We already have language 
> runtime plugins, and they do have the ability to refine the type of a 
> variable/value. Maybe they could also somehow help with computing the value 
> of a variable? Then instead of a GetGlobal/Local(index), we could have 
> `GetValue(Variable)`, and they would be the ones responsible for making sense 
> of the dwarf expressions and address spaces?

Summarizing, if we assume that we can create:

- `WasmProcess`, derived from GDBRemoteProcess,
- `WasmUnwind`, derived from Unwind, and
- `WasmDWARFEvaluator`, derived from a new class DWARFEvaluator,

then we would not have to touch any existing GDBRemote- and DWARFExpression 
code.
At this point the way we query locals/globals/stack and the call stack from the 
Wasm engine would be just an implementation detail of these Wasm plugin classes 
(assuming that we can define new gdb-remote custom commands like `qWasmLocal`, 
`qWasmGlobal`, `qWasmStackValue`, `qWasmCallStack`. Then we would not really 
need to abstract this functionality with a generic interface, in my opinion.

What is left to decide would be “just” how to handle memory. In particular how 
to make Wasm-specific memory requests, targeting a particular Wasm module, from 
class `Value`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78801/new/

https://reviews.llvm.org/D78801



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D78801: [LLDB] Add class ProcessWasm for WebAssembly debugging

Reply via email to