[Lldb-commits] [PATCH] D72751: [LLDB] Add DynamicLoaderWasmDYLD plugin for WebAssembly debugging

Greg Clayton via Phabricator via lldb-commits Mon, 27 Jan 2020 11:26:18 -0800

clayborg added a comment.

In D72751#1835656 <https://reviews.llvm.org/D72751#1835656>, @labath wrote:


> [ looping in @aadsm for the svr4 stuff ]
>
> Thanks for adding the test, and for the detailed writeup. Please find my 
> comments inline.
>
> In D72751#1835502 <https://reviews.llvm.org/D72751#1835502>, @paolosev wrote:
>
> > The first is that we need to call `Target::SetSectionLoadAddress()` twice, 
> > from two different places. First we need to call 
> > `Target::SetSectionLoadAddress()` in `ObjectFileWasm::SetLoadAddress()`, 
> > and then again in `DynamicLoaderWasmDYLD::DidAttach()`. The reason for this 
> > seems to originate in the sequence of function calls:
> >
> > In `DynamicLoaderWasmDYLD::DidAttach()` we call 
> > `ProcessGDBRemote::LoadModules()` to get list of loaded modules from the 
> > remote (Wasm engine).
> >  `ProcessGDBRemote::LoadModules()` calls, first:
> >
> > - `DynamicLoaderWasmDYLD::LoadModuleAtAddress()` and from there: ...
> >
> >   then:
> > - `Target::SetExecutableModule() -> Target::ClearModules() -> 
> > SectionLoadList::Clear()`
> >
> >   So, at the end of `LoadModules()` in `DynamicLoaderWasmDYLD::DidAttach()` 
> > the SectionLoadList is empty, and we need to set it again by calling 
> > `Target::.SetSectionLoadAddress()` again. This works but the duplication is 
> > ugly; is there a way to improve this?
>
>
> I hope so. :) This seems like a bug in `ProcessGDBRemote::LoadModules`. It 
> seems wrong/wasteful/etc to do all this work to compute the section load 
> addresses only to have them be thrown away by `SetExecutableModule`. Maybe 
> all it would take is to reverse the order of these two actions, so that the 
> load addresses persist? Can you try something like that?


I would be interested to see if this helps as well.

> On a side note, `ProcessGDBRemote::LoadModules` seems a bit internally 
> inconsistent. At one place it claims that "The main executable will never be 
> included in libraries-svr4", but then it goes on to set an executable module 
> anyway. This could in fact be a clue as to why this problem hasn't showed up 
> on other platforms -- if the remote does not send the executable, then 
> SetExecutableModule is not called.

The logic in ProcessGDBRemote::LoadModules is bad, the executable should be set 
before any sections are loaded.

> On a side-side note, this may mean that you sending the main wasm file 
> through `qXfer:libraries-svr4` may not be correct. However, fixing that would 
> mean finding another way to communicate the main executable name/address. I 
> don't think we currently have an easy way to do that so it may be better to 
> fix `ProcessGDBRemote::LoadModules`, given that it "almost" supports 
> executables.

Is there a "qXfer:executable"? Seems from the code in 
ProcessGDBRemote::LoadModules() that people were seeing the main executable in 
the list somewhere. Be a shame to not allow it.

> I am also worried about the fact that `SymbolFileDWARF::CalculateAbilities` 
> requires the module to be "loaded". That shouldn't be normally required. That 
> function does a very basic check on some sections, and this should work fine 
> without those sections having a "load address", even if they are actually 
> being loaded from target memory. I think this means there are some additional 
> places where ObjectFileWasm should use `m_memory_addr` instead of something 
> else...

So a lot of these problems might go away if the modify the ObjectFileWASM to 
"do the right thing" when the ObjectFile is from a live process. Object file 
instances know if there are from a live process because the ObjectFile members:

  lldb::ProcessWP m_process_wp;
  const lldb::addr_t m_memory_addr;

Will be filled in. So the object file doesn't need to have its sections loaded 
in a target in order to read section contents right? The object file can be 
asked to read its section data via ObjectFile::ReadSectionData(...) (2 
variants).

So Pavel is correct, the sections don't need to be loaded for anything in 
SymbolFileDWARF::CalculateAbilities(...). The section list does need to be 
created and available during SymbolFileDWARF::CalculateAbilities(...). We just 
need to be able to get the section contents and poke around at the DWARF bits.

> 
> 
>> The second problem is that the Code Section needs to be initialized (in 
>> `ObjectFileWasm::CreateSections()`) with `m_file_addr = m_file_offset = 0`, 
>> and not with the actual file offset of the Code section in the Wasm file. If 
>> we set `Section::m_file_addr` and `Section::m_file_offset` to the actual 
>> code offset, the DWARF info does not work correctly.
>> 
>> I have some doubts regarding the DWARF data generated by Clang for a Wasm 
>> target. Looking at an example, for a Wasm module that has the Code section 
>> at offset 0x57, I see this DWARF data:
>> 
>>   0x0000000b: DW_TAG_compile_unit
>>                 […]
>>                 DW_AT_low_pc (0x0000000000000000)
>>                 DW_AT_ranges (0x00000000
>>                    [0x00000002, 0x0000000e)
>>                    [0x0000000f, 0x0000001a)
>>                    [0x0000001b, 0x00000099)
>>                    [0x0000009b, 0x0000011c))
>> 
>> 
>> The documentation <https://yurydelendik.github.io/webassembly-dwarf/#pc> 
>> says that //“Wherever a code address is used in DWARF for WebAssembly, it 
>> must be the offset of an instruction relative within the Code section of the 
>> WebAssembly file.”// 
>>  But is this correct? Shouldn't maybe code addresses be offset-ed by the 
>> file address of the Code section?
> 
> That's interesting. I don't think that clang is really wrong/non-conforming 
> here, but this choice plays a rather poorly with the way lldb handles object 
> files (and how your remote presents them). In this case special casing the 
> code section may be fine, but you should be aware that there are places in 
> lldb which expect that the "load bias" (the delta between file and load 
> addresses) is the same for all sections in a module (because that's how it 
> works elsewhere), and they may not like this. However, if you have only one 
> code section, I think you should be mostly fine.
> 
> I do think that this can be handled in a better way (I'm pretty sure you 
> don't need to zero out the file offset for instance), but we can wait with 
> that until we resolve the first point...

So the DWARF plug-ins definitely expect the addresses in the DWARF to be file 
addresses where if you ask the module for its section list, it can look up this 
"file address" in the list, it will find a single section. If this is not the 
case, you will need to make a special SymbolFileDWARFWasm subclass and this 
will be very tricky as any address you get from the DWARF will need to be 
converted so that the lookup the section list will work.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72751/new/

https://reviews.llvm.org/D72751



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D72751: [LLDB] Add DynamicLoaderWasmDYLD plugin for WebAssembly debugging

Reply via email to