labath added a subscriber: aadsm.
labath added a comment.

[ looping in @aadsm for the svr4 stuff ]

Thanks for adding the test, and for the detailed writeup. Please find my 
comments inline.

In D72751#1835502 <https://reviews.llvm.org/D72751#1835502>, @paolosev wrote:

> The first is that we need to call `Target::SetSectionLoadAddress()` twice, 
> from two different places. First we need to call 
> `Target::SetSectionLoadAddress()` in `ObjectFileWasm::SetLoadAddress()`, and 
> then again in `DynamicLoaderWasmDYLD::DidAttach()`. The reason for this seems 
> to originate in the sequence of function calls:
>
> In `DynamicLoaderWasmDYLD::DidAttach()` we call 
> `ProcessGDBRemote::LoadModules()` to get list of loaded modules from the 
> remote (Wasm engine).
>  `ProcessGDBRemote::LoadModules()` calls, first:
>
> - `DynamicLoaderWasmDYLD::LoadModuleAtAddress()` and from there: ...
>
>   then:
> - `Target::SetExecutableModule() -> Target::ClearModules() -> 
> SectionLoadList::Clear()`
>
>   So, at the end of `LoadModules()` in `DynamicLoaderWasmDYLD::DidAttach()` 
> the SectionLoadList is empty, and we need to set it again by calling 
> `Target::.SetSectionLoadAddress()` again. This works but the duplication is 
> ugly; is there a way to improve this?


I hope so. :) This seems like a bug in `ProcessGDBRemote::LoadModules`. It 
seems wrong/wasteful/etc to do all this work to compute the section load 
addresses only to have them be thrown away by `SetExecutableModule`. Maybe all 
it would take is to reverse the order of these two actions, so that the load 
addresses persist? Can you try something like that?

On a side note, `ProcessGDBRemote::LoadModules` seems a bit internally 
inconsistent. At one place it claims that "The main executable will never be 
included in libraries-svr4", but then it goes on to set an executable module 
anyway. This could in fact be a clue as to why this problem hasn't showed up on 
other platforms -- if the remote does not send the executable, then 
SetExecutableModule is not called.

On a side-side note, this may mean that you sending the main wasm file through 
`qXfer:libraries-svr4` may not be correct. However, fixing that would mean 
finding another way to communicate the main executable name/address. I don't 
think we currently have an easy way to do that so it may be better to fix 
`ProcessGDBRemote::LoadModules`, given that it "almost" supports executables.

I am also worried about the fact that `SymbolFileDWARF::CalculateAbilities` 
requires the module to be "loaded". That shouldn't be normally required. That 
function does a very basic check on some sections, and this should work fine 
without those sections having a "load address", even if they are actually being 
loaded from target memory. I think this means there are some additional places 
where ObjectFileWasm should use `m_memory_addr` instead of something else...

> The second problem is that the Code Section needs to be initialized (in 
> `ObjectFileWasm::CreateSections()`) with `m_file_addr = m_file_offset = 0`, 
> and not with the actual file offset of the Code section in the Wasm file. If 
> we set `Section::m_file_addr` and `Section::m_file_offset` to the actual code 
> offset, the DWARF info does not work correctly.
> 
> I have some doubts regarding the DWARF data generated by Clang for a Wasm 
> target. Looking at an example, for a Wasm module that has the Code section at 
> offset 0x57, I see this DWARF data:
> 
>   0x0000000b: DW_TAG_compile_unit
>                 […]
>                 DW_AT_low_pc (0x0000000000000000)
>                 DW_AT_ranges (0x00000000
>                    [0x00000002, 0x0000000e)
>                    [0x0000000f, 0x0000001a)
>                    [0x0000001b, 0x00000099)
>                    [0x0000009b, 0x0000011c))
> 
> 
> The documentation <https://yurydelendik.github.io/webassembly-dwarf/#pc> says 
> that //“Wherever a code address is used in DWARF for WebAssembly, it must be 
> the offset of an instruction relative within the Code section of the 
> WebAssembly file.”// 
>  But is this correct? Shouldn't maybe code addresses be offset-ed by the file 
> address of the Code section?

That's interesting. I don't think that clang is really wrong/non-conforming 
here, but this choice plays a rather poorly with the way lldb handles object 
files (and how your remote presents them). In this case special casing the code 
section may be fine, but you should be aware that there are places in lldb 
which expect that the "load bias" (the delta between file and load addresses) 
is the same for all sections in a module (because that's how it works 
elsewhere), and they may not like this. However, if you have only one code 
section, I think you should be mostly fine.

I do think that this can be handled in a better way (I'm pretty sure you don't 
need to zero out the file offset for instance), but we can wait with that until 
we resolve the first point...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72751/new/

https://reviews.llvm.org/D72751



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to