eloparco added inline comments.

================
Comment at: lldb/tools/lldb-vscode/lldb-vscode.cpp:2177
+  const auto max_instruction_size = g_vsc.target.GetMaximumOpcodeByteSize();
+  const auto bytes_offset = -instruction_offset * max_instruction_size;
+  auto start_addr = base_addr - bytes_offset;
----------------
clayborg wrote:
> Just checked out your changes, and you are still just subtracting a value
> from the start address and attempting to disassemble from memory which is the
> problem. We need to take that subtracted address, and look it up as suggested
> in previous code examples I posted. If you find a function or symbol, ask
> those objects for their instructions, and then try to use those.
>
> But basically for _any_ disassembly this is what I would recommend doing:
> - first resolve the "start_address" (no matter how you come up with the
> address) that you want to disassemble into a SBAddress
> - check its section. If the section is valid and contains instructions, call
> a function that will disassemble the address range for the section that
> starts at "start_address" and ends at the end of the section. We can call
> this "disassemble_code" as a function. More details on this below
> - If the section does not contain instructions, just read the bytes and emit
> lines like:
> ```
> 0x1000 .byte 0x12
> 0x1001 .byte 0x34
> ...
> ```
> 
> Now for the disassemble_code function. We know the address range for this is 
> in code. We then need to resolve the address passed to "disassemble_code" 
> into a SBAddress and ask that address for a SBFunction or SBSymbol as I 
> mentioned. Then we ask the SBFunction or SBSymbol for all instructions that 
> they contain, and then use any instructions that fall into the range we have. 
> If there is no SBFunction or SBSymbol, then disassemble an instruction at a 
> time and then see if the new address will resolve to a function or symbol.
I tried my changes on a Linux x86 machine. The loop `for (unsigned i = 0; i <
max_instruction_size; i++) {` (L2190) takes care of `start_address` possibly
being in the middle of an instruction, so that's not a problem. The problem I
faced is that it tries to read too far from `base_addr`, and the
`ReadMemory()` operation returns only a few instructions (without reaching
`base_addr`). That was not happening on my macOS M1 (arm) machine.

To solve this, I changed the loop at L2190 to
```
for (unsigned i = 0; i < bytes_offset; i++) {
    auto sb_instructions =
        _get_instructions_from_memory(start_addr + i, disassemble_bytes);
```
and if `start_addr` is in `sb_instructions` we're done and can exit the loop. 
That worked.
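
To make the idea concrete outside of LLDB, here is a minimal, self-contained
sketch of the retry loop. It is not the patch itself: `decode_starts` is a toy
decoder standing in for `ReadMemory()` + `GetInstructions()`, its encoding
(low two bits of the first byte give the instruction length minus one) is made
up, and `first_aligned_offset` and `target` are hypothetical names. `target`
stands for whichever address the decoded list must contain.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy decoder (assumption, not an LLDB API): the low two bits of the first
// byte encode "length - 1". Returns the offset where each instruction starts.
std::vector<size_t> decode_starts(const std::vector<uint8_t> &mem,
                                  size_t from) {
  std::vector<size_t> starts;
  size_t pos = from;
  while (pos < mem.size()) {
    size_t len = (mem[pos] & 0x3u) + 1;
    if (pos + len > mem.size())
      break; // truncated instruction at the end of the buffer
    starts.push_back(pos);
    pos += len;
  }
  return starts;
}

// Try successive start offsets until the decoded stream contains an
// instruction that begins exactly at `target`. Returns the first offset i
// that works, or -1 if none of the first `max_tries` offsets does.
int first_aligned_offset(const std::vector<uint8_t> &mem, size_t start,
                         size_t target, size_t max_tries) {
  for (size_t i = 0; i < max_tries; ++i) {
    for (size_t s : decode_starts(mem, start + i)) {
      if (s == target)
        return static_cast<int>(i);
    }
  }
  return -1;
}
```

With the buffer `{0x01, 0x02, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00}`, decoding
from offset 0 steps over offset 4, while decoding from offset 1 lands on it,
so the loop stops at `i == 1`.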

A similar alternative is to start from `start_sbaddr`, as you were saying,
and increment the address until a valid section is found, then call
`_get_instructions_from_memory()` with the section start.
What do you think? Delegating the disassembling to `ReadMemory()` +
`GetInstructions()` looks simpler to me than manually iterating over sections
and getting instructions from symbols and functions.
Is there any shortcoming I'm not seeing?
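
As a rough illustration of the "advance until a valid section is found" idea,
here is a sketch that models sections as plain address ranges instead of
`SBSection` objects. `Section` and `find_disassembly_start` are hypothetical
names, and `limit` is an assumed upper bound for the search (for example
`base_addr`). Rather than incrementing one byte at a time, it jumps straight
to the nearest section start, which gives the same address.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for a section: the half-open range [start, end).
struct Section {
  uint64_t start;
  uint64_t end;
};

// Finds the lowest address in [addr, limit] that lies inside some section.
// If `addr` is already inside a section it is returned unchanged; otherwise
// the start of the nearest section above `addr` is used.
// Returns false (leaving `out` untouched) when no such address exists.
bool find_disassembly_start(const std::vector<Section> &sections,
                            uint64_t addr, uint64_t limit, uint64_t &out) {
  bool found = false;
  uint64_t best = 0;
  for (const Section &s : sections) {
    if (s.start >= s.end)
      continue; // skip empty or malformed ranges
    uint64_t candidate;
    if (addr >= s.start && addr < s.end)
      candidate = addr; // already inside this section
    else if (s.start > addr)
      candidate = s.start; // section begins above addr
    else
      continue; // section lies entirely below addr
    if (candidate <= limit && (!found || candidate < best)) {
      best = candidate;
      found = true;
    }
  }
  if (found)
    out = best;
  return found;
}
```

The returned address would then be the one passed to the memory-based
disassembly; if nothing is found before `limit`, the caller has to fall back
to something else.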


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140358/new/

https://reviews.llvm.org/D140358

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
