eloparco added inline comments.
================
Comment at: lldb/tools/lldb-vscode/lldb-vscode.cpp:2177
+    const auto max_instruction_size = g_vsc.target.GetMaximumOpcodeByteSize();
+    const auto bytes_offset = -instruction_offset * max_instruction_size;
+    auto start_addr = base_addr - bytes_offset;
----------------
clayborg wrote:
> Just checked out your changes, and you are still just subtracting a value
> from the start address and attempting to disassemble from memory, which is
> the problem. We need to take that subtracted address and look it up as
> suggested in the previous code examples I posted. If you find a function or
> symbol, ask those objects for their instructions, and then try to use those.
>
> But basically for _any_ disassembly this is what I would recommend doing:
> - first resolve the "start_address" (no matter how you come up with the
>   address) that you want to disassemble into a SBAddress
> - check its section. If the section is valid and contains instructions, call
>   a function that will disassemble the address range for the section that
>   starts at "start_address" and ends at the end of the section. We can call
>   this "disassemble_code" as a function. More details on this below
> - If the section does not contain instructions, just read the bytes and emit
>   lines like:
> ```
> 0x1000 .byte 0x12
> 0x1001 .byte 0x34
> ...
> ```
>
> Now for the disassemble_code function. We know the address range for this is
> in code. We then need to resolve the address passed to "disassemble_code"
> into a SBAddress and ask that address for a SBFunction or SBSymbol as I
> mentioned. Then we ask the SBFunction or SBSymbol for all instructions that
> they contain, and then use any instructions that fall into the range we have.
> If there is no SBFunction or SBSymbol, then disassemble an instruction at a
> time and then see if the new address will resolve to a function or symbol.

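To make sure I read the suggestion correctly, here is a minimal, self-contained sketch of the two building blocks from the quote: the `.byte` fallback for sections without instructions, and filtering a function's or symbol's instructions down to the requested range. `Insn`, `EmitByteLines` and `InstructionsInRange` are hypothetical names for illustration only; they are not lldb types or APIs, and the real code would get its instructions from the SB objects instead of a plain vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a decoded instruction (not an lldb type).
struct Insn {
  uint64_t addr;
  std::string text;
};

// Fallback for a section without instructions: one ".byte" line per byte,
// with the address advancing by one for each byte.
std::vector<std::string> EmitByteLines(uint64_t start,
                                       const std::vector<uint8_t> &bytes) {
  std::vector<std::string> lines;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "0x%llx .byte 0x%02x",
                  static_cast<unsigned long long>(start + i),
                  static_cast<unsigned>(bytes[i]));
    lines.emplace_back(buf);
  }
  return lines;
}

// Given all instructions of the enclosing function or symbol, keep only
// those whose address falls into the half-open range [start, end).
std::vector<Insn> InstructionsInRange(const std::vector<Insn> &all,
                                      uint64_t start, uint64_t end) {
  std::vector<Insn> out;
  for (const Insn &insn : all)
    if (insn.addr >= start && insn.addr < end)
      out.push_back(insn);
  return out;
}
```

The point of the second helper is that instruction boundaries come from the function or symbol, so the requested start address never has to be guessed from raw memory.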
Tried my changes on a Linux x86 machine, and the loop `for (unsigned i = 0; i < max_instruction_size; i++) {` (L2190) takes care of `start_address` possibly being in the middle of an instruction, so that's not a problem.

The problem I faced is that it starts reading too far from `base_addr`, and the `ReadMemory()` operation returns only a few instructions (without reaching `base_addr`). That was not happening on my macOS M1 (arm) machine.

To solve this, I changed the loop at L2190 to
```
for (unsigned i = 0; i < bytes_offset; i++) {
  auto sb_instructions =
      _get_instructions_from_memory(start_addr + i, disassemble_bytes);
```
and if `start_addr` is in `sb_instructions` we're done and can exit the loop. That worked.

Another similar option is to start from `start_sbaddr`, as you were saying, and increment the address until a valid section is found, then call `_get_instructions_from_memory()` passing the section start. What do you think?

Delegating the disassembling to `ReadMemory()` + `GetInstructions()` looks simpler to me than manually iterating over sections and getting instructions from symbols and functions. Is there any shortcoming I'm not seeing?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140358/new/

https://reviews.llvm.org/D140358
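
For reference, a minimal sketch of the offset scan described in this comment, written against a toy decoder instead of lldb. It assumes the exit condition is "linear decoding from `start_addr + i` lands exactly on `base_addr`", which is one reading of the check above and not necessarily what the patch does. `LengthFn`, `DecodeUntil` and `FindAlignedStart` are hypothetical names, not part of lldb or lldb-vscode.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical toy decoder: returns the length in bytes of the instruction
// that would be decoded if decoding started at the given address.
using LengthFn = std::function<uint64_t(uint64_t)>;

// Decode linearly from `from`. Returns the instruction start addresses up
// to and including `target`, or an empty vector if decoding steps over
// `target` (meaning `from` was not on a usable instruction boundary).
std::vector<uint64_t> DecodeUntil(uint64_t from, uint64_t target,
                                  const LengthFn &len) {
  std::vector<uint64_t> starts;
  uint64_t addr = from;
  while (addr < target) {
    const uint64_t n = len(addr);
    if (n == 0)
      return {}; // undecodable byte, give up on this starting offset
    starts.push_back(addr);
    addr += n;
  }
  if (addr != target)
    return {};
  starts.push_back(target);
  return starts;
}

// Try every byte offset from `start_addr` until decoding reaches
// `base_addr` exactly; returns the first starting address that works.
std::optional<uint64_t> FindAlignedStart(uint64_t start_addr,
                                         uint64_t base_addr,
                                         const LengthFn &len) {
  for (uint64_t i = 0; start_addr + i <= base_addr; ++i)
    if (!DecodeUntil(start_addr + i, base_addr, len).empty())
      return start_addr + i;
  return std::nullopt;
}
```

The scan always terminates, because starting at `base_addr` itself trivially succeeds. It can still pick a start that only happens to resynchronize with the real instruction stream, which is the kind of shortcoming I am asking about.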