eloparco added inline comments.
================ Comment at: lldb/tools/lldb-vscode/lldb-vscode.cpp:2177 + const auto bytes_offset = -instruction_offset * bytes_per_instruction; + auto start_addr = base_addr - bytes_offset; + const auto disassemble_bytes = instruction_count * bytes_per_instruction; ---------------- clayborg wrote: > This will work if the min and max opcode byte size are the same, like for > arm64, the min and max are 4. This won't work for x86 or arm32 in thumb mode. > So when backing up, we need to do an address lookup on the address we think > we want to go to, and then adjust the starting address accordingly. Something > like: > > ``` > SBAddress start_sbaddr = (base_addr - bytes_offset, g_vsc.target); > ``` > now we have a section offset address that can tell us more about what it is. > We can find the SBFunction or SBSymbol for this address and use those to find > the right instructions. This will allow us to correctly disassemble code > bytes. > > We can also look at the section that the memory comes from and see what the > section contains. If the section is data, then emit something like: > ``` > 0x00001000 .byte 0x23 > 0x00001001 .byte 0x34 > ... > ``` > To find the section type we can do: > ``` > SBSection section = start_sbaddr.GetSection(); > if (section.IsValid() && section.GetSectionType() == lldb::eSectionTypeCode) { > // Disassemble from a valid boundary > } else { > // Emit a byte or long at a time with ".byte 0xXX" or other ASM directive > for binary data > } > ``` > We need to ensure we start disassembling on the correct instruction boundary > as well as our math for "start_addr" might be in between actual opcode > boundaries. If we are in a lldb::eSectionTypeCode, then we know we have > instructions, and if we are not, then we can emit ".byte" or other binary > data directives. So if we do have lldb::eSectionTypeCode as our section type, > then we should have a function or symbol, and we can get instructions from > those objects easily: > > ``` > if (section.IsValid() && section.GetSectionType() == lldb::eSectionTypeCode) { > lldb::SBInstructionList instructions; > lldb::SBFunction function = start_sbaddr.GetFunction(); > if (function.IsValid()) { > instructions = function.GetInstructions(g_vsc.target); > } else { > symbol = start_sbaddr.GetSymbol(); > if (symbol.IsValid()) > instructions = symbol.GetInstructions(g_vsc.target); > } > const size_t num_instrs = instructions.GetSize(); > if (num_instrs > 0) { > // we found instructions from a function or symbol and we need to > // find the matching instruction that we want to start from by iterating > // over the instructions and finding the right address > size_t matching_idx = num_instrs; // Invalid index > for (size_t i=0; i<num_instrs; ++i) { > lldb::SBInstruction inst = instructions.GetInstructionAtIndex(i); > if (inst.GetAddress().GetLoadAddress(g_vsc.target) >= start_addr) { > matching_idx = i; > break; > } > } > if (matching_idx < num_instrs) { > // Now we can print the instructions from [matching_idx, num_instrs) > // then we need to repeat the search for the next function or symbol. > // note there may be bytes between functions or symbols which we can > disassemble > // by calling _get_instructions_from_memory(...) but we must find the next > // symbol or function boundary and get back on track > } > > ``` Sorry, I should have provided a proper explanation. I use the maximum instruction size as the "worst case". Basically, I need to read a portion of memory but I do not know the start address and the size. For the start address, if I want to read N instructions before `base_addr` I need to read at least starting from `base_addr - N * max_instruction_size`: if all instructions are of size `max_instruction_size` I will read exactly N instructions; otherwise I will read more than N instructions and prune the additional ones afterwards. Same for applies for the size. Since `start_addr` is based on a "worst case", it may be an address in the middle of an instruction. In that case, that first instruction will be misinterpreted, but I think that is negligible. The logic is similar to what is implemented in other VS Code extensions, like https://github.com/vadimcn/vscode-lldb/blob/master/adapter/src/debug_session.rs#L1134. Does it make sense? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140358/new/ https://reviews.llvm.org/D140358 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits