> > What is the right way to get virtual address of either translation block or
> > instruction inside of TCG plugin? Does
> > plugin API allow that or it needs some extension?
> >
> > So far I use qemu_plugin_tb_vaddr() inside of my block translation callback
> > to get block virtual address and then
> > pass it as 'userdata' argument into qemu_plugin_register_vcpu_tb_exec_cb().
> > I use it later during code execution.
> > It works well for user-mode emulation, but sometimes leads to
> > incorrect addresses in system-mode emulation.
>
> You can use qemu_plugin_insn_vaddr and qemu_plugin_insn_haddr. But your
> right something under one vaddr and be executed under another with
> overlapping mappings. The haddr should be stable though I think.

As far as I can see, haddr is OK and can be used to identify blocks.
However, if I have the haddr at block execution time and want to know
the vaddr, there is no API to get that mapping. Maybe it can be
extracted from the software MMU, but I have no clue where to start.

> > I suspect it is because of memory mappings by guest OS that changes virtual
> > addresses for that block.
> >
> > I also looked at gen_empty_udata_cb() function and considered to extend
> > plugin API to pass a program counter
> > value as additional callback argument. I thought it would always give me
> > valid virtual address of an instruction.
> > Unfortunately, I didn't find a way to get value of that register in
> > architecture agnostic way (it is 'pc' member in
> > CPUArchState structure).
>
> When we merge the register api you should be able to do that. Although
> during testing I realised that PC acted funny compared to everything
> else because we don't actually update the shadow register every
> instruction.

We implemented a similar API to read registers (by coincidence, I posted
this patch at the same time as the API you mentioned) and I observe
similar behavior. As far as I can see, CPU state is only updated between
executed translation blocks. Switching to 'singlestep' mode helps to fix
that, but the execution overhead is huge.

There is also the block 'chaining' mechanism, which likely contributes
to the corrupted block vaddr inside callbacks. My guess is that the 'pc'
value for chained blocks points to the first block of the entire chain.
Unfortunately, it is very hard to debug, because I can only see block
chains when I run a whole Linux guest OS. Does QEMU have a small test
application that triggers a long enough chain of translation blocks?

These complexities make me think about injecting code into translation
blocks to compute the actual block vaddr at execution time. The problem
here is finding a variable where I can load 'pc' at the start of a
translation block.
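
To make the haddr point above concrete, here is a minimal sketch of a
plugin-side table that keys blocks by haddr and remembers every vaddr
seen for that haddr at translation time. Everything in it (BlockMap,
block_map_record(), block_map_alias_count(), the fixed table sizes) is
hypothetical and not part of the QEMU plugin API. In a real plugin the
two values would presumably come from qemu_plugin_insn_haddr() and
qemu_plugin_tb_vaddr() inside the translation callback. It only shows
that one haddr can end up with several vaddrs; it does not tell you
which alias is active at execution time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
#define MAX_BLOCKS  64
#define MAX_ALIASES 4

/* One translated block, identified by the host address of its first insn. */
typedef struct {
    uint64_t haddr;                /* key: stable host address */
    uint64_t vaddrs[MAX_ALIASES];  /* guest vaddrs seen at translation time */
    size_t   n_vaddrs;
} BlockEntry;

/* Hypothetical plugin-side table; not part of the QEMU plugin API. */
typedef struct {
    BlockEntry entries[MAX_BLOCKS];
    size_t     n_entries;
} BlockMap;

/* Linear search by haddr; returns NULL if the block is unknown. */
BlockEntry *block_map_find(BlockMap *map, uint64_t haddr)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->n_entries; i++) {
        if (map->entries[i].haddr == haddr) {
            return &map->entries[i];
        }
    }
    return NULL;
}

/*
 * Record a (haddr, vaddr) pair, as one would at translation time.
 * Returns 0 on success (including an already known pair), -1 if full.
 */
int block_map_record(BlockMap *map, uint64_t haddr, uint64_t vaddr)
{
    BlockEntry *e = block_map_find(map, haddr);
    if (e == NULL) {
        if (map->n_entries == MAX_BLOCKS) {
            return -1;
        }
        e = &map->entries[map->n_entries++];
        e->haddr = haddr;
        e->n_vaddrs = 0;
    }
    for (size_t i = 0; i < e->n_vaddrs; i++) {
        if (e->vaddrs[i] == vaddr) {
            return 0;  /* alias already recorded */
        }
    }
    if (e->n_vaddrs == MAX_ALIASES) {
        return -1;
    }
    e->vaddrs[e->n_vaddrs++] = vaddr;
    return 0;
}

/* Number of distinct vaddrs recorded for haddr; 0 if haddr is unknown. */
size_t block_map_alias_count(BlockMap *map, uint64_t haddr)
{
    BlockEntry *e = block_map_find(map, haddr);
    return e ? e->n_vaddrs : 0;
}
```

With a zero-initialised BlockMap, recording the same haddr under two
different vaddrs leaves block_map_alias_count() at 2 for that haddr,
which is exactly the case where the vaddr passed as 'userdata' can be
wrong at execution time.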