On Oct 20 18:54, Alex Bennée wrote:
> Have you got a test case you are using so I can try and replicate the
> failure you are seeing? So far by inspection everything looks OK to me.

I took some time today to put together a minimal(ish) reproducer using
usermode. The source files used are below. I compiled the test binary on an
AArch64 system using:

  $ gcc -g -o stxp stxp.s stxp.c

Then built the plugin from stxp_plugin.cc, and ran it all like:

  qemu-aarch64 \
      -cpu cortex-a57 \
      -D stxp_plugin.log \
      -d plugin \
      -plugin 'stxp_plugin.so' \
      ./stxp

I observe that, for me, the objdump of stxp contains:

  000000000040070c <loop>:
    40070c:       f9800011        prfm    pstl1strm, [x0]
    400710:       c87f4410        ldxp    x16, x17, [x0]
    400714:       c8300c02        stxp    w16, x2, x3, [x0]
    400718:       f1000652        subs    x18, x18, #0x1
    40071c:       54000040        b.eq    400724 <done>  // b.none
    400720:       17fffffb        b       40070c <loop>

But the output in stxp_plugin.log looks something like:

  Executing PC: 0x40070c
  Executing PC: 0x400710
  PC 0x400710 accessed memory at 0x550080ec70
  PC 0x400710 accessed memory at 0x550080ec78
  Executing PC: 0x400714
  Executing PC: 0x400718
  Executing PC: 0x40071c
  Executing PC: 0x400720

From this, I believe the ldxp instruction at PC 0x400710 is reporting two
memory accesses but the stxp instruction at 0x400714 is not.

-Aaron

--- stxp.c ---

void stxp_issue_demo();

int main() {
    char arr[16];
    stxp_issue_demo(&arr);
}

--- stxp.s ---

.align 8

stxp_issue_demo:
    mov x18, 0x1000
    mov x2, 0x0
    mov x3, 0x0
loop:
    prfm  pstl1strm, [x0]
    ldxp  x16, x17, [x0]
    stxp  w16, x2, x3, [x0]

    subs x18, x18, 1
    beq done
    b loop
done:
    ret

.global stxp_issue_demo

--- stxp_plugin.cc ---

#include <stdio.h>
#include <stdarg.h>

extern "C" {

#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

void qemu_logf(const char *str, ...)
{
    char message[1024];
    va_list args;
    va_start(args, str);
    vsnprintf(message, 1023, str, args);

    qemu_plugin_outs(message);

    va_end(args);
}

void before_insn_cb(unsigned int cpu_index, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("Executing PC: 0x%" PRIx64 "\n", pc);
}

static void mem_cb(unsigned int cpu_index, qemu_plugin_meminfo_t meminfo,
                   uint64_t va, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("PC 0x%" PRIx64 " accessed memory at 0x%" PRIx64 "\n", pc, va);
}

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n = qemu_plugin_tb_n_insns(tb);

    for (size_t i = 0; i < n; i++) {
        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
        uint64_t pc = qemu_plugin_insn_vaddr(insn);

        qemu_plugin_register_vcpu_insn_exec_cb(insn, before_insn_cb,
                                               QEMU_PLUGIN_CB_R_REGS,
                                               (void *)pc);
        qemu_plugin_register_vcpu_mem_cb(insn, mem_cb,
                                         QEMU_PLUGIN_CB_NO_REGS,
                                         QEMU_PLUGIN_MEM_RW,
                                         (void *)pc);
    }
}

QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                           const qemu_info_t *info,
                                           int argc, char **argv)
{
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    return 0;
}

}