Hello,

While looking at tcg/i386/tcg-target.c.inc:tcg_out_qemu_st(), I discovered that TCG generates a call to a store helper at the end of the TB. This call is executed on a TLB miss and then jumps back to the remaining translated ops. I tried to mimic this behavior around the fast path (right between tcg_out_tlb_load() and tcg_out_qemu_st_direct()) to filter memory store accesses.

I know there are now TCG plugins for that purpose at the TCG IR level, from which every tcg-target can benefit. FWIW, my design choice was driven more by the fact that I always work on an x86 host and that plugins did not exist at the time.

Anyway, my question is about the difference between generating a call to a helper at the TCG IR level (the classic scenario) and generating it later, during tcg-target code generation (the slow path, for instance).

When calling a helper, TCG knows that some registers will be call clobbered and must therefore free them. This is what I observed in tcg_reg_alloc_call():

    /* clobber call registers */
    for (i = 0; i < TCG_TARGET_NB_REGS; i++) {
        if (tcg_regset_test_reg(tcg_target_call_clobber_regs, i)) {
            tcg_reg_free(s, i, allocated_regs);
        }
    }

But in our case (i.e. INDEX_op_qemu_st_i32), the TCG code path comes from:

    tcg_reg_alloc_op()
      tcg_out_op()
        tcg_out_qemu_st()

Then tcg_out_tlb_load() injects a 'jmp' to the slow path, whose generated code does not seem to take care of every call-clobbered register, if we look at tcg_out_qemu_st_slow_path().

First, for an i386 (32-bit) tcg-target, the helper arguments are stored on the stack, as expected. I noticed that 'esp' is not moved down before the arguments are stacked, which might corrupt the most recently stacked words.

Second, for both 32-bit and 64-bit tcg-targets, not all of the call-clobbered registers are preserved, so depending on the code executed by the helper (and thus generated by GCC), some of these registers may be clobbered (e.g. R10 on x86-64).

While this never happened with the slow path helper call, my guest had trouble running when I called my memory filtering helper in the same fashion as the slow path helper is called. Conversely, if I push/pop all of the call-clobbered registers around the call to the helper, everything runs as expected.

Is this correct? Am I missing something?

Thanks a lot in advance for your eagle eye on this :)
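
P.S. For illustration only, the push/pop workaround described above can be modelled outside QEMU as follows. This is a minimal standalone sketch, not QEMU code and not the actual change: the names REG_*, CALL_CLOBBER_MASK, regs_to_save() and needs_alignment_pad() are hypothetical, the register set assumes the System V AMD64 calling convention (non-Windows x86-64 host, integer registers only), and the alignment helper assumes rsp is 16-byte aligned before the pushes.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 integer registers, numbered in hardware encoding order. */
enum {
    REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
    REG_R8,  REG_R9,  REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15,
    NB_REGS
};

/* Integer registers a callee may overwrite under the System V AMD64 ABI. */
static const uint32_t CALL_CLOBBER_MASK =
    (1u << REG_RAX) | (1u << REG_RCX) | (1u << REG_RDX) |
    (1u << REG_RSI) | (1u << REG_RDI) |
    (1u << REG_R8)  | (1u << REG_R9)  |
    (1u << REG_R10) | (1u << REG_R11);

/*
 * Given the set of registers holding live values at the call site,
 * fill 'order' with the ones that must be pushed before the call
 * (lowest register number first) and return how many there are.
 * They would be popped in the reverse order after the call.
 */
static int regs_to_save(uint32_t live_mask, int order[NB_REGS])
{
    uint32_t need = live_mask & CALL_CLOBBER_MASK;
    int n = 0;

    for (int r = 0; r < NB_REGS; r++) {
        if (need & (1u << r)) {
            order[n++] = r;
        }
    }
    assert(n <= NB_REGS);
    return n;
}

/*
 * Each push moves rsp by 8 bytes. Assuming rsp was 16-byte aligned
 * before the pushes, an odd number of pushes needs one extra 8-byte
 * pad so that rsp is 16-byte aligned again at the call instruction.
 */
static int needs_alignment_pad(int n_pushed)
{
    return n_pushed & 1;
}
```

For example, with RAX, RBX and R10 live at the call site, only RAX and R10 would need saving: RBX is callee-saved, so the helper must restore it itself. Saving only the live subset rather than every call-clobbered register is a design choice of this sketch; pushing all of them, as described above, is simpler and also correct.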