Alex Bennée <alex.ben...@linaro.org> writes:

> This is a simple control flow tracking plugin that uses the latest
> inline and conditional operations to detect and track control flow
> changes. It is currently an exercise in seeing how useful the changes
> are.
>
> Based-on: <20240312075428.244210-1-pierrick.bouv...@linaro.org>
> Cc: Gustavo Romero <gustavo.rom...@linaro.org>
> Cc: Pierrick Bouvier <pierrick.bouv...@linaro.org>
> Signed-off-by: Alex Bennée <alex.ben...@linaro.org>
> Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>
>
> ---
> v2
>   - only need a single callback
>   - drop need for INSN_WIDTH
>   - still don't understand the early exits
>
> I'm still seeing weirdness in the generated code, for example the
> plugin reports "early exits" which doesn't make sense for a ret which
> terminates a block:
>
>   addr: 0x403c88 hexchar: ret (1/1)
>     early exits 1280
>     branches 1280
>       to 0x403d00 (639)
>       to 0x403d24 (639)
> <snip>
> +
> +/*
> + * At the start of each block we need to resolve two things:
> + *
> + *  - is last_pc == block_end, if not we had an early exit
> + *  - is start of block last_pc + insn width, if not we jumped
> + *
> + * Once those are dealt with we can instrument the rest of the
> + * instructions for their execution.
> + *
> + */
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> +    uint64_t pc = qemu_plugin_tb_vaddr(tb);
> +    size_t insns = qemu_plugin_tb_n_insns(tb);
> +    struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
> +
> +    /*
> +     * check if we are executing linearly after the last block. We can
> +     * handle both early block exits and normal branches in the
> +     * callback if we hit it.
> +     */
> +    gpointer udata = GUINT_TO_POINTER(pc);
> +    qemu_plugin_register_vcpu_tb_exec_cond_cb(
> +        tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
> +        QEMU_PLUGIN_COND_NE, pc_after_block, pc, udata);
> +
> +    /*
> +     * Now we can set start/end for this block so the next block can
> +     * check where we are at.
> +     */
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      end_block,
> +                                                      qemu_plugin_insn_vaddr(last_insn));
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      pc_after_block,
> +                                                      qemu_plugin_insn_vaddr(last_insn) +
> +                                                      qemu_plugin_insn_size(last_insn));
With the following:

modified   contrib/plugins/cflow.c
@@ -220,7 +220,7 @@ static void vcpu_tb_branched_exec(unsigned int cpu_index, void *udata)
 
     g_mutex_lock(&node->lock);
     if (early_exit) {
-        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64"
+        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64
                 " npc=%"PRIx64", lpc=%"PRIx64", \n",
                 __func__, pc, ebpc, npc, lpc);
         node->early_exit++;
@@ -264,6 +264,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 {
     uint64_t pc = qemu_plugin_tb_vaddr(tb);
     size_t insns = qemu_plugin_tb_n_insns(tb);
+    struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
     struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
 
     /*
@@ -278,12 +279,13 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 
     /*
      * Now we can set start/end for this block so the next block can
-     * check where we are at.
+     * check where we are at. Do this on the first instruction and not
+     * the TB so we don't get mixed up with above.
      */
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       end_block, qemu_plugin_insn_vaddr(last_insn));
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       pc_after_block,
                                                       qemu_plugin_insn_vaddr(last_insn) +

The report looks more sane:

  collected 9013 control flow nodes in the hash table
    addr: 0x403c88 hexchar: ret (0/1)
      branches 1280
        to 0x403d00 (639)
        to 0x403d24 (639)

So I think we need to think about preserving the ordering of
instrumentation (at least from the same plugin) so we are not laying
any API bear traps for users.

I assume because loads and stores are involved we won't see the
optimiser trying to swap stuff around.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
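
To make the ordering point concrete, here is a minimal sketch in plain C,
not the plugin API: the check/store helpers and the addresses are invented
for illustration, standing in for the conditional TB callback (COND_NE
against pc_after_block) and the inline STORE_U64 of the block's
fall-through address. It shows how a store scheduled before the check
clobbers the value the check relies on:

    /* Illustrative model of the instrumentation-ordering hazard. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t pc_after_block; /* set by the previous block */

    static void check(uint64_t block_pc)
    {
        /* fires only when we did not fall through from the previous block */
        if (block_pc != pc_after_block) {
            printf("non-linear entry at 0x%" PRIx64 "\n", block_pc);
        }
    }

    static void store(uint64_t fallthrough_pc)
    {
        pc_after_block = fallthrough_pc; /* what linear execution hits next */
    }

    int main(void)
    {
        pc_after_block = 0x403c88; /* previous block predicted this entry */

        /* intended order: check against the previous block, then update */
        check(0x403c88);          /* silent: we fell through as predicted */
        store(0x403c8c);

        /* hazard: if the store runs first, the check compares the block
         * against its own fall-through value and misfires */
        store(0x403c90);
        check(0x403c8c);          /* wrongly reports a non-linear entry */

        return 0;
    }

Attaching the stores to the first instruction, as in the diff above, is one
way to guarantee they run after the TB-entry conditional callback.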