On 26/06/2015 16:47, fred.kon...@greensocs.com wrote:
> From: KONRAD Frederic <fred.kon...@greensocs.com>
>
> Instead of doing the jump cache invalidation directly in tb_invalidate delay it
> after the exit so we don't have an other CPU trying to execute the code being
> invalidated.
>
> Signed-off-by: KONRAD Frederic <fred.kon...@greensocs.com>
> ---
>  translate-all.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/translate-all.c b/translate-all.c
> index ade2269..468648d 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -61,6 +61,7 @@
>  #include "translate-all.h"
>  #include "qemu/bitmap.h"
>  #include "qemu/timer.h"
> +#include "sysemu/cpus.h"
>
>  //#define DEBUG_TB_INVALIDATE
>  //#define DEBUG_FLUSH
> @@ -966,14 +967,58 @@ static inline void tb_reset_jump(TranslationBlock *tb, int n)
>      tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr + tb->tb_next_offset[n]));
>  }
>
> +struct CPUDiscardTBParams {
> +    CPUState *cpu;
> +    TranslationBlock *tb;
> +};
> +
> +static void cpu_discard_tb_from_jmp_cache(void *opaque)
> +{
> +    unsigned int h;
> +    struct CPUDiscardTBParams *params = opaque;
> +
> +    h = tb_jmp_cache_hash_func(params->tb->pc);
> +    if (params->cpu->tb_jmp_cache[h] == params->tb) {
> +        params->cpu->tb_jmp_cache[h] = NULL;
> +    }

It is a bit more tricky, but I think you can avoid async_run_on_cpu by
doing this:

1) introduce a QemuSeqLock in TBContext, e.g. invalidate_seqlock.

2) wrap this "if" with seqlock_write_lock/unlock

3) in cpu-exec.c do this:

     /* we add the TB in the virtual pc hash table */
+    idx = seqlock_read_begin(&tcg_ctx.tb_ctx.invalidate_seqlock);
     cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
+    if (seqlock_read_retry(&tcg_ctx.tb_ctx.invalidate_seqlock, idx)) {
+        /* Another CPU invalidated a tb in the meanwhile.  We do not
+         * know if it's this one, but play it safe and avoid caching
+         * it.
+         */
+        cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = NULL;
+    }

> +    /* suppress this TB from the two jump lists */
> +    tb_jmp_remove(tb, 0);
> +    tb_jmp_remove(tb, 1);

If you do the above synchronously, this part doesn't need to be deferred
either.  Then, immediately after the two tb_jmp_remove calls you can also
check whether "(tb->jmp_first & 3) == 2": if so, the expensive
async_run_safe_work_on_cpu can be skipped.

Paolo

> +#endif /* MTTCG */
>
>      tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>      tb_unlock();
>
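
For reference, the seqlock scheme in steps 1-3 can be sketched in isolation
roughly as below. This is only a minimal model under stated assumptions, not
QEMU code: it uses C11 atomics instead of QemuSeqLock, a single global cache
instead of the per-CPU tb_jmp_cache, and the names TB, SeqLock, seq_*,
hash_pc, invalidate_tb and cache_tb are all hypothetical stand-ins. Writers
are assumed to be serialized by some outer lock (tb_lock in the patch).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define JMP_CACHE_SIZE 4096             /* hypothetical, power of two */

typedef struct TB { unsigned long pc; } TB;   /* stand-in for TranslationBlock */
typedef struct SeqLock { atomic_uint sequence; } SeqLock;

static SeqLock invalidate_seqlock;              /* step 1 */
static _Atomic(TB *) jmp_cache[JMP_CACHE_SIZE]; /* one cache only, for brevity */

static unsigned hash_pc(unsigned long pc)
{
    return (unsigned)(pc & (JMP_CACHE_SIZE - 1));
}

/* Writer side: the sequence is odd while a write is in progress. */
static void seq_write_lock(SeqLock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

static void seq_write_unlock(SeqLock *sl)
{
    assert(atomic_load(&sl->sequence) & 1);     /* must be inside a write */
    atomic_fetch_add(&sl->sequence, 1);
}

/* Reader side: clearing the low bit makes a read that starts during a
 * write always fail the retry check. */
static unsigned seq_read_begin(SeqLock *sl)
{
    return atomic_load(&sl->sequence) & ~1u;
}

static int seq_read_retry(SeqLock *sl, unsigned start)
{
    return atomic_load(&sl->sequence) != start;
}

/* Step 2: the invalidating side wraps the "if" in the write lock. */
static void invalidate_tb(TB *tb)
{
    unsigned h = hash_pc(tb->pc);

    seq_write_lock(&invalidate_seqlock);
    if (jmp_cache[h] == tb) {
        jmp_cache[h] = NULL;
    }
    seq_write_unlock(&invalidate_seqlock);
}

/* Step 3: the executing side caches the TB, then backs out if any
 * invalidation ran in the meantime.  Returns 1 if the TB stayed cached. */
static int cache_tb(TB *tb)
{
    unsigned h = hash_pc(tb->pc);
    unsigned idx = seq_read_begin(&invalidate_seqlock);

    jmp_cache[h] = tb;
    if (seq_read_retry(&invalidate_seqlock, idx)) {
        jmp_cache[h] = NULL;    /* play it safe, do not cache */
        return 0;
    }
    return 1;
}
```

The reader never blocks: at worst it drops a cache entry it could have kept,
which costs one extra lookup later but never leaves a stale TB in the cache.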