Hi Richard,

On Sat, Apr 29, 2023 at 10:28:03AM +0100, Richard Henderson wrote:
> On 4/27/23 03:09, Jamie Iles wrote:
> > From: Jamie Iles <ji...@qti.qualcomm.com>
> >
> > The round-robin scheduler will iterate over the CPU list with an
> > assigned budget until the next timer expiry and may exit early
> > because of a TB exit.  This is fine under normal operation, but with
> > icount enabled and SMP it is possible for a CPU to be starved of run
> > time and the system live-locks.
> >
> > For example, booting a riscv64 platform with '-icount
> > shift=0,align=off,sleep=on -smp 2' we observe a livelock once the
> > kernel has timers enabled and starts performing TLB shootdowns.  In
> > this case we have CPU 0 in M-mode with interrupts disabled sending
> > an IPI to CPU 1.  As we enter the TCG loop, we assign the icount
> > budget up to the next timer interrupt to CPU 0 and begin executing.
> > The guest sits in a busy loop, exhausting all of the budget before
> > we try to execute CPU 1, the target of the IPI.  CPU 1 is left with
> > no budget with which to execute, and the process repeats.
> >
> > We try here to add some fairness by splitting the budget evenly
> > across all of the CPUs on the thread before entering each one.  The
> > CPU count is cached on the CPU list generation ID to avoid iterating
> > the list on each loop iteration.  With this change it is possible to
> > boot an SMP rv64 guest with icount enabled and no hangs.
> >
> > New in v3 (addressing feedback from Richard Henderson):
> >  - Additional patch to use QEMU_LOCK_GUARD with qemu_cpu_list_lock
> >    where appropriate
> >  - Move rr_cpu_count() call to be conditional on icount_enabled()
> >  - Initialize cpu_budget to 0
> >
> > Jamie Iles (2):
> >   cpu: expose qemu_cpu_list_lock for lock-guard use
> >   accel/tcg/tcg-accel-ops-rr: ensure fairness with icount
>
> It appears as if one of these two patches causes a failure in replay,
> e.g.
>
>   https://gitlab.com/rth7680/qemu/-/jobs/4200609234#L4162
>
> Would you have a look, please?
I was out on vacation and it looks like this job got cleaned up, but
was this a mutex check failing for the replay mutex and/or the iothread
mutex?  If so, then the following patch fixes it for me, which I can
include in a v4.

Thanks,

Jamie

diff --git i/accel/tcg/tcg-accel-ops-icount.c w/accel/tcg/tcg-accel-ops-icount.c
index e1e8afaf2f99..3d2cfbbc9778 100644
--- i/accel/tcg/tcg-accel-ops-icount.c
+++ w/accel/tcg/tcg-accel-ops-icount.c
@@ -114,13 +114,13 @@ void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget)
     g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0);
     g_assert(cpu->icount_extra == 0);
 
+    replay_mutex_lock();
+
     cpu->icount_budget = MIN(icount_get_limit(), cpu_budget);
     insns_left = MIN(0xffff, cpu->icount_budget);
     cpu_neg(cpu)->icount_decr.u16.low = insns_left;
     cpu->icount_extra = cpu->icount_budget - insns_left;
 
-    replay_mutex_lock();
-
     if (cpu->icount_budget == 0) {
         /*
          * We're called without the iothread lock, so must take it while
diff --git i/replay/replay.c w/replay/replay.c
index c39156c52221..0f7d766efe81 100644
--- i/replay/replay.c
+++ w/replay/replay.c
@@ -74,7 +74,7 @@ uint64_t replay_get_current_icount(void)
 int replay_get_instructions(void)
 {
     int res = 0;
-    replay_mutex_lock();
+    g_assert(replay_mutex_locked());
     if (replay_next_event_is(EVENT_INSTRUCTION)) {
         res = replay_state.instruction_count;
         if (replay_break_icount != -1LL) {
@@ -85,7 +85,6 @@
             }
         }
     }
-    replay_mutex_unlock();
     return res;
 }
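For anyone skimming the thread, the net effect of the replay.c hunks is
that replay_get_instructions() stops taking the lock itself and instead
asserts that the caller already holds it.  With the patch applied it is
shaped roughly as below; the clamping logic between the two hunks is
elided in the diff, so it is only summarised as a comment here:

int replay_get_instructions(void)
{
    int res = 0;

    /* The caller (icount_prepare_for_run) now holds the replay mutex. */
    g_assert(replay_mutex_locked());

    if (replay_next_event_is(EVENT_INSTRUCTION)) {
        res = replay_state.instruction_count;
        if (replay_break_icount != -1LL) {
            /* ... clamp res against replay_break_icount (elided in the
             * hunks above) ... */
        }
    }
    return res;
}

Taking the replay mutex before computing the budget means the budget
setup and the instruction-count query happen under one critical
section, which is presumably what the replay mutex check in the CI job
was tripping over.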
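For context on the series itself, the fairness change described in the
cover letter (patch 2) amounts to something like the sketch below.
This is a reconstruction from the description only, not the actual
patch: rr_cpu_count() and qemu_cpu_list_lock are named above, but the
use of cpu_list_generation_id_get() and the simple divide-by-count
split (rr_percpu_budget() is a made-up name) are my assumptions:

/* Count the vCPUs once per CPU-list generation rather than on every
 * pass around the round-robin loop. */
static unsigned int rr_cpu_count(void)
{
    static unsigned int last_gen_id = ~0;
    static unsigned int cpu_count;
    CPUState *cpu;

    QEMU_LOCK_GUARD(&qemu_cpu_list_lock);

    if (cpu_list_generation_id_get() != last_gen_id) {
        cpu_count = 0;
        CPU_FOREACH(cpu) {
            cpu_count++;
        }
        last_gen_id = cpu_list_generation_id_get();
    }

    return cpu_count;
}

/* Hypothetical helper: give each vCPU an equal share of the budget up
 * to the next timer deadline, so CPU 0 can no longer consume all of it
 * before CPU 1 gets a chance to take its IPI. */
static int64_t rr_percpu_budget(void)
{
    return icount_get_limit() / MAX(rr_cpu_count(), 1);
}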