On 14/06/2017 13:45, Alex Bennée wrote:
>
> Paolo Bonzini <pbonz...@redhat.com> writes:
>
>> On 14/06/2017 06:48, Richard Henderson wrote:
>>>>
>>>> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
>>>> a regression by which aarch64 guests freeze under TCG with -smp > 1,
>>>> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>>>>
>>>> I isolated the problem to the MSR handler. This patch forces an exit
>>>> after the handler is executed, which fixes the regression.
>>>
>>> Why would that be? The cpu_get_tb_cpu_state within helper_lookup_tb_ptr
>>> is supposed to read the new state that the msr handler would have
>>> installed.
>>
>> Could some of these cause an interrupt, or some other change in the
>> cpu_exec flow?
>
> Well what I was observing was the secondary_start_kernel stalling and
> leaving the main cpu spinning. The msr is actually:
>
>   local_irq_enable();
>   local_fiq_enable();
>
> Which I assume would re-enable IRQs if they are ready to go. However I
> guess if we sink into our cpu_idle without exiting the main loop we
> never set any pending IRQs?
Then Emilio's patch, if a bit of a heavy hammer, is correct. After
aa64_daif_write you need an exit_tb so that arm_cpu_exec_interrupt is
executed again. Compare with this from the x86 front-end:

        /* if irq were inhibited with HF_INHIBIT_IRQ_MASK, we clear
           the flag and abort the translation to give the irqs a
           change to be happen */
        if (dc->tf || dc->singlestep_enabled ||
            (flags & HF_INHIBIT_IRQ_MASK)) {
            gen_jmp_im(pc_ptr - dc->cs_base);
            gen_eob(dc);
            break;
        }

(This triggers one instruction after an STI instruction, due to how x86's
"interrupt shadow" works, so it doesn't happen immediately after
helper_sti; but the idea is the same.)

Paolo
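
P.S. To illustrate why the exit matters, here is a deliberately simplified toy model. It is not QEMU code: ToyCPU, toy_check_interrupts and toy_run are hypothetical names made up for this sketch. The only assumption it encodes is the one discussed above: pending interrupts are examined at the top of the execution loop, so if translated blocks chain into each other after the guest unmasks interrupts, the check is not reached until something forces an exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CPU state; every name here is hypothetical, not QEMU's. */
typedef struct {
    bool irq_pending;   /* an interrupt is waiting to be delivered */
    bool irq_masked;    /* guest currently has interrupts masked */
    int  irqs_taken;    /* interrupts delivered so far */
} ToyCPU;

/* Models the interrupt check performed at the top of the main loop. */
static void toy_check_interrupts(ToyCPU *cpu)
{
    if (cpu->irq_pending && !cpu->irq_masked) {
        cpu->irq_pending = false;
        cpu->irqs_taken++;
    }
}

/*
 * Runs nblocks translated blocks. Block 0 unmasks interrupts (standing in
 * for the MSR write). When exit_after_unmask is false, blocks chain
 * directly into each other and control never returns to the loop top.
 * Returns the number of interrupts delivered by the time all blocks ran.
 */
static int toy_run(bool exit_after_unmask, int nblocks)
{
    ToyCPU cpu = { .irq_pending = true, .irq_masked = true, .irqs_taken = 0 };
    int block = 0;

    assert(nblocks > 0);
    while (block < nblocks) {
        toy_check_interrupts(&cpu);     /* only reached at the loop top */

        for (;;) {                      /* execute chained blocks */
            bool unmasked_now = false;

            if (block == 0) {           /* the "MSR" block */
                cpu.irq_masked = false;
                unmasked_now = true;
            }
            block++;
            if (block >= nblocks) {
                break;                  /* nothing left to run */
            }
            if (unmasked_now && exit_after_unmask) {
                break;                  /* forced exit back to the loop */
            }
            /* otherwise fall through: chain straight into the next block */
        }
    }
    return cpu.irqs_taken;
}
```

With three blocks, toy_run(false, 3) delivers no interrupt, because the check never runs after the unmask, while toy_run(true, 3) delivers one. Real TCG leaves the chain for other reasons as well, so take this only as a picture of the ordering problem.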