On 11/22/2017 03:33 AM, David Gibson wrote: > On Mon, Nov 20, 2017 at 11:03:45AM +0100, Cédric Le Goater wrote: >> When a CPU is stopped with the 'stop-self' RTAS call, its state >> 'halted' is switched to 1 and, in this case, the MSR is not taken into >> account anymore in the cpu_has_work() routine. Only the pending >> hardware interrupts are checked with their LPCR:PECE* enablement bit. >> >> If the DECR timer fires after 'stop-self' is called and before the CPU >> 'stop' state is reached, the nearly-dead CPU will have some work to do >> and the guest will crash. This case happens very frequently with the >> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is >> occasionally fired but after 'stop' state, so no work is to be done >> and the guest survives. >> >> I suspect there is a race between the QEMU mainloop triggering the >> timers and the TCG CPU thread but I could not quite identify the root >> cause. To be safe, let's disable in the LPCR all the exceptions which >> can cause an exit while the CPU is in power-saving mode and reenable >> them when the CPU is started. >> >> For this purpose, we introduce a little helper routine to calculate >> the PECE bits for a processor variant. We could also use the mask >> value LPCR_PECE_L_MASK for the P8 and P9 processors. bit 47 and 48 are >> reserved on P7 but it is still compatible. >> >> Signed-off-by: Cédric Le Goater <c...@kaod.org> > > I'm not thrilled about addressing this without 100% knowing what's > going on,
me either :/ I have spent hours, days, on QEMU logs trying to catch a possible race in the state machine of the CPUs and didn't find it. I need a better understanding of the internals. > but this seems like a sensible change in any case, so I'm ok > with applying something like this. > > A detail however.. > > [snip] >> #if !defined(CONFIG_USER_ONLY) >> + >> +target_ulong cpu_ppc_papr_pece_bits(CPUPPCState *env) >> +{ >> + switch (env->mmu_model) { >> + case POWERPC_MMU_3_00: >> + return LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE | LPCR_OEE; >> + default: >> + /* P7 and P8 has slightly different PECE bits, mostly because P8 >> adds >> + * bit 47 and 48 which are reserved on P7. Here we set them all, >> which >> + * will work as expected for both implementations >> + */ >> + return LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 | >> LPCR_P8_PECE3 | >> + LPCR_P8_PECE4; >> + } >> +} > > ..since we're working in this area, might as well clean up this > inappropriate use of mmu_model. Two options which I'd be ok with: > > 1) Add a pece_bits field to the PowerPCCPUClass, correctly initialized > for the various processors. > > 2) A similar helper but using ppc_check_compat() to check the arch > level, instead of using env->mmu_model. > OK. Thanks, C.