On 06/06/2016 06:17 AM, Benjamin Herrenschmidt wrote:
> On Sun, 2016-06-05 at 19:41 +0200, Cédric Le Goater wrote:
>>
>> Here is a fix I think. Could you give it a try ?
>
> This is somewhat wrong...
>
>> commit cd0c6f473532 ('ppc: Do some batching of TCG tlb flushes')
>> introduced an optimisation to flush TLBs only when a context
>> synchronizing event is reached (interrupt, rfi). This was done for
>> ppc64 but 32bit was forgotten on the way.
>
> No it didn't. That commit only delays flushes on ppc64. ppc32 is
> unaffected, unless I missed something. IE. It will delay flushes caused
> by slb instructions (which don't exist on 32-bit)
> and ppc_tlb_invalidate_one() only in the 64-bit cases.
>
> Also what your patch does in practice is not really change that, though
> you seem to try to somewhat extend the batching to 32-bit (but
> incompletely), you also introduce something which effectively reverts
> part of 9fb044911444fdd09f5f072ad0ca269d7f8b841d (split I/D mode).
>
> I think that's more what's "fixing" your problem, ie, the flush in
> IR/DR changes. However it shouldn't be needed.

OK. I thought that was needed because of what the 32b specs say in
"Synchronization Requirements for Special Registers and for Lookaside
Buffers", a "Context-synchronizing instruction" is required after a
mtmsr({I,D}R) and also because changing IR can break implicit branching.

But I might just misunderstand all of it as I am discovering.

> I suspect all of that is papering over another bug somewhere else which
> got exposed by the split I/D mode, since we no longer over-flush on
> transitions to/from real-mode. So we must be missing flushes elsewhere,
> possibly some G3 specific stuff, or there always was some kind of bug
> in the TLB flushing on 32-bit that got somewhat masked by the
> over-flushing we used to do.

From what I see, darwin loops on tlbie :

	0x000952fc:  mtctr   r0
	0x00095300:  tlbie   r6
	0x00095304:  addi    r6,r6,4096
	0x00095308:  bdnz+   0x95300
	0x0009530c:  mtcrf   128,r11
	0x00095310:  sync
	0x00095314:  eieio
	0x00095318:  bns-    0x95328

and this is done on the G4, but not necessarily on the G3, it depends
on r11 which contains some bits of SPRG2 :

	0x0009531c:  tlbsync
	0x00095320:  sync
	0x00095324:  isync

HID0 is also read and written to but to control cache bits.

> I need a repro-case.

Booting the darwin CD is enough.

Cheers,

C.

> Cheers,
> Ben.
>
>> Tested on mac99 and g3beige with
>>
>>     qemu-system-ppc -cdrom darwinppc-602.cdr -boot d
>>
>> Signed-off-by: Cédric Le Goater <c...@kaod.org>
>> ---
>>
>> I think the hunk in powerpc_excp() is needed if we don't generate a
>> context synchronizing event. what is best to do ?
>>
>>  target-ppc/cpu.h         | 2 +-
>>  target-ppc/excp_helper.c | 10 ++++++++++
>>  target-ppc/helper_regs.h | 9 ++++++++-
>>  target-ppc/translate.c   | 2 +-
>>  4 files changed, 20 insertions(+), 3 deletions(-)
>>
>> Index: qemu-dgibson-for-2.7.git/target-ppc/translate.c
>> ===================================================================
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/translate.c
>> +++ qemu-dgibson-for-2.7.git/target-ppc/translate.c
>> @@ -3290,7 +3290,7 @@ static void gen_eieio(DisasContext *ctx)
>>  {
>>  }
>>
>> -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
>> +#if !defined(CONFIG_USER_ONLY)
>>  static inline void gen_check_tlb_flush(DisasContext *ctx)
>>  {
>>      TCGv_i32 t = tcg_temp_new_i32();
>> Index: qemu-dgibson-for-2.7.git/target-ppc/cpu.h
>> ===================================================================
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/cpu.h
>> +++ qemu-dgibson-for-2.7.git/target-ppc/cpu.h
>> @@ -958,9 +958,9 @@ struct CPUPPCState {
>>      /* PowerPC 64 SLB area */
>>      ppc_slb_t slb[MAX_SLB_ENTRIES];
>>      int32_t slb_nr;
>> +#endif
>>      /* tcg TLB needs flush (deferred slb inval instruction typically) */
>>      uint32_t tlb_need_flush;
>> -#endif
>>      /* segment registers */
>>      hwaddr htab_base;
>>      /* mask used to normalize hash value to PTEG index */
>> Index: qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
>> ===================================================================
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/helper_regs.h
>> +++ qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
>> @@ -121,6 +121,13 @@ static inline int hreg_store_msr(CPUPPCS
>>      }
>>      if (((value >> MSR_IR) & 1) != msr_ir ||
>>          ((value >> MSR_DR) & 1) != msr_dr) {
>> +        /* A change of the instruction relocation bit in the MSR can
>> +         * cause an implicit branch in the address space. This
>> +         * requires a tlb flush.
>> +         */
>> +        if (env->mmu_model & POWERPC_MMU_32B) {
>> +            env->tlb_need_flush = 1;
>> +        }
>>          cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
>>      }
>>      if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
>> @@ -151,7 +158,7 @@ static inline int hreg_store_msr(CPUPPCS
>>      return excp;
>>  }
>>
>> -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
>> +#if !defined(CONFIG_USER_ONLY)
>>  static inline void check_tlb_flush(CPUPPCState *env)
>>  {
>>      CPUState *cs = CPU(ppc_env_get_cpu(env));
>> Index: qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c
>> ===================================================================
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/excp_helper.c
>> +++ qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c
>> @@ -709,6 +709,16 @@ static inline void powerpc_excp(PowerPCC
>>          }
>>      }
>>  #endif
>> +    if (((new_msr >> MSR_IR) & 1) != msr_ir ||
>> +        ((new_msr >> MSR_DR) & 1) != msr_dr) {
>> +        /* A change of the instruction relocation bit in the MSR can
>> +         * cause an implicit branch in the address space. This
>> +         * requires a tlb flush.
>> +         */
>> +        if (env->mmu_model & POWERPC_MMU_32B) {
>> +            env->tlb_need_flush = 1;
>> +        }
>> +    }
>>      /* We don't use hreg_store_msr here as already have treated
>>       * any special case that could occur. Just store MSR and update hflags
>>       *
>>
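
For anyone following the thread without the tree at hand, here is a toy
model of the batching idea being discussed: an invalidation only sets a
"flush owed" flag, and the flush itself happens at the next context
synchronizing event. This is a sketch under assumptions, not QEMU code.
Only the field name tlb_need_flush comes from the patch above; ToyCPU,
flush_count and every toy_* function are hypothetical names made up for
the illustration, and the sketch makes no claim about which paths QEMU
actually defers on 32-bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CPUPPCState, reduced to what the sketch needs. */
typedef struct {
    uint32_t tlb_need_flush; /* same name as the field touched by the patch */
    unsigned flush_count;    /* hypothetical: counts flushes actually done */
} ToyCPU;

/* Models a deferred invalidation such as tlbie: only note that a flush is owed. */
static void toy_tlbie(ToyCPU *cpu)
{
    cpu->tlb_need_flush = 1;
}

/* Models the role of check_tlb_flush(): do the pending flush, if any. */
static void toy_check_tlb_flush(ToyCPU *cpu)
{
    if (cpu->tlb_need_flush) {
        cpu->tlb_need_flush = 0;
        cpu->flush_count++; /* a real implementation would flush the TLB here */
    }
}

/* Models a context synchronizing event (isync, rfi, interrupt). */
static void toy_context_sync(ToyCPU *cpu)
{
    toy_check_tlb_flush(cpu);
}

/* Replays the shape of the Darwin loop: n tlbie in a row, then sync points.
 * Returns how many flushes were actually performed. */
unsigned toy_replay_tlbie_loop(unsigned n)
{
    ToyCPU cpu = { 0, 0 };

    for (unsigned i = 0; i < n; i++) {
        toy_tlbie(&cpu);
    }
    toy_context_sync(&cpu);
    assert(cpu.tlb_need_flush == 0); /* nothing is owed after a sync point */
    toy_context_sync(&cpu);          /* second sync: nothing pending, no flush */
    return cpu.flush_count;
}
```

In this model a run of tlbie followed by one synchronizing event costs a
single flush, which is the point of the batching. It also shows the
failure mode under discussion: if no path ever reaches the sync point,
the owed flush never happens.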