On Tue, 10 Jul 2018 16:36:34 +0530 Gautham R Shenoy <e...@linux.vnet.ibm.com> wrote:
> Hello Nicholas, > > > On Mon, Jul 09, 2018 at 12:24:36AM +1000, Nicholas Piggin wrote: > > Reimplement POWER9 idle code in C, in the powernv platform code. > > Assembly stubs are used to save and restore the stack frame and > > non-volatile GPRs before going to idle, but these are small and > > mostly agnostic to microarchitecture implementation details. > > > > Thanks for this patch. It is indeed hard to maneuver through the > current assembly code and change things without introducing new bugs. Hey thanks for the very good review. I don't mean to disparage the existing asm code too much :) We were doing our best there to get something working, now I think it's a good time to step back and see what we can improve. > > POWER7/8 code is not converted (yet), but that's not a moving > > target, and it doesn't make you want to claw your eyes out so > > much with the POWER9 code untangled from it. > > > > The optimisation where EC=ESL=0 idle modes did not have to save > > GPRs or mtmsrd L=0 is restored, because it's simple to do. > > > > Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs, > > but saves and restores them all explicitly. > > Right! The ->cpu_restore call in the current code is rendered > ineffective by "restore_additional_sprs" which is called after the > cpu_restore. Yes, I would actually like to do an initial incremental patch for this, and remove it from P7/8 CPU types as well. I think that's quite easy to do, and a useful cleanup to remove it entirely. > > diff --git a/arch/powerpc/include/asm/cpuidle.h > > b/arch/powerpc/include/asm/cpuidle.h > > index e210a83eb196..b668f030d531 100644 > > --- a/arch/powerpc/include/asm/cpuidle.h > > +++ b/arch/powerpc/include/asm/cpuidle.h > > @@ -28,6 +28,7 @@ > > * yet woken from the winkle state. > > */ > > #define PNV_CORE_IDLE_LOCK_BIT 0x10000000 > > +#define NR_PNV_CORE_IDLE_LOCK_BIT 28 > > We can define PNV_CORE_IDLE_LOCK_BIT mask based on > NR_PNV_CORE_IDLE_LOCK_BIT ? Yep good point. > > diff --git a/arch/powerpc/kernel/exceptions-64s.S > > b/arch/powerpc/kernel/exceptions-64s.S > > index 76a14702cb9c..36b5f0e18c0c 100644 > > --- a/arch/powerpc/kernel/exceptions-64s.S > > +++ b/arch/powerpc/kernel/exceptions-64s.S > > @@ -135,8 +135,14 @@ TRAMP_KVM(PACA_EXNMI, 0x100) > > > > #ifdef CONFIG_PPC_P7_NAP > > EXC_COMMON_BEGIN(system_reset_idle_common) > > +BEGIN_FTR_SECTION > > + mfspr r3,SPRN_SRR1 > > + bltlr cr3 /* no state loss, return to idle caller */ > > This is a nice optimization. But perhaps needs a carefully checked. > We wakeup at SYSTEM_RESET only when we have executed stop via the > mayloss() call which creates an additional stack frame. When we return > back to the caller, we will go back with that stack frame. Perhaps we > need to jump to a isa3_idle_wake_gpr_noloss ? I wasn't 100% sure of all this so yes it's something I do need to verify carefully. I did test a few different cases in mambo and stepped through things, but could easily have bugs. I could have got something wrong, but I think mayloss does not create an additional stack frame, so this works okay. It's just using red zone. ... I think... > > +_GLOBAL(isa3_idle_stop_mayloss) > > + std r1,PACAR1(r13) > > + mflr r4 > > + mfcr r5 > > + /* use stack red zone rather than a new frame */ > > + addi r6,r1,-INT_FRAME_SIZE > > + SAVE_GPR(2, r6) > > + SAVE_NVGPRS(r6) > > + std r4,_LINK(r6) > > + std r5,_CCR(r6) > > + mtspr SPRN_PSSCR,r3 > > + PPC_STOP > > > Suppose we enter into a isa3_idle_stop_may_loss (ESL=EC=1), but as > soon as we execute PPC_STOP, if we get interrupted, we will go back to > SYSTEM_RESET vector. > > The SRR1[46:47] should be 0x1, due to which we will do a bl there, > thereby going back to the caller of isa3_idle_stop_mayloss. Which > implies that we need to perform the stack unwinding there. Well we haven't modified r1 here, so I thought it should be okay. It seems to do the right thing in mambo. > > +static inline void atomic_unlock_thread_idle(void) > > +{ > > + int cpu = raw_smp_processor_id(); > > + int first = cpu_first_thread_sibling(cpu); > > + unsigned long *state = &paca_ptrs[first]->idle_state; > > + > > + clear_bit_unlock(NR_PNV_CORE_IDLE_LOCK_BIT, state); > > +} > > + > > +static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on) > > We are not using the mmu_on anywhere in the code. Is this for > restoring the IR|DR bit in the MSR ? Hmm, it's supposed to be for KVM secondaries that have to have the MMU switched off around idle/wake. I haven't got all the KVM stuff right yet though. Thanks, Nick