On Fri, 07 Jul 2017 16:12:16 -0500 Benjamin Herrenschmidt <b...@kernel.crashing.org> wrote:
> When writing to the process table, we need to ensure the store is
> visible to a subsequent access by the MMU. We assume we never have
> the PID active while doing the update, so a ptesync/isync pair
> should hopefully be a big enough hammer for our purpose.
>

Do we need this if it's going from invalid->valid?

> Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> ---
>
> Note: Architecturally, we also need to use a tlbie(l) with RIC=2
> to flush the process table cache. However this is (very) expensive
> and we know that POWER9 will invalidate its cache when hitting the
> mtpid instruction.
>
> To be safe, we should add the tlbie for any ARCH300 processor we
> don't know about though. (Aneesh, Nick do we need a ftr bit ?)

Good question, I'm not sure. Aside from this particular thing, it seems
like a good idea in general to add implementation specific tests into
the ftr framework. We could add the PVR into it so we don't have to
pollute FTR bits. The POWER9_DD1 bit for example could just be a PVR
mask and cmp.

>
>  arch/powerpc/mm/mmu_context_book3s64.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
> index 9404b5e..e3e2803 100644
> --- a/arch/powerpc/mm/mmu_context_book3s64.c
> +++ b/arch/powerpc/mm/mmu_context_book3s64.c
> @@ -138,6 +138,14 @@ static int radix__init_new_context(struct mm_struct *mm)
>  	rts_field = radix__get_tree_size();
>  	process_tb[index].prtb0 = cpu_to_be64(rts_field | __pa(mm->pgd) | RADIX_PGD_INDEX_SIZE);
>
> +	/*
> +	 * Order the above store with subsequent update of the PID
> +	 * register (at which point HW can start loading/caching
> +	 * the entry) and the corresponding load by the MMU from
> +	 * the L2 cache.
> +	 */
> +	asm volatile("ptesync;isync" : : : "memory");
> +
>  	mm->context.npu_context = NULL;
>
>  	return index;
>
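
To make the "PVR mask and cmp" idea above a bit more concrete, here is a
rough userspace sketch. It is not kernel code: struct pvr_match,
pvr_matches() and the POWER9 DD1 mask/value pair are assumptions made up
for illustration and would need checking against the real cputable
entries before being used for anything.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical descriptor: an implementation-specific test expressed as
 * a PVR mask plus the value expected after masking. Not an existing
 * kernel structure.
 */
struct pvr_match {
	uint32_t mask;
	uint32_t value;
};

/*
 * Assumed, illustrative encoding for POWER9 DD1: processor version and
 * major revision are compared, the minor revision byte is ignored.
 */
static const struct pvr_match power9_dd1 = {
	.mask  = 0xffffff00u,
	.value = 0x004e0100u,
};

/* Mask the PVR and compare it against the expected value. */
static bool pvr_matches(uint32_t pvr, const struct pvr_match *m)
{
	return (pvr & m->mask) == m->value;
}
```

With something along these lines, a DD1 check would become
pvr_matches(pvr, &power9_dd1) on the value read from the PVR, rather
than a dedicated feature bit.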