[RFC] Implementing HUGEPAGE on MPC 8xx
Hello,

MPC 8xx has several page sizes: 4k, 16k, 512k and 8M. Today, 4k and 16k are implemented as normal page sizes and 8M is used for mapping the linear memory space in the kernel. I'd like to implement HUGE PAGE support to reduce TLB misses from user apps.

In 4k mode, the PAGE offset is 12 bits, the PTE offset is 10 bits and the PGD offset is 10 bits.
In 16k mode, the PAGE offset is 14 bits, the PTE offset is 12 bits and the PGD offset is 6 bits.

In 4k mode, we could use 512k HUGE PAGEs and have a HPAGE offset of 19 bits, hence a HPTE offset of 3 bits and a PGD offset of 10 bits.

In 16k mode, we could use both 512k and 8M HUGE PAGEs and have:
* For 512k: a HPAGE offset of 19 bits, hence a HPTE offset of 7 bits and a PGD offset of 6 bits
* For 8M: a HPAGE offset of 23 bits, hence a HPTE offset of 3 bits and a PGD offset of 6 bits

I see in the current ppc kernel that for PPC32, SYS_SUPPORTS_HUGETLBFS is selected only if we have PHYS_64BIT. What is the reason for only implementing HUGETLBFS with 64-bit physical addresses?

From your point of view, what would be the best approach to extend support for HUGE PAGES to PPC_8xx? Would a good starting point be to implement a hugetlbpage-8xx.c modelled on hugetlbpage-book3e.c?

Christophe
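To make the bit arithmetic above concrete, here is a minimal userspace sketch of the 4k-mode / 512k-huge-page split (illustration only; the macro names are made up and are not taken from the kernel sources):

	#include <stdio.h>

	/*
	 * Decompose a 32-bit virtual address per the 4k-base / 512k-huge
	 * case described above: 19-bit HPAGE offset, 3-bit HPTE index,
	 * 10-bit PGD index. 19 + 3 + 10 covers the full 32-bit address.
	 */
	#define HPAGE_SHIFT	19	/* 512k huge page */
	#define HPTE_BITS	3
	#define PGD_BITS	10

	int main(void)
	{
		unsigned int va = 0x12345678;
		unsigned int hpage_off = va & ((1u << HPAGE_SHIFT) - 1);
		unsigned int hpte_idx  = (va >> HPAGE_SHIFT) & ((1u << HPTE_BITS) - 1);
		unsigned int pgd_idx   = va >> (HPAGE_SHIFT + HPTE_BITS);

		printf("pgd=%u hpte=%u offset=0x%x\n", pgd_idx, hpte_idx, hpage_off);
		return 0;
	}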
Re: [PATCH, RFC] cxl: Add support for CAPP DMA mode
Ian Munsie writes:
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 3a5ea82..5a42e98 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2793,7 +2793,9 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
>  	pe_info(pe, "Switching PHB to CXL\n");
>
>  	rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
> -	if (rc)
> +	if (rc == OPAL_UNSUPPORTED)
> +		dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
> +	else if (rc)
>  		dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);

You could mention the version required, which would be skiboot 5.3.x or higher.

This could be something we start doing - there are enough random bits of functionality that we could tell the user exactly what they have to upgrade to in order to have things work.

--
Stewart Smith
OPAL Architect, IBM.
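As a sketch of Stewart's suggestion (the wording and the version string are assumptions, not part of the posted patch), the error path could name the minimum firmware level directly:

	rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
	if (rc == OPAL_UNSUPPORTED)
		/* Hypothetical wording: state the minimum firmware release */
		dev_err(&dev->dev,
			"Required cxl mode not supported by firmware - update to skiboot 5.3.x or higher\n");
	else if (rc)
		dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);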
Re: [RESEND PATCH v2 0/6] vfio-pci: Add support for mmapping MSI-X table
Hi Yongji,

Le 02/06/2016 à 08:09, Yongji Xie a écrit :
> The current vfio-pci implementation disallows mmapping the page
> containing the MSI-X table, because users could write directly
> to the MSI-X table and generate incorrect MSIs.
>
> However, this will cause some performance issues when there
> are critical device registers in the same page as the
> MSI-X table. We have to handle the mmio access to these
> registers in QEMU emulation rather than in the guest.
>
> To solve this issue, this series allows exposing the MSI-X table
> to userspace when the hardware enables the capability of interrupt
> remapping, which can ensure that a given PCI device can only
> shoot the MSIs assigned to it. And we introduce a new bus_flags
> PCI_BUS_FLAGS_MSI_REMAP to test this capability on the PCI side
> for different archs.
>
> Patch 3 is based on the proposed patchset [1].

You may have noticed I sent a respin of [1] yesterday:
http://www.gossamer-threads.com/lists/linux/kernel/2455187.

Unfortunately you will see I removed the patch defining the new msi_domain_info MSI_FLAG_IRQ_REMAPPING flag you rely on in this series. I did so because I was not using it anymore. At the beginning this was used to detect whether the MSI assignment was safe, but this method was covering cases where the MSI controller was upstream of the IOMMU. So now I rely on a mechanism where MSI controllers are supposed to register their MSI doorbells and tag whether they are safe.

I don't know yet how this change will be welcomed though. Depending on reviews/discussions, it might happen that we revert to the previous flag. If you need the feature you can embed the used patches in your series and follow the review process separately.

Sorry for the setback.

Best Regards

Eric

> Changelog v2:
> - Make the commit log more clear
> - Replace pci_bus_check_msi_remapping() with pci_bus_msi_isolated()
>   so that we could clearly know what the function does
> - Set PCI_BUS_FLAGS_MSI_REMAP in pci_create_root_bus() instead
>   of iommu_bus_notifier()
> - Reserve VFIO_REGION_INFO_FLAG_CAPS when we allow to mmap MSI-X
>   table so that we can know whether we allow to mmap MSI-X table
>   in QEMU
>
> [1] https://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg1138820.html
>
> Yongji Xie (6):
>   PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
>   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller enables IRQ remapping
>   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping
>   iommu: Set PCI_BUS_FLAGS_MSI_REMAP on iommu driver initialization
>   pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
>   vfio-pci: Allow to expose MSI-X table to userspace if interrupt remapping is enabled
>
>  arch/powerpc/platforms/powernv/pci-ioda.c |  8
>  drivers/iommu/iommu.c                     |  8
>  drivers/pci/msi.c                         | 15 +++
>  drivers/pci/probe.c                       |  7 +++
>  drivers/vfio/pci/vfio_pci.c               | 17 ++---
>  drivers/vfio/pci/vfio_pci_rdwr.c          |  3 ++-
>  include/linux/msi.h                       |  5 -
>  include/linux/pci.h                       |  1 +
>  8 files changed, 59 insertions(+), 5 deletions(-)
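For readers following the thread, a rough sketch of the idea in the series (the helper name and exact check are assumptions, since the patches themselves are not quoted here): vfio would only permit mmapping the MSI-X table region when every bus on the device's path advertises MSI isolation via the new flag.

	/*
	 * Sketch only: allow mmap of the MSI-X table iff the platform can
	 * isolate the MSIs a device may generate, as signalled by
	 * PCI_BUS_FLAGS_MSI_REMAP on each bus up to the root.
	 */
	static bool vfio_msix_table_mmap_allowed(struct pci_dev *pdev)
	{
		struct pci_bus *bus;

		for (bus = pdev->bus; bus; bus = bus->parent)
			if (!(bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP))
				return false;
		return true;
	}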
Re: [PATCH v2 0/7] crypto: talitos - implementation of AEAD for SEC1
On Mon, Jun 06, 2016 at 01:20:31PM +0200, Christophe Leroy wrote:
> This set of patches provides the implementation of AEAD for
> talitos SEC1.

All applied. Thanks.

--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [RESEND PATCH v2 0/6] vfio-pci: Add support for mmapping MSI-X table
Hi, Eric

On 2016/6/8 15:41, Auger Eric wrote:
> Hi Yongji,
>
> Le 02/06/2016 à 08:09, Yongji Xie a écrit :
>> The current vfio-pci implementation disallows mmapping the page
>> containing the MSI-X table, because users could write directly
>> to the MSI-X table and generate incorrect MSIs.
>> [...]
>
> You may have noticed I sent a respin of [1] yesterday:
> http://www.gossamer-threads.com/lists/linux/kernel/2455187.
>
> Unfortunately you will see I removed the patch defining the new
> msi_domain_info MSI_FLAG_IRQ_REMAPPING flag you rely on in this
> series. [...] If you need the feature you can embed the used
> patches in your series and follow the review process separately.
>
> Sorry for the setback.

Thanks for your notification. I'd better wait until your patches get settled. Then I will know exactly which way we should use to test the capability of interrupt remapping on ARM in my series.

Thanks,
Yongji
[PATCH v6 0/3] POWER9 Load Monitor Support
This patch series adds support for the POWER9 Load Monitor instruction (ldmx) based on work from Jack Miller.

The first patch is a cleanup of the FSCR handling. The second patch adds the actual ldmx support to the kernel. The third patch is a couple of ldmx selftests.

v6:
  - PATCH 1/3:
    - Suggestions from mpe.
    - Init the FSCR using the existing INIT_THREAD macro rather than an init_fscr() function.
    - Set fscr when taking a DSCR exception in facility_unavailable_exception().
  - PATCH 2/3:
    - Remove erroneous semicolons in restore_sprs().
  - PATCH 3/3:
    - No change.

v5:
  - PATCH 1/3:
    - Made the FSCR cleanup more extensive.
  - PATCH 2/3:
    - Moved FSCR_LM clearing to the new init_fscr().
  - PATCH 3/3:
    - Added test cases to .gitignore.
    - Removed test against PPC_FEATURE2_EBB since it's not needed.
    - Added parentheses on input parameter usage for the LDMX() macro.

Jack Miller (2):
  powerpc: Load Monitor Register Support
  powerpc: Load Monitor Register Tests

Michael Neuling (1):
  powerpc: Improve FSCR init and context switching
[PATCH v6 1/3] powerpc: Improve FSCR init and context switching
This fixes a few issues with FSCR init and switching. In this patch:

    powerpc: Create context switch helpers save_sprs() and restore_sprs()
    Author: Anton Blanchard
    commit 152d523e6307c7152f9986a542f873b5c5863937

we moved the setting of the FSCR register from inside a CPU_FTR_ARCH_207S section to inside just a CPU_FTR_DSCR section. Hence we are setting FSCR on POWER6/7 where the FSCR doesn't exist. This is harmless but we shouldn't do it.

Also, we can simplify the FSCR context switch. We don't need to go through the calculation involving dscr_inherit. We can just restore what we saved last time.

Also, we currently don't explicitly init the FSCR for userspace applications. Currently we init FSCR on boot in __init_FSCR: and then the first task inherits based on that. Currently it works but is delicate. This adds the initial fscr value to INIT_THREAD to explicitly set the FSCR for userspace applications and removes the __init_FSCR: boot time init.

Based on a patch by Jack Miller.

Signed-off-by: Michael Neuling
---
 arch/powerpc/include/asm/processor.h  |  1 +
 arch/powerpc/kernel/cpu_setup_power.S | 10 --
 arch/powerpc/kernel/process.c         | 12
 arch/powerpc/kernel/traps.c           |  3 ++-
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 009fab1..1833fe9 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -347,6 +347,7 @@ struct thread_struct {
 	.fs = KERNEL_DS, \
 	.fpexc_mode = 0, \
 	.ppr = INIT_PPR, \
+	.fscr = FSCR_TAR | FSCR_EBB \
 }
 #endif
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 584e119..75f98c8 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -49,7 +49,6 @@ _GLOBAL(__restore_cpu_power7)
 _GLOBAL(__setup_cpu_power8)
 	mflr	r11
-	bl	__init_FSCR
 	bl	__init_PMU
 	bl	__init_hvmode_206
 	mtlr	r11
@@ -67,7 +66,6 @@ _GLOBAL(__setup_cpu_power8)
 _GLOBAL(__restore_cpu_power8)
 	mflr	r11
-	bl	__init_FSCR
 	bl	__init_PMU
 	mfmsr	r3
 	rldicl.	r0,r3,4,63
@@ -86,7 +84,6 @@ _GLOBAL(__restore_cpu_power8)
 _GLOBAL(__setup_cpu_power9)
 	mflr	r11
-	bl	__init_FSCR
 	bl	__init_hvmode_206
 	mtlr	r11
 	beqlr
@@ -102,7 +99,6 @@ _GLOBAL(__setup_cpu_power9)
 _GLOBAL(__restore_cpu_power9)
 	mflr	r11
-	bl	__init_FSCR
 	mfmsr	r3
 	rldicl.	r0,r3,4,63
 	mtlr	r11
@@ -155,12 +151,6 @@ __init_LPCR:
 	isync
 	blr
-__init_FSCR:
-	mfspr	r3,SPRN_FSCR
-	ori	r3,r3,FSCR_TAR|FSCR_DSCR|FSCR_EBB
-	mtspr	SPRN_FSCR,r3
-	blr
-
 __init_HFSCR:
 	mfspr	r3,SPRN_HFSCR
 	ori	r3,r3,HFSCR_TAR|HFSCR_TM|HFSCR_BHRB|HFSCR_PM|\
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index e2f12cb..74ea8db 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1023,18 +1023,11 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (cpu_has_feature(CPU_FTR_DSCR)) {
 		u64 dscr = get_paca()->dscr_default;
-		u64 fscr = old_thread->fscr & ~FSCR_DSCR;
-
-		if (new_thread->dscr_inherit) {
+		if (new_thread->dscr_inherit)
 			dscr = new_thread->dscr;
-			fscr |= FSCR_DSCR;
-		}
 		if (old_thread->dscr != dscr)
 			mtspr(SPRN_DSCR, dscr);
-
-		if (old_thread->fscr != fscr)
-			mtspr(SPRN_FSCR, fscr);
 	}
 	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
@@ -1045,6 +1038,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 		if (old_thread->ebbrr != new_thread->ebbrr)
 			mtspr(SPRN_EBBRR, new_thread->ebbrr);
+		if (old_thread->fscr != new_thread->fscr)
+			mtspr(SPRN_FSCR, new_thread->fscr);
+
 		if (old_thread->tar != new_thread->tar)
 			mtspr(SPRN_TAR, new_thread->tar);
 	}
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 9229ba6..a4b00ee 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1418,7 +1418,8 @@ void facility_unavailable_exception(struct pt_regs *regs)
 		rd = (instword >> 21) & 0x1f;
 		current->thread.dscr = regs->gpr[rd];
 		current->thread.dscr_inherit = 1;
-		mtspr(SPRN_FSCR, value | FSCR_DSCR);
+		current->thread.fscr = value | FSCR_DSCR;
+		mtspr(SPRN_FSCR, current->thread.fscr);
 	}
[PATCH v6 2/3] powerpc: Load Monitor Register Support
From: Jack Miller

This enables new registers, LMRR and LMSER, that can trigger an EBB in userspace code when a monitored load (via the new ldmx instruction) loads memory from a monitored space. This facility is controlled by a new FSCR bit, LM.

This patch disables the FSCR LM control bit on task init and enables that bit when a load monitor facility unavailable exception is taken for using it. On context switch, this bit is then used to determine whether the two relevant registers are saved and restored. This is done lazily for performance reasons.

Signed-off-by: Jack Miller
Signed-off-by: Michael Neuling
---
 arch/powerpc/include/asm/processor.h |  2 ++
 arch/powerpc/include/asm/reg.h       |  5 +
 arch/powerpc/kernel/process.c        | 18 ++
 arch/powerpc/kernel/traps.c          |  4
 4 files changed, 29 insertions(+)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 1833fe9..ac7670d 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -314,6 +314,8 @@ struct thread_struct {
 	unsigned long	mmcr2;
 	unsigned	mmcr0;
 	unsigned	used_ebb;
+	unsigned long	lmrr;
+	unsigned long	lmser;
 #endif
 };
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f4..ce44fe2 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -282,6 +282,8 @@
 #define SPRN_HRMOR	0x139	/* Real mode offset register */
 #define SPRN_HSRR0	0x13A	/* Hypervisor Save/Restore 0 */
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
+#define SPRN_LMRR	0x32D	/* Load Monitor Region Register */
+#define SPRN_LMSER	0x32E	/* Load Monitor Section Enable Register */
 #define SPRN_IC		0x350	/* Virtual Instruction Count */
 #define SPRN_VTB	0x351	/* Virtual Time Base */
 #define SPRN_LDBAR	0x352	/* LD Base Address Register */
@@ -291,6 +293,7 @@
 #define SPRN_PMCR	0x374	/* Power Management Control Register */

 /* HFSCR and FSCR bit numbers are the same */
+#define FSCR_LM_LG	11	/* Enable Load Monitor Registers */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
 #define FSCR_EBB_LG	7	/* Enable Event Based Branching */
 #define FSCR_TM_LG	5	/* Enable Transactional Memory */
@@ -300,10 +303,12 @@
 #define FSCR_VECVSX_LG	1	/* Enable VMX/VSX */
 #define FSCR_FP_LG	0	/* Enable Floating Point */
 #define SPRN_FSCR	0x099	/* Facility Status & Control Register */
+#define FSCR_LM		__MASK(FSCR_LM_LG)
 #define FSCR_TAR	__MASK(FSCR_TAR_LG)
 #define FSCR_EBB	__MASK(FSCR_EBB_LG)
 #define FSCR_DSCR	__MASK(FSCR_DSCR_LG)
 #define SPRN_HFSCR	0xbe	/* HV=1 Facility Status & Control Register */
+#define HFSCR_LM	__MASK(FSCR_LM_LG)
 #define HFSCR_TAR	__MASK(FSCR_TAR_LG)
 #define HFSCR_EBB	__MASK(FSCR_EBB_LG)
 #define HFSCR_TM	__MASK(FSCR_TM_LG)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 74ea8db..2e22f60 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1009,6 +1009,14 @@ static inline void save_sprs(struct thread_struct *t)
 		 */
 		t->tar = mfspr(SPRN_TAR);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		/* Conditionally save Load Monitor registers, if enabled */
+		if (t->fscr & FSCR_LM) {
+			t->lmrr = mfspr(SPRN_LMRR);
+			t->lmser = mfspr(SPRN_LMSER);
+		}
+	}
 #endif
 }
@@ -1044,6 +1052,16 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 		if (old_thread->tar != new_thread->tar)
 			mtspr(SPRN_TAR, new_thread->tar);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		/* Conditionally restore Load Monitor registers, if enabled */
+		if (new_thread->fscr & FSCR_LM) {
+			if (old_thread->lmrr != new_thread->lmrr)
+				mtspr(SPRN_LMRR, new_thread->lmrr);
+			if (old_thread->lmser != new_thread->lmser)
+				mtspr(SPRN_LMSER, new_thread->lmser);
+		}
+	}
 #endif
 }
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a4b00ee..aabdeac 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1376,6 +1376,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
 		[FSCR_TM_LG] = "TM",
 		[FSCR_EBB_LG] = "EBB",
 		[FSCR_TAR_LG] = "TAR",
+		[FSCR_LM_LG] = "LM",
 	};
 	char *facility = "unknown";
 	u64 value;
@@ -1433,6 +1434,9 @@ void facility_unavailable_exception(struct pt_regs *regs)
 		emulate_single_step(re
[PATCH v6 3/3] powerpc: Load Monitor Register Tests
From: Jack Miller

Adds two tests. One is a simple test to ensure that the new registers LMRR and LMSER are properly maintained. The other actually uses the existing EBB test infrastructure to test that LMRR and LMSER behave as documented.

Signed-off-by: Jack Miller
Signed-off-by: Michael Neuling
---
 tools/testing/selftests/powerpc/pmu/ebb/.gitignore |   2 +
 tools/testing/selftests/powerpc/pmu/ebb/Makefile   |   2 +-
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c  | 143 +
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h  |  39 ++
 .../selftests/powerpc/pmu/ebb/ebb_lmr_regs.c       |  37 ++
 tools/testing/selftests/powerpc/reg.h              |   5 +
 6 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr_regs.c

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
index 42bddbe..44b7df1 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
+++ b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
@@ -20,3 +20,5 @@ back_to_back_ebbs_test
 lost_exception_test
 no_handler_test
 cycles_with_mmcr2_test
+ebb_lmr
+ebb_lmr_regs
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index 8d2279c4..6b0453e 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -14,7 +14,7 @@ TEST_PROGS := reg_access_test event_attributes_test cycles_test	\
	 fork_cleanup_test ebb_on_child_test			\
	 ebb_on_willing_child_test back_to_back_ebbs_test	\
	 lost_exception_test no_handler_test			\
-	 cycles_with_mmcr2_test
+	 cycles_with_mmcr2_test ebb_lmr ebb_lmr_regs

 all: $(TEST_PROGS)
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
new file mode 100644
index 0000000..c47ebd5
--- /dev/null
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
@@ -0,0 +1,143 @@
+/*
+ * Copyright 2016, Jack Miller, IBM Corp.
+ * Licensed under GPLv2.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#include "ebb.h"
+#include "ebb_lmr.h"
+
+#define SIZE		(32 * 1024 * 1024)	/* 32M */
+#define LM_SIZE		0	/* Smallest encoding, 32M */
+
+#define SECTIONS	64	/* 1 per bit in LMSER */
+#define SECTION_SIZE	(SIZE / SECTIONS)
+#define SECTION_LONGS	(SECTION_SIZE / sizeof(long))
+
+static unsigned long *test_mem;
+
+static int lmr_count = 0;
+
+void ebb_lmr_handler(void)
+{
+	lmr_count++;
+}
+
+void ldmx_full_section(unsigned long *mem, int section)
+{
+	unsigned long *ptr;
+	int i;
+
+	for (i = 0; i < SECTION_LONGS; i++) {
+		ptr = &mem[(SECTION_LONGS * section) + i];
+		ldmx((unsigned long) &ptr);
+		ebb_lmr_reset();
+	}
+}
+
+unsigned long section_masks[] = {
+	0x8000000000000000,
+	0xFF00000000000000,
+	0x000F700000000000,
+	0x8000000000000001,
+	0xF0F0F0F0F0F0F0F0,
+	0x0F0F0F0F0F0F0F0F,
+	0x0
+};
+
+int ebb_lmr_section_test(unsigned long *mem)
+{
+	unsigned long *mask = section_masks;
+	int i;
+
+	for (; *mask; mask++) {
+		mtspr(SPRN_LMSER, *mask);
+		printf("Testing mask 0x%016lx\n", mfspr(SPRN_LMSER));
+
+		for (i = 0; i < 64; i++) {
+			lmr_count = 0;
+			ldmx_full_section(mem, i);
+			if (*mask & (1UL << (63 - i)))
+				FAIL_IF(lmr_count != SECTION_LONGS);
+			else
+				FAIL_IF(lmr_count);
+		}
+	}
+
+	return 0;
+}
+
+int ebb_lmr(void)
+{
+	int i;
+
+	SKIP_IF(!lmr_is_supported());
+
+	setup_ebb_handler(ebb_lmr_handler);
+
+	ebb_global_enable();
+
+	FAIL_IF(posix_memalign((void **)&test_mem, SIZE, SIZE) != 0);
+
+	mtspr(SPRN_LMSER, 0);
+
+	FAIL_IF(mfspr(SPRN_LMSER) != 0);
+
+	mtspr(SPRN_LMRR, ((unsigned long)test_mem | LM_SIZE));
+
+	FAIL_IF(mfspr(SPRN_LMRR) != ((unsigned long)test_mem | LM_SIZE));
+
+	/* Read every single byte to ensure we get no false positives */
+	for (i = 0; i < SECTIONS; i++)
+		ldmx_full_section(test_mem, i);
+
+	FAIL_IF(lmr_count != 0);
+
+	/* Turn on the first section */
+
+	mtspr(SPRN_LMSER, (1UL << 63));
+	FAIL_IF(mfspr(SPRN_LMSER) != (1UL << 63));
+
+	/* Enable LM (BESCR) */
+
+	mtspr(SPRN_BESCR, mfspr(SPRN_BESCR) | BESCR_LME);
+	FAIL_IF(!(mfspr(SPRN_BESCR) & BESCR_LME));
+
+	ldmx((unsigned long)&test_mem);
+
+	FAIL_IF(lmr_count != 1);
Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
> On 31/05/16 20:32, Michael Ellerman wrote:
> > On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
> > > On 31.05.2016 12:04, Michael Ellerman wrote:
> > > > On Tue, 2016-05-31 at 07:51 +0200, Thomas Huth wrote:
> > > > > If we do not provide the PVR for POWER8NVL, a guest on this
> > > > > system currently ends up in PowerISA 2.06 compatibility mode on
> > > > > KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet.
> > > > > So some new instructions from POWER8 (like "mtvsrd") get disabled
> > > > > for the guest, resulting in crashes when using code compiled
> > > > > explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC).
> > > > >
> > > > > Signed-off-by: Thomas Huth
> > > >
> > > > So this should say:
> > > >
> > > > Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")
> > > >
> > > > And therefore:
> > > >
> > > > Cc: sta...@vger.kernel.org # v4.0+
> > > >
> > > > Am I right?
> > >
> > > Right. (At least for virtualized systems ... for bare-metal systems,
> > > that original patch was enough). So shall I resubmit my patch with these
> > > two lines, or could you add them when you pick this patch up?
> >
> > Thanks, I'll add them here.
>
> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?

Yep, patch sent this morning.

cheers
Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
On 08.06.2016 03:14, Balbir Singh wrote:
> On 31/05/16 20:32, Michael Ellerman wrote:
>> On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
>>> [...]
>>
>> Thanks, I'll add them here.
>
> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?

D'oh! You're right, that needs to be changed, too! I'll send a fixup patch once I've tested it...

By the way, there seems to be already a check for

    ibm_architecture_vec[IBM_ARCH_VEC_NRCORES_OFFSET] != NR_CPUS

in prom_send_capabilities(), but it only prints out a warning which easily gets lost in the kernel log ... I wonder whether we should rather stop the boot there instead, to catch this problem more easily?

Thomas
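A rough sketch of what "stop the boot" could look like in prom_init.c (hypothetical; the actual check and message in prom_send_capabilities() may be worded differently):

	/*
	 * Sketch only: instead of merely warning when the hard-coded
	 * offset no longer points at the NR_CPUS field, refuse to boot
	 * so the mismatch cannot go unnoticed.
	 */
	if (*(u32 *)&ibm_architecture_vec[IBM_ARCH_VEC_NRCORES_OFFSET] !=
	    NR_CPUS) {
		prom_printf("WARNING: ibm_architecture_vec structure inconsistent!\n");
		prom_panic("Fix IBM_ARCH_VEC_NRCORES_OFFSET before booting");
	}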
Re: [PATCH] powerpc: Fix IBM_ARCH_VEC_NRCORES_OFFSET value
On 08.06.2016 00:51, Benjamin Herrenschmidt wrote:
> Commit 7cc851039d643a2ee7df4d18177150f2c3a484f5
> "powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call"
> introduced a regression by adding fields to the beginning of the
> ibm_architecture_vec structure without updating IBM_ARCH_VEC_NRCORES_OFFSET.
>
> This causes the kernel to print a warning at boot and to fail to adjust
> the number of cores based on the number of threads before doing the CAS
> call to firmware.
>
> This is quite a fragile piece of code sadly, we should try to find a way
> to avoid that hard coded offset at some point, but for now this fixes it.
>
> Signed-off-by: Benjamin Herrenschmidt
> ---
>
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index ccd2037..6ee4b72 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -719,7 +719,7 @@ unsigned char ibm_architecture_vec[] = {
>  	 * must match by the macro below. Update the definition if
>  	 * the structure layout changes.
>  	 */
> -#define IBM_ARCH_VEC_NRCORES_OFFSET	125
> +#define IBM_ARCH_VEC_NRCORES_OFFSET	133
>  	W(NR_CPUS),	/* number of cores supported */
>  	0,
>  	0,

Yes, that should be the right offset now! Please also add "Cc: sta...@vger.kernel.org # v4.0+" to the patch, since commit 7cc851039d64 did have that as well. And sorry for breaking this!

Reviewed-by: Thomas Huth
Re: [PATCH] of: fix autoloading due to broken modalias with no 'compatible'
On Mon, 2016-06-06 at 18:48 +0200, Wolfram Sang wrote:
> Because of an improper dereference, a stray 'C' character was output to
> the modalias when no 'compatible' was specified. This is the case for
> some old PowerMac drivers which only set the 'name' property. Fix it to
> let them match again.
>
> Reported-by: Mathieu Malaterre
> Signed-off-by: Wolfram Sang
> Tested-by: Mathieu Malaterre
> Cc: Philipp Zabel
> Cc: Andreas Schwab
> Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling")
> ---
>
> I think it makes sense if this goes in via ppc (with stable tag added).
> Agreed?

Sure, I've grabbed it. I added:

    Cc: sta...@vger.kernel.org # v3.9+

cheers
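To illustrate the class of bug Wolfram describes (this is a standalone sketch, not the actual file2alias.c code): testing a pointer that is always non-NULL, instead of the string it points to, makes the 'C' branch unconditional.

	#include <stdio.h>

	/*
	 * Sketch of the bug class: 'compatible' points into the module
	 * image, so it is never NULL even when the property is empty.
	 */
	static void build_alias(char *alias, const char *name,
				char (*compatible)[32])
	{
		int len = sprintf(alias, "of:N%s", name);

		if (compatible)			/* BUG: always true */
			sprintf(&alias[len], "C%s", *compatible);
		/* Correct test: if ((*compatible)[0]) ... */
	}

	int main(void)
	{
		char alias[64];
		char compatible[32] = "";	/* no 'compatible' given */

		build_alias(alias, "adb", &compatible);
		printf("%s\n", alias);		/* prints "of:NadbC" - stray 'C' */
		return 0;
	}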
Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
On 08.06.2016 12:44, Michael Ellerman wrote:
> On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
>> [...]
>> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?
>
> Yep, patch sent this morning.

Ok, looks like BenH already posted a patch ... anyway, what do you think about aborting the boot process here in case cores != NR_CPUS, rather than just printing out a small warning which can easily get lost in the kernel log?

Thomas
Re: [PATCH V10 00/28] Add new powerpc specific ELF core notes
On Mon, 2016-06-06 at 14:27 +0530, Anshuman Khandual wrote:
> On 06/03/2016 03:56 AM, Cyril Bur wrote:
> >
> > At the moment it is rather confusing, since pt_regs is always the 'live'
> > state and there's a ckpt_regs that is the pt_regs for the checkpointed state.
> > FPU/VMX/VSX is done differently, which is really only creating confusion, so
> > I'm changing it to do the same as for pt_regs/ckpt_regs. Ultimately this is
> > part of more work from me.
>
> But that changes the basic semantics on which this ptrace series is written.
> With this change, a significant part of the ptrace series has to be changed.

Yes, that's the whole point. In fact half of the code should vanish, because the only difference between copying the live or checkpointed state out to userspace should be which regs struct you pass to the function.

> It's just an improvement on how we store running and checkpointed values for
> FP/VSX/VMX registers inside the kernel. How does it improve the ptrace
> interface from the user's point of view? If not, then why is this change
> necessary for the acceptance of this patch series?

Because the clean-ups never happen once a series is merged, and I'm left to deal with it.

cheers
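As a sketch of the structure mpe is describing (the function and field names here are hypothetical, not taken from the series): one helper does the copy-out, and the live and checkpointed ptrace paths differ only in which register struct they pass.

	/*
	 * Sketch only: a single copy-out routine parameterised by which
	 * FP state to read, so the "live" and "checkpointed" ptrace paths
	 * share all of their code. Field names are assumptions.
	 */
	static int copy_fpr_to_user(void __user *dst,
				    const struct thread_fp_state *fp)
	{
		return copy_to_user(dst, fp->fpr, sizeof(fp->fpr)) ? -EFAULT : 0;
	}

	static int fpr_get_live(struct task_struct *t, void __user *dst)
	{
		return copy_fpr_to_user(dst, &t->thread.fp_state);	/* live */
	}

	static int fpr_get_ckpt(struct task_struct *t, void __user *dst)
	{
		return copy_fpr_to_user(dst, &t->thread.ckfp_state);	/* checkpointed */
	}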
Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()
On Mon, 2016-06-06 at 16:46 +0200, Peter Zijlstra wrote:
> On Mon, Jun 06, 2016 at 10:17:25PM +1000, Michael Ellerman wrote:
> > On Mon, 2016-06-06 at 13:56 +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 06, 2016 at 09:42:20PM +1000, Michael Ellerman wrote:
> > >
> > > Why the move to in-line this implementation? It looks like a fairly big
> > > function.
> >
> > I agree it's not pretty.
> > I'm not beholden to v3 though if you hate it.
>
> I don't mind; its just that I am in a similar boat with qspinlock and
> chose the other option. So I just figured I'd ask :-)

OK. I'll go with inline and we'll see which version gets "cleaned-up" by a janitor first ;)

cheers
Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
On Wed, 2016-06-08 at 13:17 +0200, Thomas Huth wrote:
> On 08.06.2016 12:44, Michael Ellerman wrote:
> > On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
> > > Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?
> >
> > Yep, patch sent this morning.
>
> Ok, looks like BenH already posted a patch ...

And me before him :)

To be clear I'm not blaming you in any way for this, the existing code is terrible and incredibly fragile.

> anyway, what do you think about aborting the boot process here in case cores
> != NR_CPUS, rather than just printing out a small warning which can easily
> get lost in the kernel log?

Yeah I agree it's easy to miss. And it's not part of dmesg (because it's from prom_init()), so you *only* see it if you're actually staring at the console as it boots (which is why my boot tests missed it).

I actually have plans to rewrite the whole thing to make it robust, so that should avoid it ever being a problem again.

cheers
Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
On Wed, 2016-06-08 at 12:58 +0200, Christian Zigotzky wrote:
> On 08 June 2016 at 04:52 AM, Michael Ellerman wrote:
> > On Tue, 2016-06-07 at 22:17 +0200, Christian Zigotzky wrote:
> > > 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> > > commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> > > Author: Aneesh Kumar K.V
> > > Date: Fri Apr 29 23:26:09 2016 +1000
> > >
> > >     powerpc/mm/radix: Add checks in slice code to catch radix usage
> > >
> > >     Radix doesn't need slice support. Catch incorrect usage of slice code
> > >     when radix is enabled.
> > >
> > >     Signed-off-by: Aneesh Kumar K.V
> > >     Signed-off-by: Michael Ellerman
> >
> > Hmm, I find that hard to believe. But maybe I'm missing something.
> >
> > Can you checkout Linus' master and then revert that commit?
>
> $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
> $ git checkout
> Your branch is up-to-date with 'origin/master'.
>
> $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764 -m 1
> error: Mainline was specified but commit
> 764041e0f43cc7846f6d8eb246d65b53cc06c764 is not a merge.
> fatal: revert failed
>
> How can I checkout Linus' master and then revert that commit?

It's not a merge, so just use plain git revert:

  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
  $ cd linux-git
  $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764
  [master 5dd9737a173e] Revert "powerpc/mm/radix: Add checks in slice code to catch radix usage"
   1 file changed, 16 deletions(-)

cheers
Re: [PATCH 8/8] dmaengine: Remove site specific OOM error messages on kzalloc
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin wrote:
> If kzalloc() fails it will issue its own error message including
> a dump_stack(). So remove the site specific error messages.
>
> Signed-off-by: Peter Griffin

Acked-by: Linus Walleij

A few subsystems may use a cleanup like this... I wonder how many unnecessary prints I've introduced myself :P

Yours,
Linus Walleij
Re: [PATCH 3/8] dmaengine: coh901318: Only calculate residue if txstate exists.
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin wrote:
> There is no point in calculating the residue if there is no
> txstate to store the value.
>
> Signed-off-by: Peter Griffin

Acked-by: Linus Walleij

Yours,
Linus Walleij
Re: [PATCH 8/8] dmaengine: Remove site specific OOM error messages on kzalloc
On 07/06/16 18:38, Peter Griffin wrote:
> If kzalloc() fails it will issue its own error message including
> a dump_stack(). So remove the site specific error messages.
>
> Signed-off-by: Peter Griffin
> ---
>  drivers/dma/amba-pl08x.c        | 10 +-
>  drivers/dma/bestcomm/bestcomm.c |  2 --
>  drivers/dma/edma.c              | 16
>  drivers/dma/fsldma.c            |  2 --
>  drivers/dma/k3dma.c             | 10 --
>  drivers/dma/mmp_tdma.c          |  5 ++---
>  drivers/dma/moxart-dma.c        |  4 +---
>  drivers/dma/nbpfaxi.c           |  5 ++---
>  drivers/dma/pl330.c             |  5 +
>  drivers/dma/ppc4xx/adma.c       |  2 --
>  drivers/dma/s3c24xx-dma.c       |  5 +
>  drivers/dma/sh/shdmac.c         |  9 ++---
>  drivers/dma/sh/sudmac.c         |  9 ++---
>  drivers/dma/sirf-dma.c          |  5 ++---
>  drivers/dma/ste_dma40.c         |  4 +---
>  drivers/dma/tegra20-apb-dma.c   | 11 +++
>  drivers/dma/timb_dma.c          |  8 ++--
>  17 files changed, 28 insertions(+), 84 deletions(-)

[snip]

> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
> index 7f4af8c..032884f 100644
> --- a/drivers/dma/tegra20-apb-dma.c
> +++ b/drivers/dma/tegra20-apb-dma.c
> @@ -300,10 +300,8 @@ static struct tegra_dma_desc *tegra_dma_desc_get(
>
>  	/* Allocate DMA desc */
>  	dma_desc = kzalloc(sizeof(*dma_desc), GFP_NOWAIT);
> -	if (!dma_desc) {
> -		dev_err(tdc2dev(tdc), "dma_desc alloc failed\n");
> +	if (!dma_desc)
>  		return NULL;
> -	}
>
>  	dma_async_tx_descriptor_init(&dma_desc->txd, &tdc->dma_chan);
>  	dma_desc->txd.tx_submit = tegra_dma_tx_submit;
> @@ -340,8 +338,7 @@ static struct tegra_dma_sg_req *tegra_dma_sg_req_get(
>  	spin_unlock_irqrestore(&tdc->lock, flags);
>
>  	sg_req = kzalloc(sizeof(struct tegra_dma_sg_req), GFP_NOWAIT);
> -	if (!sg_req)
> -		dev_err(tdc2dev(tdc), "sg_req alloc failed\n");
> +
>  	return sg_req;
>  }
>
> @@ -1319,10 +1316,8 @@ static int tegra_dma_probe(struct platform_device *pdev)
>
>  	tdma = devm_kzalloc(&pdev->dev, sizeof(*tdma) + cdata->nr_channels *
>  			sizeof(struct tegra_dma_channel), GFP_KERNEL);
> -	if (!tdma) {
> -		dev_err(&pdev->dev, "Error: memory allocation failed\n");
> +	if (!tdma)
>  		return -ENOMEM;
> -	}
>
>  	tdma->dev = &pdev->dev;
>  	tdma->chip_data = cdata;

For the tegra portion ...

Acked-by: Jon Hunter

Cheers
Jon

--
nvpublic
Re: [PATCH 7/8] dmaengine: tegra20-apb-dma: Only calculate residue if txstate exists.
Hi Peter,

On 07/06/16 18:38, Peter Griffin wrote:
> There is no point calculating the residue if there is
> no txstate to store the value.
>
> Signed-off-by: Peter Griffin
> ---
>  drivers/dma/tegra20-apb-dma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
> index 01e316f..7f4af8c 100644
> --- a/drivers/dma/tegra20-apb-dma.c
> +++ b/drivers/dma/tegra20-apb-dma.c
> @@ -814,7 +814,7 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
>  	unsigned int residual;
>
>  	ret = dma_cookie_status(dc, cookie, txstate);
> -	if (ret == DMA_COMPLETE)
> +	if (ret == DMA_COMPLETE || !txstate)
>  		return ret;

Thanks for reporting this. I agree that we should not do this. However, looking at the code for Tegra, I am wondering if this could change the actual state that is returned. Looking at dma_cookie_status(), it will call dma_async_is_complete(), which will return either DMA_COMPLETE or DMA_IN_PROGRESS. It could be possible that the actual state for the DMA transfer in the tegra driver is DMA_ERROR, so I am wondering if we should do something like the following ...

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 01e316f73559..45edab7418d0 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -822,13 +822,8 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
 	/* Check on wait_ack desc status */
 	list_for_each_entry(dma_desc, &tdc->free_dma_desc, node) {
 		if (dma_desc->txd.cookie == cookie) {
-			residual = dma_desc->bytes_requested -
-				   (dma_desc->bytes_transferred %
-				    dma_desc->bytes_requested);
-			dma_set_residue(txstate, residual);
 			ret = dma_desc->dma_status;
-			spin_unlock_irqrestore(&tdc->lock, flags);
-			return ret;
+			goto found;
 		}
 	}
@@ -836,17 +831,23 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
 	list_for_each_entry(sg_req, &tdc->pending_sg_req, node) {
 		dma_desc = sg_req->dma_desc;
 		if (dma_desc->txd.cookie == cookie) {
-			residual = dma_desc->bytes_requested -
-				   (dma_desc->bytes_transferred %
-				    dma_desc->bytes_requested);
-			dma_set_residue(txstate, residual);
 			ret = dma_desc->dma_status;
-			spin_unlock_irqrestore(&tdc->lock, flags);
-			return ret;
+			goto found;
 		}
 	}
-	dev_dbg(tdc2dev(tdc), "cookie %d does not found\n", cookie);
+	dev_warn(tdc2dev(tdc), "cookie %d not found\n", cookie);
+	spin_unlock_irqrestore(&tdc->lock, flags);
+	return ret;
+
+found:
+	if (txstate) {
+		residual = dma_desc->bytes_requested -
+			   (dma_desc->bytes_transferred %
+			    dma_desc->bytes_requested);
+		dma_set_residue(txstate, residual);
+	}
+
 	spin_unlock_irqrestore(&tdc->lock, flags);
 	return ret;
 }

Cheers
Jon

--
nvpublic
[PATCH] drivers/net/fsl_ucc: Do not prefix header guard with CONFIG_
The CONFIG_ prefix should only be used for options which can be configured through Kconfig, not for guarding headers.

Signed-off-by: Andreas Ziegler
---
 drivers/net/wan/fsl_ucc_hdlc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.h b/drivers/net/wan/fsl_ucc_hdlc.h
index 525786a..881ecde 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.h
+++ b/drivers/net/wan/fsl_ucc_hdlc.h
@@ -8,8 +8,8 @@
  * option) any later version.
  */

-#ifndef CONFIG_UCC_HDLC_H
-#define CONFIG_UCC_HDLC_H
+#ifndef _UCC_HDLC_H_
+#define _UCC_HDLC_H_

 #include
 #include
--
1.9.1
[PATCH] powerpc/nvram: remove unused pstore headers
Since the pstore code has moved away from nvram.c, remove the unused pstore headers pstore.h and kmsg_dump.h.

Signed-off-by: Geliang Tang
---
 arch/powerpc/platforms/pseries/nvram.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/nvram.c b/arch/powerpc/platforms/pseries/nvram.c
index 9f818417..79aef8c 100644
--- a/arch/powerpc/platforms/pseries/nvram.c
+++ b/arch/powerpc/platforms/pseries/nvram.c
@@ -17,8 +17,6 @@
 #include
 #include
 #include
-#include
-#include
 #include
 #include
 #include
--
1.9.1
Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hi Michael,

On 08 June 2016 at 04:52 AM, Michael Ellerman wrote:
> On Tue, 2016-06-07 at 22:17 +0200, Christian Zigotzky wrote:
> > 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> > [...]
>
> Hmm, I find that hard to believe. But maybe I'm missing something.
>
> Can you checkout Linus' master and then revert that commit?
>
> cheers

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
$ git checkout
Your branch is up-to-date with 'origin/master'.

$ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764 -m 1
error: Mainline was specified but commit
764041e0f43cc7846f6d8eb246d65b53cc06c764 is not a merge.
fatal: revert failed

How can I checkout Linus' master and then revert that commit?

Cheers,
Christian
Re: [PATCH 5/8] dmaengine: ste_dma40: Only calculate residue if txstate exists.
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin wrote:
> There is no point calculating the residue if there is
> no txstate to store the value.
>
> Signed-off-by: Peter Griffin

Acked-by: Linus Walleij

Yours,
Linus Walleij
Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()
On Wed, Jun 08, 2016 at 09:20:45PM +1000, Michael Ellerman wrote:
> On Mon, 2016-06-06 at 16:46 +0200, Peter Zijlstra wrote:
> > On Mon, Jun 06, 2016 at 10:17:25PM +1000, Michael Ellerman wrote:
> > > [...]
> > > I'm not beholden to v3 though if you hate it.
> >
> > I don't mind; its just that I am in a similar boat with qspinlock and
> > chose the other option. So I just figured I'd ask :-)
>
> OK. I'll go with inline and we'll see which version gets "cleaned-up" by a
> janitor first ;)

Ok; what tree does this go in? I have this dependent series which I'd like to get sorted and merged somewhere.
Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hi Michael,

Thanks a lot for the hint. I compiled it without the commit below but unfortunately it doesn't boot.

Cheers,
Christian

On 08 June 2016 at 1:30 PM, Michael Ellerman wrote:
> It's not a merge, so just use plain git revert:
>
>   $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
>   $ cd linux-git
>   $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764
>   [master 5dd9737a173e] Revert "powerpc/mm/radix: Add checks in slice code to catch radix usage"
>    1 file changed, 16 deletions(-)
>
> cheers
[PATCH] powerpc/nohash: Fix build break with 4K pages
Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are allocating fragments" renamed page_table_free() to pte_fragment_free(). One occurrence was mistyped as pte_fragment_fre().

This only breaks the nohash 4K page build, which is not the default or enabled in any defconfig.

Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are allocating fragments")
Signed-off-by: Michael Ellerman
---
 arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index 0c12a3bfe2ab..069369f6414b 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -172,7 +172,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
-	pte_fragment_fre((unsigned long *)pte, 1);
+	pte_fragment_free((unsigned long *)pte, 1);
 }

 static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
--
2.5.0
Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hi Darren,

Many thanks for your help. I started my bisect with the following commits:

git bisect start
git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e
git bisect bad 1a695a905c18548062509178b98bc91e67510864 (Linux 4.7-rc1)

Did you start your bisect with the same bad and good commits? I will revert your bad commit and compile a new test kernel.

Thanks,
Christian

On 08 June 2016 at 1:33 PM, Darren Stevens wrote:
> Hello Christian
>
> That's not where I ended up with my bisect; this commit is about 10 before
> the one I found to be bad, which is:
>
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V
> Date: Fri Apr 29 23:26:21 2016 +1000
>
>     powerpc/mm: vmalloc abstraction in preparation for radix
> [...]
>
> Not sure how we are getting different results though. I have attached my
> bisect log and the suspect commit, which is quite large. I'm not sure which
> part of it is at fault. I have some jobs to do now, but hope to get testing
> this later today.
>
> Regards
> Darren
Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hello Christian

On 07/06/2016, Christian Zigotzky wrote:
> "range.size, pgprot_val(pgprot_noncached(__pgprot(0;" isn't the
> problem. :-) It works.
>
> 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> Author: Aneesh Kumar K.V
> Date: Fri Apr 29 23:26:09 2016 +1000
>
>     powerpc/mm/radix: Add checks in slice code to catch radix usage
> [...]

That's not where I ended up with my bisect; this commit is about 10 before the one I found to be bad, which is:

commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V
Date: Fri Apr 29 23:26:21 2016 +1000

    powerpc/mm: vmalloc abstraction in preparation for radix

    The vmalloc range differs between hash and radix config. Hence make
    VMALLOC_START and related constants a variable which will be runtime
    initialized depending on whether hash or radix mode is active.

    Signed-off-by: Aneesh Kumar K.V
    [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
    Signed-off-by: Michael Ellerman

Not sure how we are getting different results though. I have attached my bisect log and the suspect commit, which is quite large. I'm not sure which part of it is at fault. I have some jobs to do now, but hope to get testing this later today.

Regards
Darren

git bisect start
# bad: [ed2608faa0f701b1dbc65277a9e5c7ff7118bfd4] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect bad ed2608faa0f701b1dbc65277a9e5c7ff7118bfd4
# good: [8ffb4103f5e28d7e7890ed4774d8e009f253f56e] IB/qib: Use cache inhibitted and guarded mapping on powerpc
git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e
# good: [801faf0db8947e01877920e848a4d338dd7a99e7] mm/slab: lockless decision to grow cache
git bisect good 801faf0db8947e01877920e848a4d338dd7a99e7
# bad: [2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac] Merge tag 'staging-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac
# bad: [be1332c0994fbf016fa4ef0f0c4acda566fe6cb3] Merge tag 'gfs2-4.7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
git bisect bad be1332c0994fbf016fa4ef0f0c4acda566fe6cb3
# good: [f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752] Merge tag 'sound-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752
# good: [a1c28b75a95808161cacbb3531c418abe248994e] Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
git bisect good a1c28b75a95808161cacbb3531c418abe248994e
# bad: [6eb59af580dcffc6f6982ac8ef6d27a1a5f26b27] Merge tag 'mfd-for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
git bisect bad 6eb59af580dcffc6f6982ac8ef6d27a1a5f26b27
# bad: [4fad494321351f0ac412945c6a464109ad96734a] powerpc/powernv: Simplify pnv_eeh_reset()
git bisect bad 4fad494321351f0ac412945c6a464109ad96734a
# bad: [43a5c684270ee9b5b13c91ec048831dd5b7e0cdc] powerpc/mm/radix: Make sure swapper pgdir is properly aligned
git bisect bad 43a5c684270ee9b5b13c91ec048831dd5b7e0cdc
# good: [a9252aaefe7e72133e7a37e0eff4e950a4f33af1] powerpc/mm: Move hugetlb and THP related pmd accessors to pgtable.h
git bisect good a9252aaefe7e72133e7a37e0eff4e950a4f33af1
# good: [177ba7c647f37bc3f31667192059ee794347d79d] powerpc/mm/radix: Limit paca allocation in radix
git bisect good 177ba7c647f37bc3f31667192059ee794347d79d
# good: [934828edfadc43be07e53429ce501741bedf4a5e] powerpc/mm: Make 4K and 64K use pte_t for pgtable_t
git bisect good 934828edfadc43be07e53429ce501741bedf4a5e
# bad: [a3dece6d69b0ad21b64104dff508c67a1a1f14dd] powerpc/radix: Update MMU cache
git bisect bad a3dece6d69b0ad21b64104dff508c67a1a1f14dd
# good: [4dfb88ca9b66690d21030ccacc1cca73db90655e] powerpc/mm: Update pte filter for radix
git bisect good 4dfb88ca9b66690d21030ccacc1cca73db90655e
# bad: [d6a9996e84ac4beb7713e9485f4563e100a9b03e] powerpc/mm: vmalloc abstraction in preparation for radix
git bisect bad d6a9996e84ac4beb7713e9485f4563e100a9b03e

commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V
Date: Fri Apr 29 23:26:21 2016 +1000

    powerpc/mm: vmalloc abstraction in preparation for radix

    The vmalloc range differs between hash and radix config. Hence make
    VMALLOC_START and related constants a variable which will be runtime
    initialized depending on whether hash or radix mode is active.

    Signed-off-by: Aneesh Kumar K.V
    [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
    Signed-off-by: Michael Ellerman

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index cd3e915..f61cad3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash
Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()
On Wed, 2016-06-08 at 14:35 +0200, Peter Zijlstra wrote:
> On Wed, Jun 08, 2016 at 09:20:45PM +1000, Michael Ellerman wrote:
> > [...]
> > OK. I'll go with inline and we'll see which version gets "cleaned-up" by a
> > janitor first ;)
>
> Ok; what tree does this go in? I have this dependent series which I'd
> like to get sorted and merged somewhere.

Ah sorry, I didn't realise. I was going to put it in my next (which doesn't exist yet but hopefully will early next week).

I'll make a topic branch with just that commit based on rc2 or rc3?

cheers
Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hi All,

I tried to revert this commit but unfortunately it doesn't work:

git revert d6a9996e84ac4beb7713e9485f4563e100a9b03e
error: could not revert d6a9996... powerpc/mm: vmalloc abstraction in preparation for radix
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

Any hints?

Thanks,
Christian

On 08 June 2016 at 1:33 PM, Darren Stevens wrote:
> Hello Christian
>
> That's not where I ended up with my bisect; this commit is about 10 before
> the one I found to be bad, which is:
>
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V
> Date: Fri Apr 29 23:26:21 2016 +1000
>
>     powerpc/mm: vmalloc abstraction in preparation for radix
> [...]
>
> Not sure how we are getting different results though. I have attached my
> bisect log and the suspect commit, which is quite large. I'm not sure which
> part of it is at fault. I have some jobs to do now, but hope to get testing
> this later today.
>
> Regards
> Darren
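For completeness, the workflow git's hint describes looks roughly like this (a sketch; the conflicted file list will differ per tree):

  $ git revert d6a9996e84ac4beb7713e9485f4563e100a9b03e
  # git stops on conflicts; edit each conflicted file to resolve, then:
  $ git status                 # lists the "unmerged paths"
  $ git add <each resolved file>
  $ git commit                 # records the revert commit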
Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
On Wed, 2016-06-08 at 12:33 +0100, Darren Stevens wrote:
> On 07/06/2016, Christian Zigotzky wrote:
> >
> > 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> > commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> > Author: Aneesh Kumar K.V
> > Date: Fri Apr 29 23:26:09 2016 +1000
> >
> >     powerpc/mm/radix: Add checks in slice code to catch radix usage
>
> That's not where I ended up with my bisect; this commit is about 10 before
> the one I found to be bad, which is:
>
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V
> Date: Fri Apr 29 23:26:21 2016 +1000
>
>     powerpc/mm: vmalloc abstraction in preparation for radix
>
>     The vmalloc range differs between hash and radix config. Hence make
>     VMALLOC_START and related constants a variable which will be runtime
>     initialized depending on whether hash or radix mode is active.
>
>     Signed-off-by: Aneesh Kumar K.V
>     [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
>     Signed-off-by: Michael Ellerman
>
> Not sure how we are getting different results though. I have attached my
> bisect log and the suspect commit, which is quite large. I'm not sure which
> part of it is at fault. I have some jobs to do now, but hope to get testing
> this later today.

That one is more likely to be the problem, though I don't see anything glaringly wrong with it.

Does your patch use any of the constants that are changed in that file? They now aren't constants, they're initialised at boot, so if you use them too early you'll get junk.

cheers
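To illustrate mpe's point (a standalone sketch, not the kernel code; the names and the example value are made up): once a former compile-time constant like VMALLOC_START becomes a variable assigned during early MMU setup, any code reading it before that assignment sees whatever the unset value is.

	#include <stdio.h>

	/* Sketch: a former compile-time constant turned runtime variable. */
	unsigned long vmalloc_start;	/* zero/junk until early init runs */

	static void early_mmu_init(void)
	{
		vmalloc_start = 0xd000000000000000UL;	/* chosen per MMU mode */
	}

	int main(void)
	{
		printf("before init: 0x%lx\n", vmalloc_start);	/* junk */
		early_mmu_init();
		printf("after init:  0x%lx\n", vmalloc_start);	/* valid */
		return 0;
	}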
Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()
On Wed, Jun 08, 2016 at 11:49:20PM +1000, Michael Ellerman wrote: > > Ok; what tree does this go in? I have this dependent series which I'd > > like to get sorted and merged somewhere. > > Ah sorry, I didn't realise. I was going to put it in my next (which doesn't > exist yet but hopefully will early next week). > > I'll make a topic branch with just that commit based on rc2 or rc3? Works for me; thanks! ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 01/10] Fix .long's in mm/tlb-radix.c to use more meaningful names
From: Balbir Singh The .longs with the shifts are harder to read, use more meaningful names for the opcodes. PPC_TLBIE_5 is introduced for the 5-operand variant of the instruction, since an op-code already exists for the 2-operand variant. Signed-off-by: Balbir Singh Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/ppc-opcode.h | 14 ++ arch/powerpc/mm/tlb-radix.c | 13 + 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index 1d035c1cc889..c0e9ea44fee3 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -184,6 +184,7 @@ #define PPC_INST_STSWX 0x7c00052a #define PPC_INST_STXVD2X 0x7c000798 #define PPC_INST_TLBIE 0x7c000264 +#define PPC_INST_TLBIEL 0x7c000224 #define PPC_INST_TLBILX 0x7c000024 #define PPC_INST_WAIT 0x7c00007c #define PPC_INST_TLBIVAX 0x7c000624 @@ -257,6 +258,9 @@ #define ___PPC_RB(b) (((b) & 0x1f) << 11) #define ___PPC_RS(s) (((s) & 0x1f) << 21) #define ___PPC_RT(t) ___PPC_RS(t) +#define ___PPC_R(r) (((r) & 0x1) << 16) +#define ___PPC_PRS(prs) (((prs) & 0x1) << 17) +#define ___PPC_RIC(ric) (((ric) & 0x3) << 18) #define __PPC_RA(a) ___PPC_RA(__REG_##a) #define __PPC_RA0(a) ___PPC_RA(__REGA0_##a) #define __PPC_RB(b) ___PPC_RB(__REG_##b) @@ -321,6 +325,16 @@ __PPC_WC(w)) #define PPC_TLBIE(lp,a) stringify_in_c(.long PPC_INST_TLBIE | \ ___PPC_RB(a) | ___PPC_RS(lp)) +#define PPC_TLBIE_5(rb,rs,ric,prs,r) \ + stringify_in_c(.long PPC_INST_TLBIE | \ + ___PPC_RB(rb) | ___PPC_RS(rs) | \ + ___PPC_RIC(ric) | ___PPC_PRS(prs) | \ + ___PPC_R(r)) +#define PPC_TLBIEL(rb,rs,ric,prs,r) \ + stringify_in_c(.long PPC_INST_TLBIEL | \ + ___PPC_RB(rb) | ___PPC_RS(rs) | \ + ___PPC_RIC(ric) | ___PPC_PRS(prs) | \ + ___PPC_R(r)) #define PPC_TLBSRX_DOT(a,b) stringify_in_c(.long PPC_INST_TLBSRX_DOT | \ __PPC_RA0(a) | __PPC_RB(b)) #define PPC_TLBIVAX(a,b) stringify_in_c(.long PPC_INST_TLBIVAX | \ diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 0fdaf93a3e09..e6b7487ad28f 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -12,6 +12,7 @@ #include #include #include +#include <asm/ppc-opcode.h> #include #include @@ -30,8 +31,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set) ric = 2; /* invalidate all the caches */ asm volatile("ptesync": : :"memory"); - asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |" -"(%2 << 17) | (%3 << 18) | (%4 << 21)" + asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); asm volatile("ptesync": : :"memory"); } @@ -60,8 +60,7 @@ static inline void _tlbie_pid(unsigned long pid) ric = 2; /* invalidate all the caches */ asm volatile("ptesync": : :"memory"); - asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |" -"(%2 << 17) | (%3 << 18) | (%4 << 21)" + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); asm volatile("eieio; tlbsync; ptesync": : :"memory"); } @@ -79,8 +78,7 @@ static inline void _tlbiel_va(unsigned long va, unsigned long pid, ric = 0; /* no cluster flush yet */ asm volatile("ptesync": : :"memory"); - asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |" -"(%2 << 17) | (%3 << 18) | (%4 << 21)" + asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); asm volatile("ptesync": : :"memory"); } @@ -98,8 +96,7 @@ static inline void _tlbie_va(unsigned long va, unsigned long pid, ric = 0; /* no cluster flush yet */ asm
volatile("ptesync": : :"memory"); - asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |" -"(%2 << 17) | (%3 << 18) | (%4 << 21)" + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); asm volatile("eieio; tlbsync; ptesync": : :"memory"); } -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/
[PATCH V2 00/10] Fixes for Radix support
Hi Michael, This series includes patches I had posted before. I collected them in a series and marked the series V2. This addresses the review feedback I received on the last post. Aneesh Kumar K.V (9): powerpc/mm/radix: Update to tlb functions ric argument powerpc/mm/radix: Flush page walk cache when freeing page table powerpc/mm/radix: Update LPCR HR bit as per ISA powerpc/mm: use _raw variant of page table accessors powerpc/mm: Compile out radix related functions if RADIX_MMU is disabled powerpc/hash: Use the correct ppp mask when updating hpte powerpc/mm: Clear top 16 bits of va only on older cpus powerpc/mm: Print information regarding the MMU mode powerpc/mm/hash: Update SDR1 size encoding as documented in ISA 3.0 Balbir Singh (1): Fix .long's in mm/tlb-radix.c to use more meaningful names arch/powerpc/include/asm/book3s/32/pgalloc.h | 1 - arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 + arch/powerpc/include/asm/book3s/64/mmu.h | 5 ++ arch/powerpc/include/asm/book3s/64/pgalloc.h | 16 +++- arch/powerpc/include/asm/book3s/64/pgtable-4k.h | 6 +- arch/powerpc/include/asm/book3s/64/pgtable-64k.h | 6 +- arch/powerpc/include/asm/book3s/64/pgtable.h | 99 +++--- .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 3 + arch/powerpc/include/asm/book3s/64/tlbflush.h | 15 +++- arch/powerpc/include/asm/book3s/pgalloc.h | 5 -- arch/powerpc/include/asm/mmu.h | 12 ++- arch/powerpc/include/asm/pgtable-be-types.h | 15 arch/powerpc/include/asm/ppc-opcode.h | 14 +++ arch/powerpc/include/asm/reg.h | 1 + arch/powerpc/mm/hash_native_64.c | 14 +-- arch/powerpc/mm/hash_utils_64.c | 12 +-- arch/powerpc/mm/pgtable-radix.c | 7 +- arch/powerpc/mm/tlb-radix.c | 94 +--- 18 files changed, 236 insertions(+), 90 deletions(-) -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 02/10] powerpc/mm/radix: Update to tlb functions ric argument
Radix invalidate control (RIC) is used to control which cache to flush using tlb instructions. When doing a PID flush, we currently flush everything including the page walk cache. For an address range flush, we flush only the TLB. In a later patch, we add support for flushing only the page walk cache. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/tlb-radix.c | 43 ++- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index e6b7487ad28f..b33b7c77cfa3 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -19,16 +19,20 @@ static DEFINE_RAW_SPINLOCK(native_tlbie_lock); -static inline void __tlbiel_pid(unsigned long pid, int set) +#define RIC_FLUSH_TLB 0 +#define RIC_FLUSH_PWC 1 +#define RIC_FLUSH_ALL 2 + +static inline void __tlbiel_pid(unsigned long pid, int set, + unsigned long ric) { - unsigned long rb,rs,ric,prs,r; + unsigned long rb,rs,prs,r; rb = PPC_BIT(53); /* IS = 1 */ rb |= set << PPC_BITLSHIFT(51); rs = ((unsigned long)pid) << PPC_BITLSHIFT(31); prs = 1; /* process scoped */ r = 1; /* radix format */ - ric = 2; /* invalidate all the caches */ asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) @@ -39,25 +43,24 @@ static inline void __tlbiel_pid(unsigned long pid, int set) /* * We use 128 set in radix mode and 256 set in hpt mode. */ -static inline void _tlbiel_pid(unsigned long pid) +static inline void _tlbiel_pid(unsigned long pid, unsigned long ric) { int set; for (set = 0; set < POWER9_TLB_SETS_RADIX ; set++) { - __tlbiel_pid(pid, set); + __tlbiel_pid(pid, set, ric); } return; } -static inline void _tlbie_pid(unsigned long pid) +static inline void _tlbie_pid(unsigned long pid, unsigned long ric) { - unsigned long rb,rs,ric,prs,r; + unsigned long rb,rs,prs,r; rb = PPC_BIT(53); /* IS = 1 */ rs = pid << PPC_BITLSHIFT(31); prs = 1; /* process scoped */ r = 1; /* radix format */ - ric = 2; /* invalidate all the caches */ asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) @@ -66,16 +69,15 @@ static inline void _tlbie_pid(unsigned long pid) } static inline void _tlbiel_va(unsigned long va, unsigned long pid, - unsigned long ap) + unsigned long ap, unsigned long ric) { - unsigned long rb,rs,ric,prs,r; + unsigned long rb,rs,prs,r; rb = va & ~(PPC_BITMASK(52, 63)); rb |= ap << PPC_BITLSHIFT(58); rs = pid << PPC_BITLSHIFT(31); prs = 1; /* process scoped */ r = 1; /* radix format */ - ric = 0; /* no cluster flush yet */ asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) @@ -84,16 +86,15 @@ static inline void _tlbiel_va(unsigned long va, unsigned long pid, static inline void _tlbie_va(unsigned long va, unsigned long pid, -unsigned long ap) +unsigned long ap, unsigned long ric) { - unsigned long rb,rs,ric,prs,r; + unsigned long rb,rs,prs,r; rb = va & ~(PPC_BITMASK(52, 63)); rb |= ap << PPC_BITLSHIFT(58); rs = pid << PPC_BITLSHIFT(31); prs = 1; /* process scoped */ r = 1; /* radix format */ - ric = 0; /* no cluster flush yet */ asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) @@ -119,7 +120,7 @@ void radix__local_flush_tlb_mm(struct mm_struct *mm) preempt_disable(); pid = mm->context.id; if (pid != MMU_NO_CONTEXT) - _tlbiel_pid(pid); + _tlbiel_pid(pid, RIC_FLUSH_ALL); preempt_enable(); } EXPORT_SYMBOL(radix__local_flush_tlb_mm); @@ -132,7 +133,7 @@ void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, preempt_disable(); pid = mm ?
mm->context.id : 0; if (pid != MMU_NO_CONTEXT) - _tlbiel_va(vmaddr, pid, ap); + _tlbiel_va(vmaddr, pid, ap, RIC_FLUSH_TLB); preempt_enable(); } @@ -169,11 +170,11 @@ void radix__flush_tlb_mm(struct mm_struct *mm) if (lock_tlbie) raw_spin_lock(&native_tlbie_lock); - _tlbie_pid(pid); + _tlbie_pid(pid, RIC_FLUSH_ALL); if (lock_tlbie) raw_spin_unlock(&native_tlbie_lock); } else - _tlbiel_pid(pid); + _tlbiel_pid(pid, RIC_FLUSH_ALL); no_context: preempt_enable(); } @@ -193,11 +194,11 @@ void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, if (lock_tlbie)
[PATCH V2 03/10] powerpc/mm/radix: Flush page walk cache when freeing page table
Even though a tlb_flush() does a flush with invalidate-all-caches, we can end up doing an RCU page table free before calling tlb_flush(). That means we can have page walk cache entries even after we free the page table pages. This can result in us doing a wrong page table walk. Avoid this by doing a pwc flush on every page table free. We can't batch the pwc flush, because the RCU callback function where we free the page table pages doesn't have information about the mmu gather. Thus we have to do a pwc flush on every page table page freed. Note: I also removed the dummy tlb_flush_pgtable call functions for hash 32. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 1 - arch/powerpc/include/asm/book3s/64/pgalloc.h | 16 - .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 3 ++ arch/powerpc/include/asm/book3s/64/tlbflush.h | 15 - arch/powerpc/include/asm/book3s/pgalloc.h | 5 --- arch/powerpc/mm/tlb-radix.c | 38 ++ 6 files changed, 70 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index a2350194fc76..8e21bb492dca 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -102,7 +102,6 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb, static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) { - tlb_flush_pgtable(tlb, address); pgtable_page_dtor(table); pgtable_free_tlb(tlb, page_address(table), 0); } diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h index 488279edb1f0..26eb2cb80c4e 100644 --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -110,6 +110,11 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud, unsigned long address) { + /* +* By now all the pud entries should be none entries. So go +* ahead and flush the page walk cache +*/ + flush_tlb_pgtable(tlb, address); pgtable_free_tlb(tlb, pud, PUD_INDEX_SIZE); } @@ -127,6 +132,11 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd) static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd, unsigned long address) { + /* +* By now all the pud entries should be none entries. So go +* ahead and flush the page walk cache +*/ + flush_tlb_pgtable(tlb, address); return pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX); } @@ -198,7 +208,11 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage) static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) { - tlb_flush_pgtable(tlb, address); + /* +* By now all the pud entries should be none entries.
So go +* ahead and flush the page walk cache +*/ + flush_tlb_pgtable(tlb, address); pgtable_free_tlb(tlb, table, 0); } diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 13ef38828dfe..3fa94fcac628 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -18,16 +18,19 @@ extern void radix__local_flush_tlb_mm(struct mm_struct *mm); extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); extern void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, unsigned long ap, int nid); +extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); extern void radix__tlb_flush(struct mmu_gather *tlb); #ifdef CONFIG_SMP extern void radix__flush_tlb_mm(struct mm_struct *mm); extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); extern void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, unsigned long ap, int nid); +extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); #else #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm) #define radix__flush_tlb_page(vma,addr) radix__local_flush_tlb_page(vma,addr) #define radix___flush_tlb_page(mm,addr,p,i) radix___local_flush_tlb_page(mm,addr,p,i) +#define radix__flush_tlb_pwc(tlb, addr)radix__local_flush_tlb_pwc(tlb, addr) #endif #endif diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index d98424ae356c..541cf809e38e 100644
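The tlbflush.h hunk is truncated above; based on the radix__flush_tlb_pwc prototypes added in tlbflush-radix.h, the new helper presumably dispatches along these lines (a reconstruction for readability, not the author's exact hunk):

static inline void flush_tlb_pgtable(struct mmu_gather *tlb, unsigned long address)
{
	/* flush the page walk cache for the page table page being freed;
	 * hash has no page walk cache, so this only matters for radix */
	if (radix_enabled())
		radix__flush_tlb_pwc(tlb, address);
}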
[PATCH V2 04/10] powerpc/mm/radix: Update LPCR HR bit as per ISA
PowerISA 3.0 requires the MMU mode (radix vs. hash) of the hypervisor to be mirrored in the LPCR register, in addition to the partition table. This is done to avoid fetching from the table when deciding, among other things, how to perform transitions to HV mode on some interrupts. So let's set it up appropriately. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/reg.h | 1 + arch/powerpc/mm/pgtable-radix.c | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index a0948f40bc7b..466816ede138 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -348,6 +348,7 @@ #define LPCR_RMI 0x00000002 /* real mode is cache inhibit */ #define LPCR_HDICE 0x00000001 /* Hyp Decr enable (HV,PR,EE) */ #define LPCR_UPRT 0x00400000 /* Use Process Table (ISA 3) */ +#define LPCR_HR 0x00100000 #ifndef SPRN_LPID #define SPRN_LPID 0x13F /* Logical Partition Identifier */ #endif diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c index c939e6e57a9e..73aa402047ef 100644 --- a/arch/powerpc/mm/pgtable-radix.c +++ b/arch/powerpc/mm/pgtable-radix.c @@ -340,7 +340,7 @@ void __init radix__early_init_mmu(void) radix_init_page_sizes(); if (!firmware_has_feature(FW_FEATURE_LPAR)) { lpcr = mfspr(SPRN_LPCR); - mtspr(SPRN_LPCR, lpcr | LPCR_UPRT); + mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR); radix_init_partition_table(); } @@ -355,7 +355,7 @@ void radix__early_init_mmu_secondary(void) */ if (!firmware_has_feature(FW_FEATURE_LPAR)) { lpcr = mfspr(SPRN_LPCR); - mtspr(SPRN_LPCR, lpcr | LPCR_UPRT); + mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR); mtspr(SPRN_PTCR, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12)); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 05/10] powerpc/mm: use _raw variant of page table accessors
This switches a few of the page table accessors to use the __raw variant and does the cpu-to-big-endian conversion on constants instead of loaded values. This helps in generating better code. For example, a pgd_none(pgd) check with and without the fix is listed below Without fix: 2240: 20 00 61 eb ld r27,32(r1) /* PGD level */ typedef struct { __be64 pgd; } pgd_t; static inline unsigned long pgd_val(pgd_t x) { return be64_to_cpu(x.pgd); 2244: 22 00 66 78 rldicl r6,r3,32,32 2248: 3e 40 7d 54 rotlwi r29,r3,8 224c: 0e c0 7d 50 rlwimi r29,r3,24,0,7 2250: 3e 40 c5 54 rotlwi r5,r6,8 2254: 2e c4 7d 50 rlwimi r29,r3,24,16,23 2258: 0e c0 c5 50 rlwimi r5,r6,24,0,7 225c: 2e c4 c5 50 rlwimi r5,r6,24,16,23 2260: c6 07 bd 7b rldicr r29,r29,32,31 2264: 78 2b bd 7f or r29,r29,r5 if (pgd_none(pgd)) 2268: 00 00 bd 2f cmpdi cr7,r29,0 226c: 54 03 9e 41 beq cr7,25c0 <__get_user_pages_fast+0x500> With fix: 2370: 20 00 61 eb ld r27,32(r1) if (pgd_none(pgd)) 2374: 00 00 bd 2f cmpdi cr7,r29,0 2378: a8 03 9e 41 beq cr7,2720 <__get_user_pages_fast+0x530> break; Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/pgtable-4k.h | 6 +- arch/powerpc/include/asm/book3s/64/pgtable-64k.h | 6 +- arch/powerpc/include/asm/book3s/64/pgtable.h | 99 +--- arch/powerpc/include/asm/pgtable-be-types.h | 15 4 files changed, 91 insertions(+), 35 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h index 71e9abced493..9db83b4e017d 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h @@ -11,7 +11,7 @@ static inline int pmd_huge(pmd_t pmd) * leaf pte for huge page */ if (radix_enabled()) - return !!(pmd_val(pmd) & _PAGE_PTE); + return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE)); return 0; } @@ -21,7 +21,7 @@ static inline int pud_huge(pud_t pud) * leaf pte for huge page */ if (radix_enabled()) - return !!(pud_val(pud) & _PAGE_PTE); + return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE)); return 0; } @@ -31,7 +31,7 @@ static inline int pgd_huge(pgd_t pgd) * leaf pte for huge page */ if (radix_enabled()) - return !!(pgd_val(pgd) & _PAGE_PTE); + return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE)); return 0; } #define pgd_huge pgd_huge diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h index cb2d0a5fa3f8..0d2845b44763 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h @@ -15,7 +15,7 @@ static inline int pmd_huge(pmd_t pmd) /* * leaf pte for huge page */ - return !!(pmd_val(pmd) & _PAGE_PTE); + return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE)); } static inline int pud_huge(pud_t pud) @@ -23,7 +23,7 @@ static inline int pud_huge(pud_t pud) /* * leaf pte for huge page */ - return !!(pud_val(pud) & _PAGE_PTE); + return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE)); } static inline int pgd_huge(pgd_t pgd) @@ -31,7 +31,7 @@ static inline int pgd_huge(pgd_t pgd) /* * leaf pte for huge page */ - return !!(pgd_val(pgd) & _PAGE_PTE); + return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE)); } #define pgd_huge pgd_huge diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 88a5ecaa157b..d3ab97e3c744 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -317,7 +317,7 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm, { unsigned long old; - if ((pte_val(*ptep) & (_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 0) + if
((pte_raw(*ptep) & cpu_to_be64(_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 0) return 0; old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0); return (old & _PAGE_ACCESSED) != 0; @@ -335,8 +335,7 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - - if ((pte_val(*ptep) & _PAGE_WRITE) == 0) + if ((pte_raw(*ptep) & cpu_to_be64(_PAGE_WRITE)) == 0) return; pte_update(mm, addr, ptep, _PAGE_WRITE, 0, 0); @@ -345,7 +344,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long
[PATCH V2 06/10] powerpc/mm: Compile out radix related functions if RADIX_MMU is disabled
Currently we depend on mmu_has_feature to evaluate to zero based on the MMU_FTRS_POSSIBLE mask. In a later patch, we want to update radix_enabled() to runtime update the conditional operation to a jump instruction. This implies we cannot depend on the MMU_FTRS_POSSIBLE mask. Instead define radix_enabled to return 0 if RADIX_MMU is not enabled. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/mmu.h | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 5854263d4d6e..d4eda6420523 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -23,7 +23,12 @@ struct mmu_psize_def { }; extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; +#ifdef CONFIG_PPC_RADIX_MMU #define radix_enabled() mmu_has_feature(MMU_FTR_RADIX) +#else +#define radix_enabled() (0) +#endif + #endif /* __ASSEMBLY__ */ -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
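The practical effect is that, with CONFIG_PPC_RADIX_MMU unset, branches like the following (callee names taken from elsewhere in this series, shown only for illustration) reduce to the hash path at compile time:

if (radix_enabled())                          /* constant (0) when radix is compiled out */
	radix__flush_tlb_range(vma, start, end);  /* dead code, discarded by the compiler */
else
	hash__flush_tlb_range(vma, start, end);   /* the only code that is emitted */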
[PATCH V2 07/10] powerpc/hash: Use the correct ppp mask when updating hpte
With commit e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO") we now use all three PPP bits. The top bit is now used to have a PPP value of 0b110 which will be mapped to kernel read only. When updating the hpte entry, use the right mask so that we update the 63rd bit (the top 'P' bit) too. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 + arch/powerpc/mm/hash_native_64.c | 8 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index 290157e8d5b2..74839f24f412 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -88,6 +88,7 @@ #define HPTE_R_RPN_SHIFT 12 #define HPTE_R_RPN ASM_CONST(0x0ffffffffffff000) #define HPTE_R_PP ASM_CONST(0x0000000000000003) +#define HPTE_R_PPP ASM_CONST(0x8000000000000003) #define HPTE_R_N ASM_CONST(0x0000000000000004) #define HPTE_R_G ASM_CONST(0x0000000000000008) #define HPTE_R_M ASM_CONST(0x0000000000000010) diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c index d873f6507f72..e37916cbc18d 100644 --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -316,8 +316,8 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp, DBG_LOW(" -> hit\n"); /* Update the HPTE */ hptep->r = cpu_to_be64((be64_to_cpu(hptep->r) & - ~(HPTE_R_PP | HPTE_R_N)) | - (newpp & (HPTE_R_PP | HPTE_R_N | + ~(HPTE_R_PPP | HPTE_R_N)) | + (newpp & (HPTE_R_PPP | HPTE_R_N | HPTE_R_C))); } native_unlock_hpte(hptep); @@ -385,8 +385,8 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea, /* Update the HPTE */ hptep->r = cpu_to_be64((be64_to_cpu(hptep->r) & - ~(HPTE_R_PP | HPTE_R_N)) | - (newpp & (HPTE_R_PP | HPTE_R_N))); + ~(HPTE_R_PPP | HPTE_R_N)) | + (newpp & (HPTE_R_PPP | HPTE_R_N))); /* * Ensure it is out of the tlb too. Bolted entries base and * actual page size will be same. -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
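A quick worked example of the bug (user-space sketch using the constants defined above): kernel-read-only needs PPP = 0b110, i.e. pp0 set in bit 63 plus 0b10 in the low two bits, and the old two-bit mask silently drops bit 63.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#define HPTE_R_PP  UINT64_C(0x0000000000000003)
#define HPTE_R_PPP UINT64_C(0x8000000000000003)

int main(void)
{
	uint64_t hpte_r = 0;                              /* existing r word        */
	uint64_t newpp  = UINT64_C(0x8000000000000002);   /* PPP = 0b110, kernel RO */

	uint64_t old_mask = (hpte_r & ~HPTE_R_PP)  | (newpp & HPTE_R_PP);
	uint64_t new_mask = (hpte_r & ~HPTE_R_PPP) | (newpp & HPTE_R_PPP);

	printf("old mask: %016" PRIx64 " (pp0 in bit 63 is lost)\n", old_mask);
	printf("new mask: %016" PRIx64 "\n", new_mask);
	return 0;
}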
[PATCH V2 08/10] powerpc/mm: Clear top 16 bits of va only on older cpus
As per the ISA, we need to do this only for architecture version 2.02 and earlier. This continued to work even for 2.07, but let's not do this for anything after 2.02. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/mmu.h | 12 +--- arch/powerpc/mm/hash_native_64.c | 6 -- 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index e53ebebff474..616575fcbcc7 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -24,6 +24,11 @@ /* * This is individual features */ +/* + * We need to clear the top 16 bits of va (from the remaining 64 bits) in + * tlbie* instructions + */ +#define MMU_FTR_TLBIE_CROP_VA ASM_CONST(0x00008000) /* Enable use of high BAT registers */ #define MMU_FTR_USE_HIGH_BATS ASM_CONST(0x00010000) @@ -96,8 +101,9 @@ /* MMU feature bit sets for various CPUs */ #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 -#define MMU_FTRS_POWER4 MMU_FTRS_DEFAULT_HPTE_ARCH_V2 -#define MMU_FTRS_PPC970 MMU_FTRS_POWER4 +#define MMU_FTRS_POWER4 MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ + MMU_FTR_TLBIE_CROP_VA +#define MMU_FTRS_PPC970 MMU_FTRS_POWER4 | MMU_FTR_TLBIE_CROP_VA #define MMU_FTRS_POWER5 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE #define MMU_FTRS_POWER6 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE #define MMU_FTRS_POWER7 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE @@ -124,7 +130,7 @@ enum { MMU_FTR_USE_TLBRSRV | MMU_FTR_USE_PAIRED_MAS | MMU_FTR_NO_SLBIE_B | MMU_FTR_16M_PAGE | MMU_FTR_TLBIEL | MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE | - MMU_FTR_1T_SEGMENT | + MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA | #ifdef CONFIG_PPC_RADIX_MMU MMU_FTR_RADIX | #endif diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c index e37916cbc18d..4c6b68ef571c 100644 --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -64,7 +64,8 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize) * Older versions of the architecture (2.02 and earlier) require the * masking of the top 16 bits. */ - va &= ~(0xffffULL << 48); + if (mmu_has_feature(MMU_FTR_TLBIE_CROP_VA)) + va &= ~(0xffffULL << 48); switch (psize) { case MMU_PAGE_4K: @@ -113,7 +114,8 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize) * Older versions of the architecture (2.02 and earlier) require the * masking of the top 16 bits. */ - va &= ~(0xffffULL << 48); + if (mmu_has_feature(MMU_FTR_TLBIE_CROP_VA)) + va &= ~(0xffffULL << 48); switch (psize) { case MMU_PAGE_4K: -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 09/10] powerpc/mm: Print information regarding the MMU mode
This helps in easily identifying the MMU mode with which the kernel is operating. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/hash_utils_64.c | 3 ++- arch/powerpc/mm/pgtable-radix.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index b2740c67e172..bf9b0b80bbfc 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -720,7 +720,7 @@ static void __init hash_init_partition_table(phys_addr_t hash_table, * For now UPRT is 0 for us. */ partition_tb->patb1 = 0; - DBG("Partition table %p\n", partition_tb); + pr_info("Partition table %p\n", partition_tb); /* * update partition table control register, * 64 K size. @@ -924,6 +924,7 @@ void __init hash__early_init_mmu(void) */ htab_initialize(); + pr_info("Initializing hash mmu with SLB\n"); /* Initialize SLB management */ slb_initialize(); } diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c index 73aa402047ef..d6598cd1c3e6 100644 --- a/arch/powerpc/mm/pgtable-radix.c +++ b/arch/powerpc/mm/pgtable-radix.c @@ -185,7 +185,8 @@ static void __init radix_init_partition_table(void) partition_tb = early_alloc_pgtable(1UL << PATB_SIZE_SHIFT); partition_tb->patb0 = cpu_to_be64(rts_field | __pa(init_mm.pgd) | RADIX_PGD_INDEX_SIZE | PATB_HR); - printk("Partition table %p\n", partition_tb); + pr_info("Initializing Radix MMU\n"); + pr_info("Partition table %p\n", partition_tb); memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE); /* -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 10/10] powerpc/mm/hash: Update SDR1 size encoding as documented in ISA 3.0
ISA 3.0 documents the hash table size in bytes as 2^(HTABSIZE + 18). No functionality change by this patch. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/hash_utils_64.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index bf9b0b80bbfc..7cce2f6169fa 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -695,10 +695,9 @@ int remove_section_mapping(unsigned long start, unsigned long end) #endif /* CONFIG_MEMORY_HOTPLUG */ static void __init hash_init_partition_table(phys_addr_t hash_table, -unsigned long pteg_count) +unsigned long htab_size) { unsigned long ps_field; - unsigned long htab_size; unsigned long patb_size = 1UL << PATB_SIZE_SHIFT; /* @@ -706,7 +705,7 @@ static void __init hash_init_partition_table(phys_addr_t hash_table, * We can ignore that for lpid 0 */ ps_field = 0; - htab_size = __ilog2(pteg_count) - 11; + htab_size = __ilog2(htab_size) - 18; BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 24), "Partition table size too large."); partition_tb = __va(memblock_alloc_base(patb_size, patb_size, @@ -792,7 +791,7 @@ static void __init htab_initialize(void) htab_address = __va(table); /* htab absolute addr + encoded htabsize */ - _SDR1 = table + __ilog2(pteg_count) - 11; + _SDR1 = table + __ilog2(htab_size_bytes) - 18; /* Initialize the HPT with no entries */ memset((void *)table, 0, htab_size_bytes); @@ -801,7 +800,7 @@ static void __init htab_initialize(void) /* Set SDR1 */ mtspr(SPRN_SDR1, _SDR1); else - hash_init_partition_table(table, pteg_count); + hash_init_partition_table(table, htab_size_bytes); } prot = pgprot_val(PAGE_KERNEL); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
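As a worked example of why this is only a cosmetic change: a 256 MB hash table has __ilog2(htab_size_bytes) = 28, so HTABSIZE = 28 - 18 = 10. The old expression gives the same answer, since a PTEG is 128 bytes: 256 MB / 128 = 2^21 PTEGs, and __ilog2(pteg_count) - 11 = 21 - 11 = 10. The two encodings differ by log2(128) = 7 in both terms, which cancels out.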
[PATCH V2 00/16] TLB flush improvements and Segment table support
This series includes patches which got posted earlier as independent series. Some of these patches will go upstream via the -mm tree. Changes from V1: * Address review feedback * rebase on top of radix fixes which got posted earlier * Fixes for segment table support. NOTE: Even though the patch series includes changes to generic mm and other architectures, this series is not cross-posted. That is because the generic mm changes got posted as a separate patch series which can be found at http://thread.gmane.org/gmane.linux.kernel.mm/152620 Aneesh Kumar K.V (16): mm/hugetlb: Simplify hugetlb unmap mm: Change the interface for __tlb_remove_page mm/mmu_gather: Track page size with mmu gather and force flush if page size change powerpc/mm/radix: Implement tlb mmu gather flush efficiently powerpc/mm: Make MMU_FTR_RADIX a MMU family feature powerpc/mm/hash: Add helper for finding SLBE LLP encoding powerpc/mm: Use hugetlb flush functions powerpc/mm: Drop multiple definition of mm_is_core_local powerpc/mm/radix: Add tlb flush of THP ptes powerpc/mm/radix: Rename function and drop unused arg powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range powerpc/mm: remove flush_tlb_page_nohash powerpc/mm: Cleanup LPCR defines powerpc/mm: Switch user slb fault handling to translation enabled powerpc/mm: Support segment table for Power9 arch/arm/include/asm/tlb.h | 29 +- arch/ia64/include/asm/tlb.h | 31 +- arch/powerpc/include/asm/book3s/64/hash.h | 10 + arch/powerpc/include/asm/book3s/64/hugetlb-radix.h | 15 + arch/powerpc/include/asm/book3s/64/mmu-hash.h | 26 ++ arch/powerpc/include/asm/book3s/64/mmu.h | 7 +- arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 - .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 16 +- arch/powerpc/include/asm/book3s/64/tlbflush.h | 27 +- arch/powerpc/include/asm/hugetlb.h | 2 +- arch/powerpc/include/asm/kvm_book3s_64.h | 3 +- arch/powerpc/include/asm/mmu.h | 18 +- arch/powerpc/include/asm/mmu_context.h | 5 +- arch/powerpc/include/asm/reg.h | 54 ++-- arch/powerpc/include/asm/tlb.h | 13 + arch/powerpc/include/asm/tlbflush.h | 1 - arch/powerpc/kernel/entry_64.S | 2 +- arch/powerpc/kernel/exceptions-64s.S | 63 +++- arch/powerpc/kernel/prom.c | 3 +- arch/powerpc/mm/hash_native_64.c | 6 +- arch/powerpc/mm/hash_utils_64.c | 86 ----- arch/powerpc/mm/hugetlbpage-radix.c | 39 +-- arch/powerpc/mm/mmu_context_book3s64.c | 32 +- arch/powerpc/mm/pgtable-book3s64.c | 4 +- arch/powerpc/mm/pgtable.c | 2 +- arch/powerpc/mm/slb.c | 359 + arch/powerpc/mm/tlb-radix.c | 104 +- arch/powerpc/mm/tlb_hash32.c | 11 - arch/powerpc/mm/tlb_nohash.c | 6 - arch/s390/include/asm/tlb.h | 22 +- arch/sh/include/asm/tlb.h | 20 +- arch/um/include/asm/tlb.h | 20 +- include/asm-generic/tlb.h | 59 +++- mm/huge_memory.c | 2 +- mm/hugetlb.c | 64 ++-- mm/memory.c | 27 +- 36 files changed, 981 insertions(+), 212 deletions(-) -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 01/16] mm/hugetlb: Simplify hugetlb unmap
For hugetlb like THP (and unlike regular page), we do tlb flush after dropping ptl. Because of the above, we don't need to track force_flush like we do now. Instead we can simply call tlb_remove_page() which will do the flush if needed. No functionality change in this patch. Signed-off-by: Aneesh Kumar K.V --- mm/hugetlb.c | 54 +- 1 file changed, 21 insertions(+), 33 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d26162e81fea..741429d01668 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3138,7 +3138,6 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, struct page *ref_page) { - int force_flush = 0; struct mm_struct *mm = vma->vm_mm; unsigned long address; pte_t *ptep; @@ -3157,19 +3156,22 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, tlb_start_vma(tlb, vma); mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); address = start; -again: for (; address < end; address += sz) { ptep = huge_pte_offset(mm, address); if (!ptep) continue; ptl = huge_pte_lock(h, mm, ptep); - if (huge_pmd_unshare(mm, &address, ptep)) - goto unlock; + if (huge_pmd_unshare(mm, &address, ptep)) { + spin_unlock(ptl); + continue; + } pte = huge_ptep_get(ptep); - if (huge_pte_none(pte)) - goto unlock; + if (huge_pte_none(pte)) { + spin_unlock(ptl); + continue; + } /* * Migrating hugepage or HWPoisoned hugepage is already @@ -3177,7 +3179,8 @@ again: */ if (unlikely(!pte_present(pte))) { huge_pte_clear(mm, address, ptep); - goto unlock; + spin_unlock(ptl); + continue; } page = pte_page(pte); @@ -3187,9 +3190,10 @@ again: * are about to unmap is the actual page of interest. */ if (ref_page) { - if (page != ref_page) - goto unlock; - + if (page != ref_page) { + spin_unlock(ptl); + continue; + } /* * Mark the VMA as having unmapped its page so that * future faults in this VMA will fail rather than @@ -3205,30 +3209,14 @@ again: hugetlb_count_sub(pages_per_huge_page(h), mm); page_remove_rmap(page, true); - force_flush = !__tlb_remove_page(tlb, page); - if (force_flush) { - address += sz; - spin_unlock(ptl); - break; - } - /* Bail out after unmapping reference page if supplied */ - if (ref_page) { - spin_unlock(ptl); - break; - } -unlock: + spin_unlock(ptl); - } - /* -* mmu_gather ran out of room to batch pages, we break out of -* the PTE lock to avoid doing the potential expensive TLB invalidate -* and page-free while holding it. -*/ - if (force_flush) { - force_flush = 0; - tlb_flush_mmu(tlb); - if (address < end && !ref_page) - goto again; + tlb_remove_page(tlb, page); + /* +* Bail out after unmapping reference page if supplied +*/ + if (ref_page) + break; } mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); tlb_end_vma(tlb, vma); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 02/16] mm: Change the interface for __tlb_remove_page
This updates the generic and arch-specific implementations to return true if we need to do a tlb flush. That means if a __tlb_remove_page indicates a flush is needed, the page we try to remove needs to be tracked and added again after the flush. We need to track it because we have already updated the pte to none and we can't just loop back. This change is done to enable us to do a tlb_flush when we try to flush a range that consists of different page sizes. For architectures like ppc64, we can do a range based tlb flush and we need to track the page size for that. When we try to remove a huge page, we will force a tlb flush and start a new mmu gather. Signed-off-by: Aneesh Kumar K.V --- arch/arm/include/asm/tlb.h | 17 + arch/ia64/include/asm/tlb.h | 19 ++- arch/s390/include/asm/tlb.h | 9 +++-- arch/sh/include/asm/tlb.h | 8 +++- arch/um/include/asm/tlb.h | 8 +++- include/asm-generic/tlb.h | 44 +--- mm/memory.c | 19 +-- 7 files changed, 94 insertions(+), 30 deletions(-) diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h index 3cadb726ec88..a9d2aee3826f 100644 --- a/arch/arm/include/asm/tlb.h +++ b/arch/arm/include/asm/tlb.h @@ -209,17 +209,26 @@ tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) tlb_flush(tlb); } -static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page) +static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page) { + if (tlb->nr == tlb->max) + return true; tlb->pages[tlb->nr++] = page; - VM_BUG_ON(tlb->nr > tlb->max); - return tlb->max - tlb->nr; + return false; } static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) { - if (!__tlb_remove_page(tlb, page)) + if (__tlb_remove_page(tlb, page)) { tlb_flush_mmu(tlb); + __tlb_remove_page(tlb, page); + } +} + +static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, +struct page *page) +{ + return __tlb_remove_page(tlb, page); } static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, diff --git a/arch/ia64/include/asm/tlb.h b/arch/ia64/include/asm/tlb.h index 39d64e0df1de..e7da41aa9110 100644 --- a/arch/ia64/include/asm/tlb.h +++ b/arch/ia64/include/asm/tlb.h @@ -205,17 +205,18 @@ tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end) * must be delayed until after the TLB has been flushed (see comments at the beginning of * this file). */ -static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page) +static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page) { + if (tlb->nr == tlb->max) + return true; + tlb->need_flush = 1; if (!tlb->nr && tlb->pages == tlb->local) __tlb_alloc_page(tlb); tlb->pages[tlb->nr++] = page; - VM_BUG_ON(tlb->nr > tlb->max); - - return tlb->max - tlb->nr; + return false; } static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb) @@ -235,8 +236,16 @@ static inline void tlb_flush_mmu(struct mmu_gather *tlb) static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) { - if (!__tlb_remove_page(tlb, page)) + if (__tlb_remove_page(tlb, page)) { tlb_flush_mmu(tlb); + __tlb_remove_page(tlb, page); + } +} + +static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, +struct page *page) +{ + return __tlb_remove_page(tlb, page); } /* diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index 7a92e69c50bc..30759b560849 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -87,10 +87,10 @@ static inline void tlb_finish_mmu(struct mmu_gather *tlb, * tlb_ptep_clear_flush.
In both flush modes the tlb for a page cache page * has already been freed, so just do free_page_and_swap_cache. */ -static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page) +static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page) { free_page_and_swap_cache(page); - return 1; /* avoid calling tlb_flush_mmu */ + return false; /* avoid calling tlb_flush_mmu */ } static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) @@ -98,6 +98,11 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) free_page_and_swap_cache(page); } +static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, +struct page *page) +{ + return __tlb_remove_page(tlb, page); +} /* * pte_free_tlb frees a pte table and clears the CRSTE for the * page table from the tlb. diff -
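The contract change is easiest to see at a call site; the pattern the arm and ia64 hunks above implement is (an illustrative fragment of the same logic):

if (__tlb_remove_page(tlb, page)) {
	/* batch was full: flush, then queue the page again -- the PTE
	 * is already cleared, so the page must not simply be dropped */
	tlb_flush_mmu(tlb);
	__tlb_remove_page(tlb, page);
}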
[PATCH V2 03/16] mm/mmu_gather: Track page size with mmu gather and force flush if page size change
This allows an arch which needs to do special handling with respect to different page sizes when flushing the tlb to implement the same in mmu gather Signed-off-by: Aneesh Kumar K.V --- arch/arm/include/asm/tlb.h | 12 arch/ia64/include/asm/tlb.h | 12 arch/s390/include/asm/tlb.h | 13 + arch/sh/include/asm/tlb.h | 12 arch/um/include/asm/tlb.h | 12 include/asm-generic/tlb.h | 27 +- mm/huge_memory.c | 2 +- mm/hugetlb.c | 2 +- mm/memory.c | 10 +- 9 files changed, 93 insertions(+), 9 deletions(-) diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h index a9d2aee3826f..1e25cd80589e 100644 --- a/arch/arm/include/asm/tlb.h +++ b/arch/arm/include/asm/tlb.h @@ -225,12 +225,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) } } +static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return __tlb_remove_page(tlb, page); +} + static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, struct page *page) { return __tlb_remove_page(tlb, page); } +static inline void tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return tlb_remove_page(tlb, page); +} + static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr) { diff --git a/arch/ia64/include/asm/tlb.h b/arch/ia64/include/asm/tlb.h index e7da41aa9110..77e541cf0e5d 100644 --- a/arch/ia64/include/asm/tlb.h +++ b/arch/ia64/include/asm/tlb.h @@ -242,12 +242,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) } } +static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return __tlb_remove_page(tlb, page); +} + static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, struct page *page) { return __tlb_remove_page(tlb, page); } +static inline void tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return tlb_remove_page(tlb, page); +} + /* * Remove TLB entry for PTE mapped at virtual address ADDRESS. This is called for any * PTE, not just those pointing to (normal) physical memory. diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index 30759b560849..15711de10403 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -98,11 +98,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) free_page_and_swap_cache(page); } +static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return __tlb_remove_page(tlb, page); +} + static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, struct page *page) { return __tlb_remove_page(tlb, page); } + +static inline void tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return tlb_remove_page(tlb, page); +} + /* * pte_free_tlb frees a pte table and clears the CRSTE for the * page table from the tlb.
diff --git a/arch/sh/include/asm/tlb.h b/arch/sh/include/asm/tlb.h index 21ae8f5546b2..025cdb1032f6 100644 --- a/arch/sh/include/asm/tlb.h +++ b/arch/sh/include/asm/tlb.h @@ -109,12 +109,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) __tlb_remove_page(tlb, page); } +static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return __tlb_remove_page(tlb, page); +} + static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb, struct page *page) { return __tlb_remove_page(tlb, page); } +static inline void tlb_remove_page_size(struct mmu_gather *tlb, + struct page *page, int page_size) +{ + return tlb_remove_page(tlb, page); +} + #define pte_free_tlb(tlb, ptep, addr) pte_free((tlb)->mm, ptep) #define pmd_free_tlb(tlb, pmdp, addr) pmd_free((tlb)->mm, pmdp) #define pud_free_tlb(tlb, pudp, addr) pud_free((tlb)->mm, pudp) diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h index 3dc4cbb3c2c0..821ff0acfe17 100644 --- a/arch/um/include/asm/tlb.h +++ b/arch/um/include/asm/tlb.h @@ -110,12 +110,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page) __tlb_remove_page(tlb, page); }
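The asm-generic side of the series is not shown in full above; the idea is that the gather records the page size of the first queued page and asks the caller to flush when a differently sized page arrives, roughly like this (a sketch of the intent, not the exact upstream hunk):

static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
					  struct page *page, int page_size)
{
	if (tlb->page_size && tlb->page_size != page_size)
		return true;             /* size changed: caller flushes, then retries */
	tlb->page_size = page_size;      /* remember the size for this batch */
	return __tlb_remove_page(tlb, page);
}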
[PATCH V2 04/16] powerpc/mm/radix: Implement tlb mmu gather flush efficiently
Now that we track page size in mmu_gather, we can use address based tlbie format when doing a tlb_flush(). We don't do this if we are invalidating the full address space. Signed-off-by: Aneesh Kumar K.V --- .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 + arch/powerpc/mm/tlb-radix.c| 73 +- 2 files changed, 74 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 3fa94fcac628..862c8fa50268 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -10,6 +10,8 @@ static inline int mmu_get_ap(int psize) return mmu_psize_defs[psize].ap; } +extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, +unsigned long end, int psize); extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end); extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 231e3ed2e684..03e719ee6747 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -279,9 +279,80 @@ void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, } EXPORT_SYMBOL(radix__flush_tlb_range); +static int radix_get_mmu_psize(int page_size) +{ + int psize; + + if (page_size == (1UL << mmu_psize_defs[mmu_virtual_psize].shift)) + psize = mmu_virtual_psize; + else if (page_size == (1UL << mmu_psize_defs[MMU_PAGE_2M].shift)) + psize = MMU_PAGE_2M; + else if (page_size == (1UL << mmu_psize_defs[MMU_PAGE_1G].shift)) + psize = MMU_PAGE_1G; + else + return -1; + return psize; +} void radix__tlb_flush(struct mmu_gather *tlb) { + int psize = 0; struct mm_struct *mm = tlb->mm; - radix__flush_tlb_mm(mm); + int page_size = tlb->page_size; + + psize = radix_get_mmu_psize(page_size); + /* +* if page size is not something we understand, do a full mm flush +*/ + if (psize != -1 && !tlb->fullmm && !tlb->need_flush_all) + radix__flush_tlb_range_psize(mm, tlb->start, tlb->end, psize); + else + radix__flush_tlb_mm(mm); +} + +#define TLB_FLUSH_ALL -1UL +/* + * Number of pages above which we will do a bcast tlbie. Just a + * number at this point copied from x86 + */ +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33; + +void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, + unsigned long end, int psize) +{ + unsigned long pid; + unsigned long addr; + int local = mm_is_core_local(mm); + unsigned long ap = mmu_get_ap(psize); + int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); + unsigned long page_size = 1UL << mmu_psize_defs[psize].shift; + + + preempt_disable(); + pid = mm ? mm->context.id : 0; + if (unlikely(pid == MMU_NO_CONTEXT)) + goto err_out; + + if (end == TLB_FLUSH_ALL || + (end - start) > tlb_single_page_flush_ceiling * page_size) { + if (local) + _tlbiel_pid(pid, RIC_FLUSH_TLB); + else + _tlbie_pid(pid, RIC_FLUSH_TLB); + goto err_out; + } + for (addr = start; addr < end; addr += page_size) { + + if (local) + _tlbiel_va(addr, pid, ap, RIC_FLUSH_TLB); + else { + if (lock_tlbie) + raw_spin_lock(&native_tlbie_lock); + _tlbie_va(addr, pid, ap, RIC_FLUSH_TLB); + if (lock_tlbie) + raw_spin_unlock(&native_tlbie_lock); + } + } +err_out: + preempt_enable(); } -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
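To put a number on the ceiling used above: with a 64K base page size, 33 pages is about 2 MB (33 x 64K = 2112K), so ranges up to roughly that size are invalidated page by page with tlbie/tlbiel, and anything larger (or a full-mm flush) falls back to a single full-PID flush. The value 33 is, as the comment says, simply inherited from x86.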
[PATCH V2 05/16] powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
MMU feature bits are defined such that we use the lower half to represent MMU family features. Remove the strict split of the halves and also move Radix to an MMU family feature. Radix introduces a new MMU model and strictly speaking it is a new MMU family. This also frees up bits which can be used for individual features later. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/mmu.h | 3 +-- arch/powerpc/include/asm/mmu.h | 16 +++- arch/powerpc/kernel/entry_64.S | 2 +- arch/powerpc/kernel/exceptions-64s.S | 8 arch/powerpc/kernel/prom.c | 2 +- 5 files changed, 14 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index d4eda6420523..6d8306d9aa7a 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -24,12 +24,11 @@ struct mmu_psize_def { extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; #ifdef CONFIG_PPC_RADIX_MMU -#define radix_enabled() mmu_has_feature(MMU_FTR_RADIX) +#define radix_enabled() mmu_has_feature(MMU_FTR_TYPE_RADIX) #else #define radix_enabled() (0) #endif - #endif /* __ASSEMBLY__ */ /* 64-bit classic hash table MMU */ diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index 616575fcbcc7..21b71469e66b 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -12,7 +12,7 @@ */ /* - * First half is MMU families + * MMU families */ #define MMU_FTR_HPTE_TABLE ASM_CONST(0x00000001) #define MMU_FTR_TYPE_8xx ASM_CONST(0x00000002) @@ -20,9 +20,12 @@ #define MMU_FTR_TYPE_44x ASM_CONST(0x00000008) #define MMU_FTR_TYPE_FSL_E ASM_CONST(0x00000010) #define MMU_FTR_TYPE_47x ASM_CONST(0x00000020) - /* - * This is individual features + * Radix page table available + */ +#define MMU_FTR_TYPE_RADIX ASM_CONST(0x00000040) +/* + * individual features */ /* * We need to clear the top 16 bits of va (from the remaining 64 bits) in @@ -93,11 +96,6 @@ */ #define MMU_FTR_1T_SEGMENT ASM_CONST(0x40000000) -/* - * Radix page table available - */ -#define MMU_FTR_RADIX ASM_CONST(0x80000000) - /* MMU feature bit sets for various CPUs */ #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 @@ -132,7 +130,7 @@ enum { MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE | MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA | #ifdef CONFIG_PPC_RADIX_MMU - MMU_FTR_RADIX | + MMU_FTR_TYPE_RADIX | #endif 0, }; diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 73e461a3dfbb..dd26d4ed7513 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -532,7 +532,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) #ifdef CONFIG_PPC_STD_MMU_64 BEGIN_MMU_FTR_SECTION b 2f -END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX) +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX) BEGIN_FTR_SECTION clrrdi r6,r8,28 /* get its ESID */ clrrdi r9,r1,28 /* get current sp ESID */ diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 4c9440629128..f2bd375b9a4e 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -945,7 +945,7 @@ BEGIN_MMU_FTR_SECTION b do_hash_page /* Try to handle as hpte fault */ MMU_FTR_SECTION_ELSE b handle_page_fault -ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX) +ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) .align 7 .globl h_data_storage_common @@ -976,7 +976,7 @@ BEGIN_MMU_FTR_SECTION b do_hash_page /* Try to handle as hpte fault */ MMU_FTR_SECTION_ELSE b handle_page_fault -ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) STD_EXCEPTION_COMMON(0xe20, h_instr_storage, unknown_exception) @@ -1390,7 +1390,7 @@ slb_miss_realmode: #ifdef CONFIG_PPC_STD_MMU_64 BEGIN_MMU_FTR_SECTION bl slb_allocate_realmode -END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX) +END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX) #endif /* All done -- return from exception. */ @@ -1401,7 +1401,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX) mtlrr10 BEGIN_MMU_FTR_SECTION b 2f -END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX) +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX) andi. r10,r12,MSR_RI /* check for unrecoverable exception */ beq-2f diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 946e34ffeae9..44b4417804db 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -168,7 +168,7 @@ static struct ibm_pa_feature { */ {
[PATCH V2 06/16] powerpc/mm/hash: Add helper for finding SLBE LLP encoding
Replace opencoding of the same at multiple places with the helper. No functional change with this patch. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 9 + arch/powerpc/include/asm/kvm_book3s_64.h | 3 +-- arch/powerpc/mm/hash_native_64.c | 6 ++ 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index 74839f24f412..b042e5f9a428 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -151,6 +151,15 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize) BUG(); } +static inline unsigned long get_sllp_encoding(int psize) +{ + unsigned long sllp; + + sllp = ((mmu_psize_defs[psize].sllp & SLB_VSID_L) >> 6) | + ((mmu_psize_defs[psize].sllp & SLB_VSID_LP) >> 4); + return sllp; +} + #endif /* __ASSEMBLY__ */ /* diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h index 1f4497fb5b83..88d17b4ea9c8 100644 --- a/arch/powerpc/include/asm/kvm_book3s_64.h +++ b/arch/powerpc/include/asm/kvm_book3s_64.h @@ -181,8 +181,7 @@ static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r, switch (b_psize) { case MMU_PAGE_4K: - sllp = ((mmu_psize_defs[a_psize].sllp & SLB_VSID_L) >> 6) | - ((mmu_psize_defs[a_psize].sllp & SLB_VSID_LP) >> 4); + sllp = get_sllp_encoding(a_psize); rb |= sllp << 5;/* AP field */ rb |= (va_low & 0x7ff) << 12; /* remaining 11 bits of AVA */ break; diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c index 4c6d4c736ba4..fd483948981a 100644 --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -72,8 +72,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize) /* clear out bits after (52) [052.63] */ va &= ~((1ul << (64 - 52)) - 1); va |= ssize << 8; - sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) | - ((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4); + sllp = get_sllp_encoding(apsize); va |= sllp << 5; asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2) : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206) @@ -122,8 +121,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize) /* clear out bits after(52) [052.63] */ va &= ~((1ul << (64 - 52)) - 1); va |= ssize << 8; - sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) | - ((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4); + sllp = get_sllp_encoding(apsize); va |= sllp << 5; asm volatile(".long 0x7c000224 | (%0 << 11) | (0 << 21)" : : "r"(va) : "memory"); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
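A stand-alone check of the helper's bit gymnastics (user-space sketch; the SLB_VSID_L and SLB_VSID_LP values follow mmu-hash.h, and the 64K example sllp of 0x110 is an assumption for illustration):

#include <stdio.h>

#define SLB_VSID_L  0x100UL   /* L bit of the sllp encoding */
#define SLB_VSID_LP 0x030UL   /* LP bits */

static unsigned long get_sllp_encoding(unsigned long sllp)
{
	/* pack L into bit 2 and LP into bits 1:0, as the helper above does */
	return ((sllp & SLB_VSID_L) >> 6) | ((sllp & SLB_VSID_LP) >> 4);
}

int main(void)
{
	/* e.g. a 64K segment (L=1, LP=01): sllp = 0x110 -> 0b101 */
	printf("sllp encoding: 0x%lx\n", get_sllp_encoding(0x110UL));   /* prints 0x5 */
	return 0;
}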
[PATCH V2 07/16] powerpc/mm: Use hugetlb flush functions
Use flush_hugetlb_page instead of flush_tlb_page when we clear and flush a hugetlb pte. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/hugetlb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index e2d9f4996e5c..c5517f463ec7 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -147,7 +147,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma, { pte_t pte; pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep); - flush_tlb_page(vma, addr); + flush_hugetlb_page(vma, addr); } static inline int huge_pte_none(pte_t pte) -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 08/16] powerpc/mm: Drop multiple definition of mm_is_core_local
Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/tlb.h | 13 + arch/powerpc/mm/tlb-radix.c| 6 -- arch/powerpc/mm/tlb_nohash.c | 6 -- 3 files changed, 13 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index 20733fa518ae..f6f68f73e858 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -46,5 +46,18 @@ static inline void __tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, #endif } +#ifdef CONFIG_SMP +static inline int mm_is_core_local(struct mm_struct *mm) +{ + return cpumask_subset(mm_cpumask(mm), + topology_sibling_cpumask(smp_processor_id())); +} +#else +static inline int mm_is_core_local(struct mm_struct *mm) +{ + return 1; +} +#endif + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_TLB_H */ diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 03e719ee6747..74b0c90045ab 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -163,12 +163,6 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd EXPORT_SYMBOL(radix__local_flush_tlb_page); #ifdef CONFIG_SMP -static int mm_is_core_local(struct mm_struct *mm) -{ - return cpumask_subset(mm_cpumask(mm), - topology_sibling_cpumask(smp_processor_id())); -} - void radix__flush_tlb_mm(struct mm_struct *mm) { unsigned long pid; diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c index f4668488512c..050badc0ebd3 100644 --- a/arch/powerpc/mm/tlb_nohash.c +++ b/arch/powerpc/mm/tlb_nohash.c @@ -215,12 +215,6 @@ EXPORT_SYMBOL(local_flush_tlb_page); static DEFINE_RAW_SPINLOCK(tlbivax_lock); -static int mm_is_core_local(struct mm_struct *mm) -{ - return cpumask_subset(mm_cpumask(mm), - topology_sibling_cpumask(smp_processor_id())); -} - struct tlb_flush_param { unsigned long addr; unsigned int pid; -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
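For context, mm_is_core_local() is what lets a flush path choose between a cheap core-scoped invalidate and an expensive broadcast one: if every CPU that has used the mm is a sibling thread of the current core, a local flush is enough. A sketch of the pattern (broadcast_flush_tlb_mm() is an invented name for illustration):

	/* Sketch only -- broadcast_flush_tlb_mm() is hypothetical. */
	void example_flush_tlb_mm(struct mm_struct *mm)
	{
		preempt_disable();
		if (mm_is_core_local(mm))
			local_flush_tlb_mm(mm);		/* all users are sibling threads */
		else
			broadcast_flush_tlb_mm(mm);	/* mm has run on other cores */
		preempt_enable();
	}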
[PATCH V2 09/16] powerpc/mm/radix: Add tlb flush of THP ptes
Instead of flushing the entire mm, implement a flush_pmd_tlb_range Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 ++ arch/powerpc/include/asm/book3s/64/tlbflush.h | 9 + arch/powerpc/mm/pgtable-book3s64.c | 4 ++-- arch/powerpc/mm/tlb-radix.c | 7 +++ 4 files changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 862c8fa50268..54c0aac39e3e 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -12,6 +12,8 @@ static inline int mmu_get_ap(int psize) extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, unsigned long end, int psize); +extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end); extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end); extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end); diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 541cf809e38e..5f322e0ed385 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -7,6 +7,15 @@ #include #include +#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE +static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (radix_enabled()) + return radix__flush_pmd_tlb_range(vma, start, end); + return hash__flush_tlb_range(vma, start, end); +} + static inline void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 670318766545..7bb8acffe876 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -33,7 +33,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, changed = !pmd_same(*(pmdp), entry); if (changed) { __ptep_set_access_flags(pmdp_ptep(pmdp), pmd_pte(entry)); - flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE); + flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE); } return changed; } @@ -66,7 +66,7 @@ void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp) { pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, 0); - flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE); + flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE); /* * This ensures that generic code that rely on IRQ disabling * to prevent a parallel THP split work as expected. diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 74b0c90045ab..4212e7638a6f 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -350,3 +350,10 @@ void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, err_out: preempt_enable(); } + +void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + radix__flush_tlb_range_psize(vma->vm_mm, start, end, MMU_PAGE_2M); +} +EXPORT_SYMBOL(radix__flush_pmd_tlb_range); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
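Without this override, the generic THP code falls back to the ordinary range flush. Approximately, from mm/pgtable-generic.c of this era:

	#ifndef __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
	#define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
	#endif

On radix, flush_tlb_range at this point still degrades to flushing the whole PID, so defining __HAVE_ARCH_FLUSH_PMD_TLB_RANGE and flushing just the HPAGE_PMD_SIZE range at the 2M page size is a real win for THP operations.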
[PATCH V2 10/16] powerpc/mm/radix: Rename function and drop unused arg
Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 10 +- arch/powerpc/mm/hugetlbpage-radix.c | 4 ++-- arch/powerpc/mm/tlb-radix.c | 16 3 files changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 54c0aac39e3e..00c354064280 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -20,20 +20,20 @@ extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end extern void radix__local_flush_tlb_mm(struct mm_struct *mm); extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); -extern void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap, int nid); extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); +extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, + unsigned long ap); extern void radix__tlb_flush(struct mmu_gather *tlb); #ifdef CONFIG_SMP extern void radix__flush_tlb_mm(struct mm_struct *mm); extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); -extern void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap, int nid); extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); +extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, + unsigned long ap); #else #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm) #define radix__flush_tlb_page(vma,addr) radix__local_flush_tlb_page(vma,addr) -#define radix___flush_tlb_page(mm,addr,p,i) radix___local_flush_tlb_page(mm,addr,p,i) +#define radix__flush_tlb_page_psize(mm,addr,p) radix__local_flush_tlb_page_psize(mm,addr,p) #define radix__flush_tlb_pwc(tlb, addr)radix__local_flush_tlb_pwc(tlb, addr) #endif diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c index 1e11559e1aac..0dfa1816f0c6 100644 --- a/arch/powerpc/mm/hugetlbpage-radix.c +++ b/arch/powerpc/mm/hugetlbpage-radix.c @@ -20,7 +20,7 @@ void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) WARN(1, "Wrong huge page shift\n"); return ; } - radix___flush_tlb_page(vma->vm_mm, vmaddr, ap, 0); + radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap); } void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) @@ -37,7 +37,7 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long v WARN(1, "Wrong huge page shift\n"); return ; } - radix___local_flush_tlb_page(vma->vm_mm, vmaddr, ap, 0); + radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap); } /* diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 4212e7638a6f..c33c3f24bad2 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -138,8 +138,8 @@ void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr) } EXPORT_SYMBOL(radix__local_flush_tlb_pwc); -void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap, int nid) +void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, + unsigned long ap) { unsigned long pid; @@ -157,8 +157,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd if (vma && is_vm_hugetlb_page(vma)) return __local_flush_hugetlb_page(vma, vmaddr); #endif - 
radix___local_flush_tlb_page(vma ? vma->vm_mm : NULL, vmaddr, - mmu_get_ap(mmu_virtual_psize), 0); + radix__local_flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr, + mmu_get_ap(mmu_virtual_psize)); } EXPORT_SYMBOL(radix__local_flush_tlb_page); @@ -212,8 +212,8 @@ no_context: } EXPORT_SYMBOL(radix__flush_tlb_pwc); -void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap, int nid) +void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, +unsigned long ap) { unsigned long pid; @@ -241,8 +241,8 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr) if (vma && is_vm_hugetlb_page(vma)) return flush_hugetlb_page(vma, vmaddr); #endif - radix___flush_tlb_page(vma ? vma->vm_mm
[PATCH V2 11/16] powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
Use the helper instead of open coding the same at multiple place Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hugetlb-radix.h | 15 +++ .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 4 +-- arch/powerpc/mm/hugetlbpage-radix.c| 29 ++ arch/powerpc/mm/tlb-radix.c| 10 +--- 4 files changed, 30 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h index 60f47649306f..c45189aa7476 100644 --- a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h +++ b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h @@ -11,4 +11,19 @@ extern unsigned long radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); + +static inline int hstate_get_psize(struct hstate *hstate) +{ + unsigned long shift; + + shift = huge_page_shift(hstate); + if (shift == mmu_psize_defs[MMU_PAGE_2M].shift) + return MMU_PAGE_2M; + else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift) + return MMU_PAGE_1G; + else { + WARN(1, "Wrong huge page shift\n"); + return mmu_virtual_psize; + } +} #endif diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index 00c354064280..efb13bbc6df2 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -22,14 +22,14 @@ extern void radix__local_flush_tlb_mm(struct mm_struct *mm); extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap); + int psize); extern void radix__tlb_flush(struct mmu_gather *tlb); #ifdef CONFIG_SMP extern void radix__flush_tlb_mm(struct mm_struct *mm); extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr); extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr); extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, - unsigned long ap); + int psize); #else #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm) #define radix__flush_tlb_page(vma,addr) radix__local_flush_tlb_page(vma,addr) diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c index 0dfa1816f0c6..1eca0deaf89b 100644 --- a/arch/powerpc/mm/hugetlbpage-radix.c +++ b/arch/powerpc/mm/hugetlbpage-radix.c @@ -5,39 +5,24 @@ #include #include #include +#include void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) { - unsigned long ap, shift; + int psize; struct hstate *hstate = hstate_file(vma->vm_file); - shift = huge_page_shift(hstate); - if (shift == mmu_psize_defs[MMU_PAGE_2M].shift) - ap = mmu_get_ap(MMU_PAGE_2M); - else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift) - ap = mmu_get_ap(MMU_PAGE_1G); - else { - WARN(1, "Wrong huge page shift\n"); - return ; - } - radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap); + psize = hstate_get_psize(hstate); + radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, psize); } void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) { - unsigned long ap, shift; + int psize; struct hstate *hstate = hstate_file(vma->vm_file); - shift = huge_page_shift(hstate); - if (shift == mmu_psize_defs[MMU_PAGE_2M].shift) - ap = mmu_get_ap(MMU_PAGE_2M); - else if (shift == 
mmu_psize_defs[MMU_PAGE_1G].shift) - ap = mmu_get_ap(MMU_PAGE_1G); - else { - WARN(1, "Wrong huge page shift\n"); - return ; - } - radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap); + psize = hstate_get_psize(hstate); + radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize); } /* diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index c33c3f24bad2..a32d8aab2376 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -139,9 +139,10 @@ void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr) EXPORT_SYMBOL(radix__local_flush_tlb_pwc); void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, -
[PATCH V2 12/16] powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
Some archs like ppc64 need to do special things when flushing tlb for hugepage. Add a new helper to flush hugetlb tlb range. This helps us to avoid flushing the entire tlb mapping for the pid. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 ++ arch/powerpc/include/asm/book3s/64/tlbflush.h | 10 ++ arch/powerpc/mm/hugetlbpage-radix.c | 10 ++ mm/hugetlb.c| 10 +- 4 files changed, 31 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h index efb13bbc6df2..91178f0f5ad8 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h @@ -10,6 +10,8 @@ static inline int mmu_get_ap(int psize) return mmu_psize_defs[psize].ap; } +extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end); extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, unsigned long end, int psize); extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 5f322e0ed385..02cd7def893d 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -16,6 +16,16 @@ static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, return hash__flush_tlb_range(vma, start, end); } +#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE +static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma, + unsigned long start, + unsigned long end) +{ + if (radix_enabled()) + return radix__flush_hugetlb_tlb_range(vma, start, end); + return hash__flush_tlb_range(vma, start, end); +} + static inline void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c index 1eca0deaf89b..35254a678456 100644 --- a/arch/powerpc/mm/hugetlbpage-radix.c +++ b/arch/powerpc/mm/hugetlbpage-radix.c @@ -25,6 +25,16 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long v radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize); } +void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + int psize; + struct hstate *hstate = hstate_file(vma->vm_file); + + psize = hstate_get_psize(hstate); + radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize); +} + /* * A vairant of hugetlb_get_unmapped_area doing topdown search * FIXME!! should we do as x86 does or non hugetlb area does ? diff --git a/mm/hugetlb.c b/mm/hugetlb.c index cab0b1861670..3495c519583d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3897,6 +3897,14 @@ same_page: return i ? i : -EFAULT; } +#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE +/* + * ARCHes with special requirements for evicting HUGETLB backing TLB entries can + * implement this. + */ +#define flush_hugetlb_tlb_range(vma, addr, end)flush_tlb_range(vma, addr, end) +#endif + unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long address, unsigned long end, pgprot_t newprot) { @@ -3957,7 +3965,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, * once we release i_mmap_rwsem, another task can do the final put_page * and that page table be reused and filled with junk. 
*/ - flush_tlb_range(vma, start, end); + flush_hugetlb_tlb_range(vma, start, end); mmu_notifier_invalidate_range(mm, start, end); i_mmap_unlock_write(vma->vm_file->f_mapping); mmu_notifier_invalidate_range_end(mm, start, end); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 13/16] powerpc/mm: remove flush_tlb_page_nohash
This should be same as flush_tlb_page except for hash32. For hash32 I guess the existing code is wrong, because we don't seem to be flushing tlb for Hash != 0 case at all. Fix this by switching to calling flush_tlb_page() which does the right thing by flushing tlb for both hash and nohash case with hash32 Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 - arch/powerpc/include/asm/book3s/64/tlbflush.h | 8 arch/powerpc/include/asm/tlbflush.h| 1 - arch/powerpc/mm/pgtable.c | 2 +- arch/powerpc/mm/tlb_hash32.c | 11 --- 5 files changed, 1 insertion(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h index f12ddf5e8de5..2f6373144e2c 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h @@ -75,11 +75,6 @@ static inline void hash__flush_tlb_page(struct vm_area_struct *vma, { } -static inline void hash__flush_tlb_page_nohash(struct vm_area_struct *vma, - unsigned long vmaddr) -{ -} - static inline void hash__flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index 02cd7def893d..7f942c361ea9 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -57,14 +57,6 @@ static inline void local_flush_tlb_page(struct vm_area_struct *vma, return hash__local_flush_tlb_page(vma, vmaddr); } -static inline void flush_tlb_page_nohash(struct vm_area_struct *vma, -unsigned long vmaddr) -{ - if (radix_enabled()) - return radix__flush_tlb_page(vma, vmaddr); - return hash__flush_tlb_page_nohash(vma, vmaddr); -} - static inline void tlb_flush(struct mmu_gather *tlb) { if (radix_enabled()) diff --git a/arch/powerpc/include/asm/tlbflush.h b/arch/powerpc/include/asm/tlbflush.h index 1b38eea28e5a..13dbcd41885e 100644 --- a/arch/powerpc/include/asm/tlbflush.h +++ b/arch/powerpc/include/asm/tlbflush.h @@ -54,7 +54,6 @@ extern void __flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr, #define flush_tlb_page(vma,addr) local_flush_tlb_page(vma,addr) #define __flush_tlb_page(mm,addr,p,i) __local_flush_tlb_page(mm,addr,p,i) #endif -#define flush_tlb_page_nohash(vma,addr)flush_tlb_page(vma,addr) #elif defined(CONFIG_PPC_STD_MMU_32) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 88a307504b5a..0b6fb244d0a1 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -225,7 +225,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, if (!is_vm_hugetlb_page(vma)) assert_pte_locked(vma->vm_mm, address); __ptep_set_access_flags(ptep, entry); - flush_tlb_page_nohash(vma, address); + flush_tlb_page(vma, address); } return changed; } diff --git a/arch/powerpc/mm/tlb_hash32.c b/arch/powerpc/mm/tlb_hash32.c index 558e30cce33e..702d7689d714 100644 --- a/arch/powerpc/mm/tlb_hash32.c +++ b/arch/powerpc/mm/tlb_hash32.c @@ -49,17 +49,6 @@ void flush_hash_entry(struct mm_struct *mm, pte_t *ptep, unsigned long addr) EXPORT_SYMBOL(flush_hash_entry); /* - * Called by ptep_set_access_flags, must flush on CPUs for which the - * DSI handler can't just "fixup" the TLB on a write fault - */ -void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr) -{ - if (Hash != 0) - return; - _tlbie(addr); -} - -/* * Called at the end of a mmu_gather operation to make sure the * TLB flush is 
completely done. */ -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
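The reasoning above is easier to follow next to the hash32 flush_tlb_page() itself, which handles both cases; the removed flush_tlb_page_nohash() returned early whenever Hash != 0 and so flushed nothing on hash hardware. Roughly, from arch/powerpc/mm/tlb_hash32.c of this era (quoted from memory, treat as indicative):

	void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
	{
		struct mm_struct *mm;
		pmd_t *pmd;

		if (Hash == 0) {		/* no hash table: direct tlbie */
			_tlbie(vmaddr);
			return;
		}
		mm = (vmaddr < TASK_SIZE) ? vma->vm_mm : &init_mm;
		pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
		if (!pmd_none(*pmd))		/* evict the HPTE, which also invalidates the TLB entry */
			flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
	}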
[PATCH V2 14/16] powerpc/mm: Cleanup LPCR defines
This makes it easy to verify we are not overloading the bits. No functionality change by this patch. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/reg.h | 54 +- 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 466816ede138..5c4b6f4f8903 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -314,41 +314,41 @@ #define HFSCR_FP __MASK(FSCR_FP_LG) #define SPRN_TAR 0x32f /* Target Address Register */ #define SPRN_LPCR 0x13E /* LPAR Control Register */ -#define LPCR_VPM0(1ul << (63-0)) -#define LPCR_VPM1(1ul << (63-1)) -#define LPCR_ISL (1ul << (63-2)) +#define LPCR_VPM0ASM_CONST(0x8000) +#define LPCR_VPM1ASM_CONST(0x4000) +#define LPCR_ISL ASM_CONST(0x2000) #define LPCR_VC_SH (63-2) #define LPCR_DPFD_SH (63-11) #define LPCR_DPFD(7ul << LPCR_DPFD_SH) #define LPCR_VRMASD (0x1ful << (63-16)) -#define LPCR_VRMA_L (1ul << (63-12)) -#define LPCR_VRMA_LP0(1ul << (63-15)) -#define LPCR_VRMA_LP1(1ul << (63-16)) +#define LPCR_VRMA_L ASM_CONST(0x0008) +#define LPCR_VRMA_LP0ASM_CONST(0x0001) +#define LPCR_VRMA_LP1ASM_CONST(0x8000) #define LPCR_VRMASD_SH (63-16) -#define LPCR_RMLS0x1C00 /* impl dependent rmo limit sel */ +#define LPCR_RMLS 0x1C00 /* impl dependent rmo limit sel */ #define LPCR_RMLS_SH (63-37) -#define LPCR_ILE 0x0200 /* !HV irqs set MSR:LE */ -#define LPCR_AIL 0x0180 /* Alternate interrupt location */ -#define LPCR_AIL_0 0x /* MMU off exception offset 0x0 */ -#define LPCR_AIL_3 0x0180 /* MMU on exception offset 0xc00...4xxx */ -#define LPCR_ONL 0x0004 /* online - PURR/SPURR count */ -#define LPCR_PECE0x0001f000 /* powersave exit cause enable */ -#define LPCR_PECEDP0x0001 /* directed priv dbells cause exit */ -#define LPCR_PECEDH0x8000 /* directed hyp dbells cause exit */ -#define LPCR_PECE0 0x4000 /* ext. exceptions can cause exit */ -#define LPCR_PECE1 0x2000 /* decrementer can cause exit */ -#define LPCR_PECE2 0x1000 /* machine check etc can cause exit */ -#define LPCR_MER 0x0800 /* Mediated External Exception */ +#define LPCR_ILE ASM_CONST(0x0200) /* !HV irqs set MSR:LE */ +#define LPCR_AIL ASM_CONST(0x0180) /* Alternate interrupt location */ +#define LPCR_AIL_0 ASM_CONST(0x) /* MMU off exception offset 0x0 */ +#define LPCR_AIL_3 ASM_CONST(0x0180) /* MMU on exception offset 0xc00...4xxx */ +#define LPCR_ONL ASM_CONST(0x0004) /* online - PURR/SPURR count */ +#define LPCR_PECEASM_CONST(0x0001f000) /* powersave exit cause enable */ +#define LPCR_PECEDP ASM_CONST(0x0001) /* directed priv dbells cause exit */ +#define LPCR_PECEDH ASM_CONST(0x8000) /* directed hyp dbells cause exit */ +#define LPCR_PECE0 ASM_CONST(0x4000) /* ext. 
exceptions can cause exit */ +#define LPCR_PECE1 ASM_CONST(0x2000) /* decrementer can cause exit */ +#define LPCR_PECE2 ASM_CONST(0x1000) /* machine check etc can cause exit */ +#define LPCR_MER ASM_CONST(0x0800) /* Mediated External Exception */ #define LPCR_MER_SH 11 -#define LPCR_TC 0x0200 /* Translation control */ -#define LPCR_LPES0x000c -#define LPCR_LPES0 0x0008 /* LPAR Env selector 0 */ -#define LPCR_LPES1 0x0004 /* LPAR Env selector 1 */ +#define LPCR_TC ASM_CONST(0x0200) /* Translation control */ +#define LPCR_LPES0x000c +#define LPCR_LPES0 ASM_CONST(0x0008) /* LPAR Env selector 0 */ +#define LPCR_LPES1 ASM_CONST(0x0004) /* LPAR Env selector 1 */ #define LPCR_LPES_SH 2 -#define LPCR_RMI 0x0002 /* real mode is cache inhibit */ -#define LPCR_HDICE 0x0001 /* Hyp Decr enable (HV,PR,EE) */ -#define LPCR_UPRT0x0040 /* Use Process Table (ISA 3) */ -#define LPCR_HR 0x0010 +#define LPCR_RMI ASM_CONST(0x0002) /* real mode is cache inhibit */ +#define LPCR_HDICE ASM_CONST(0x0001) /* Hyp Decr enable (HV,PR,EE) */ +#define LPCR_UPRTASM_CONST(0x0040) /* Use Process Table (ISA 3) */ +#define LPCR
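The ASM_CONST() wrapper this cleanup switches to exists so the same literal works in both C and assembly; expressions like (1ul << (63-0)) are C-only. Approximately, from arch/powerpc/include/asm/asm-compat.h (quoted from memory, treat as an approximation):

	#ifdef __ASSEMBLY__
	#define ASM_CONST(x)	x
	#else
	#define __ASM_CONST(x)	x##UL
	#define ASM_CONST(x)	__ASM_CONST(x)
	#endif

Writing out the full 64-bit literals also makes overlapping bit definitions visible by inspection, which is the point of the cleanup.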
[PATCH V2 15/16] powerpc/mm: Switch user slb fault handling to translation enabled
We also handle fault with proper stack initialized. This enable us to callout to C in fault handling routines. We don't do this for kernel mapping, because of the possibility of taking recursive fault if kernel stack in not yet mapped by an slb entry. This enable us to handle Power9 slb fault better. We will add bolted entries for the entire kernel mapping in segment table and user slb entries we take fault and insert on demand. With translation on, we should be able to access segment table from fault handler. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/kernel/exceptions-64s.S | 55 arch/powerpc/mm/slb.c| 11 2 files changed, 61 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index f2bd375b9a4e..2f2c52559ea9 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -794,7 +794,7 @@ data_access_slb_relon_pSeries: mfspr r3,SPRN_DAR mfspr r12,SPRN_SRR1 #ifndef CONFIG_RELOCATABLE - b slb_miss_realmode + b handle_slb_miss_relon #else /* * We can't just use a direct branch to slb_miss_realmode @@ -803,7 +803,7 @@ data_access_slb_relon_pSeries: */ mfctr r11 ld r10,PACAKBASE(r13) - LOAD_HANDLER(r10, slb_miss_realmode) + LOAD_HANDLER(r10, handle_slb_miss_relon) mtctr r10 bctr #endif @@ -819,11 +819,11 @@ instruction_access_slb_relon_pSeries: mfspr r3,SPRN_SRR0/* SRR0 is faulting address */ mfspr r12,SPRN_SRR1 #ifndef CONFIG_RELOCATABLE - b slb_miss_realmode + b handle_slb_miss_relon #else mfctr r11 ld r10,PACAKBASE(r13) - LOAD_HANDLER(r10, slb_miss_realmode) + LOAD_HANDLER(r10, handle_slb_miss_relon) mtctr r10 bctr #endif @@ -961,7 +961,23 @@ h_data_storage_common: bl unknown_exception b ret_from_except +/* r3 point to DAR */ .align 7 + .globl slb_miss_user +slb_miss_user: + std r3,PACA_EXSLB+EX_DAR(r13) + /* Restore r3 as expected by PROLOG_COMMON below */ + ld r3,PACA_EXSLB+EX_R3(r13) + EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB) + RECONCILE_IRQ_STATE(r10, r11) + ld r4,PACA_EXSLB+EX_DAR(r13) + li r5,0x380 + std r4,_DAR(r1) + addir3,r1,STACK_FRAME_OVERHEAD + bl handle_slb_miss + b ret_from_except_lite + +.align 7 .globl instruction_access_common instruction_access_common: EXCEPTION_PROLOG_COMMON(0x400, PACA_EXGEN) @@ -1379,11 +1395,17 @@ unrecover_mce: * We assume we aren't going to take any exceptions during this procedure. */ slb_miss_realmode: - mflrr10 #ifdef CONFIG_RELOCATABLE mtctr r11 #endif + /* +* Handle user slb miss with translation enabled +*/ + cmpdi r3,0 + bge 3f +slb_miss_kernel: + mflrr10 stw r9,PACA_EXSLB+EX_CCR(r13) /* save CR in exc. frame */ std r10,PACA_EXSLB+EX_LR(r13) /* save LR */ @@ -1428,6 +1450,29 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX) mtspr SPRN_SRR1,r10 rfid b . +3: + /* +* Enable IR/DR and handle the fault +*/ + EXCEPTION_PROLOG_PSERIES_1(slb_miss_user, EXC_STD) + /* +* handler with relocation on +*/ +handle_slb_miss_relon: +#ifdef CONFIG_RELOCATABLE + mtctr r11 +#endif + /* +* Handle user slb miss with stack initialized. 
+*/ + cmpdi r3,0 + bge 4f + /* +* go back to slb_miss_realmode +*/ + b slb_miss_kernel +4: + EXCEPTION_RELON_PROLOG_PSERIES_1(slb_miss_user, EXC_STD) unrecov_slb: EXCEPTION_PROLOG_COMMON(0x4100, PACA_EXSLB) diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c index 48fc28bab544..b18d7df5601d 100644 --- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -25,6 +25,8 @@ #include #include +#include + enum slb_index { LINEAR_INDEX= 0, /* Kernel linear map (0xc000) */ VMALLOC_INDEX = 1, /* Kernel virtual map (0xd000) */ @@ -346,3 +348,12 @@ void slb_initialize(void) asm volatile("isync":::"memory"); } + +void handle_slb_miss(struct pt_regs *regs, +unsigned long address, unsigned long trap) +{ + enum ctx_state prev_state = exception_enter(); + + slb_allocate(address); + exception_exit(prev_state); +} -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] powerpc/nohash: Fix build break with 4K pages
Michael Ellerman writes: > Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are > allocating fragments" renamed page_table_free() to pte_fragment_free(). > One occurrence was mistyped as pte_fragment_fre(). > > This only breaks the nohash 4K page build, which is not the default or > enabled in any defconfig. Can you share the .config. I will add it to the build test. > > Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are > allocating fragments") > Signed-off-by: Michael Ellerman Reviewed-by: Aneesh Kumar K.V > --- > arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h > b/arch/powerpc/include/asm/nohash/64/pgalloc.h > index 0c12a3bfe2ab..069369f6414b 100644 > --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h > +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h > @@ -172,7 +172,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct > *mm, > > static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte) > { > - pte_fragment_fre((unsigned long *)pte, 1); > + pte_fragment_free((unsigned long *)pte, 1); > } > > static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage) > -- > 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V2 16/16] powerpc/mm: Support segment table for Power9
"Aneesh Kumar K.V" writes: > PowerISA 3.0 adds an in memory table for storing segment translation > information. In this mode, which is enabled by setting both HOST RADIX > and GUEST RADIX bits in partition table to 0 and enabling UPRT to > 1, we have a per process segment table. The segment table details > are stored in the process table indexed by PID value. > > Segment table mode also requires us to map the process table at the > beginning of a 1TB segment. > > On the linux kernel side we enable this model if we find that > the radix is explicitily disabled by setting the ibm,pa-feature radix > bit (byte 40 bit 0) set to 0. If the size of ibm,pa-feature node is less > than 40 bytes, we enable the legacy HPT mode using SLB. If radix bit > is set to 1, we use the radix mode. Missed updating the commit message. On the linux kernel side we enable this model if we find hash mmu bit (byte 58 bit 0) of ibm,pa-feature device tree node set to 1. If the size of ibm,pa-feature node is less than 58 bytes or if the hash mmu bit is set to 0, we enable the legacy HPT mode using SLB. If radix bit (byte 40 bit 0) is set to 1, we use the radix mode. > > With respect to SLB mapping, we bolt mapp the entire kernel range and > and only handle user space segment fault. > -aneesh ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 16/16] powerpc/mm: Support segment table for Power9
PowerISA 3.0 adds an in memory table for storing segment translation information. In this mode, which is enabled by setting both HOST RADIX and GUEST RADIX bits in partition table to 0 and enabling UPRT to 1, we have a per process segment table. The segment table details are stored in the process table indexed by PID value. Segment table mode also requires us to map the process table at the beginning of a 1TB segment. On the linux kernel side we enable this model if we find that the radix is explicitily disabled by setting the ibm,pa-feature radix bit (byte 40 bit 0) set to 0. If the size of ibm,pa-feature node is less than 40 bytes, we enable the legacy HPT mode using SLB. If radix bit is set to 1, we use the radix mode. With respect to SLB mapping, we bolt mapp the entire kernel range and and only handle user space segment fault. We also have access to 4 SLB register in software. So we continue to use 3 of that for bolted kernel SLB entries as we use them currently. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hash.h | 10 + arch/powerpc/include/asm/book3s/64/mmu-hash.h | 17 ++ arch/powerpc/include/asm/book3s/64/mmu.h | 4 + arch/powerpc/include/asm/mmu.h| 6 +- arch/powerpc/include/asm/mmu_context.h| 5 +- arch/powerpc/kernel/prom.c| 1 + arch/powerpc/mm/hash_utils_64.c | 86 ++- arch/powerpc/mm/mmu_context_book3s64.c| 32 ++- arch/powerpc/mm/slb.c | 350 +- 9 files changed, 493 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h index f61cad3de4e6..5f0deeda7884 100644 --- a/arch/powerpc/include/asm/book3s/64/hash.h +++ b/arch/powerpc/include/asm/book3s/64/hash.h @@ -58,6 +58,16 @@ #define H_VMALLOC_END (H_VMALLOC_START + H_VMALLOC_SIZE) /* + * Process table with ISA 3.0 need to be mapped at the beginning of a 1TB segment + * We put that in the top of VMALLOC region. For each region we can go upto 64TB + * for now. Hence we have space to put process table there. We should not get + * an SLB miss for this address, because the VSID for this is placed in the + * partition table. 
+ */ +#define H_SEG_PROC_TBL_START ASM_CONST(0xD0002000) +#define H_SEG_PROC_TBL_END ASM_CONST(0xD00020ff) + +/* * Region IDs */ #define REGION_SHIFT 60UL diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index b042e5f9a428..5f9ee699da5f 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -102,6 +102,18 @@ #define HPTE_V_1TB_SEG ASM_CONST(0x4000) #define HPTE_V_VRMA_MASK ASM_CONST(0x4001ff00) +/* segment table entry masks/bits */ +/* Upper 64 bit */ +#define STE_VALID ASM_CONST(0x800) +/* + * lower 64 bit + * 64th bit become 0 bit + */ +/* + * Software defined bolted bit + */ +#define STE_BOLTED ASM_CONST(0x1) + /* Values for PP (assumes Ks=0, Kp=1) */ #define PP_RWXX0 /* Supervisor read/write, User none */ #define PP_RWRX 1 /* Supervisor read/write, User read */ @@ -129,6 +141,11 @@ struct hash_pte { __be64 r; }; +struct seg_entry { + __be64 ste_e; + __be64 ste_v; +}; + extern struct hash_pte *htab_address; extern unsigned long htab_size_bytes; extern unsigned long htab_hash_mask; diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 6d8306d9aa7a..b7514f19863f 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -65,6 +65,7 @@ extern struct patb_entry *partition_tb; */ #define PATB_SIZE_SHIFT16 +extern unsigned long segment_table_initialize(struct prtb_entry *prtb); typedef unsigned long mm_context_id_t; struct spinlock; @@ -94,6 +95,9 @@ typedef struct { #ifdef CONFIG_SPAPR_TCE_IOMMU struct list_head iommu_group_mem_list; #endif + unsigned long seg_table; + struct spinlock *seg_tbl_lock; + } mm_context_t; /* diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index 21b71469e66b..97446e8cc101 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -24,6 +24,10 @@ * Radix page table available */ #define MMU_FTR_TYPE_RADIX ASM_CONST(0x0040) + +/* Seg table only supported for book3s 64 */ +#define MMU_FTR_TYPE_SEG_TABLE ASM_CONST(0x0080) + /* * individual features */ @@ -130,7 +134,7 @@ enum { MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE | MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA | #ifdef CONFIG_PPC_RADIX_MMU - MMU_FTR_TYPE_RADIX | + MMU_FTR_TYPE_RADIX | MMU_FTR_TYPE_SEG_TABLE | #endif 0, }; diff --git a/arch/powerp
Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Darren Stevens writes:

> Hello Christian
>
> That's not where I ended up with my bisect, this commit is about 10 before the
> one I found to be bad, which is:
>
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V
> Date: Fri Apr 29 23:26:21 2016 +1000
>
>     powerpc/mm: vmalloc abstraction in preparation for radix
>
>     The vmalloc range differs between hash and radix config. Hence make
>     VMALLOC_START and related constants a variable which will be runtime
>     initialized depending on whether hash or radix mode is active.
>
>     Signed-off-by: Aneesh Kumar K.V
>     [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
>     Signed-off-by: Michael Ellerman

Can you check the value of ISA_IO_BASE at the point where you use it? If you
use it too early, you will see a wrong (uninitialized) value. In the latest
kernel it is backed by a variable that is initialized in
hash__early_init_mmu().

-aneesh
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
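A minimal sketch of the ordering pitfall being described, with invented names standing in for the real variables behind ISA_IO_BASE:

	/* Invented names, illustration only. */
	static unsigned long io_region_base;	/* stands in for the variable behind ISA_IO_BASE */

	static void example_early_init_mmu(void)	/* stands in for hash__early_init_mmu() */
	{
		io_region_base = 0xd000080000000000ul;	/* value chosen only for illustration */
	}

	static void example_early_pci_setup(void)
	{
		/* BUG if this runs before example_early_init_mmu():
		 * io_region_base is still 0 and the I/O space gets mapped
		 * at the wrong virtual address. */
		void *va = (void *)io_region_base;
		(void)va;
	}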
Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE
Hi Aneesh,

We use it only in the file "pci-common.c". Part of the Nemo patch with
ISA_IO_BASE:

diff -rupN linux-4.7/arch/powerpc/kernel/pci-common.c linux-4.7-nemo/arch/powerpc/kernel/pci-common.c
--- linux-4.7/arch/powerpc/kernel/pci-common.c	2016-05-20 10:23:06.588299920 +0200
+++ linux-4.7-nemo/arch/powerpc/kernel/pci-common.c	2016-05-20 10:21:28.652296699 +0200
@@ -723,6 +723,19 @@ void pci_process_bridge_OF_ranges(struct
 	isa_io_base = (unsigned long)hose->io_base_virt;
 #endif /* CONFIG_PPC32 */
 
+
+#ifdef CONFIG_PPC_PASEMI_SB600
+	/* Workaround for lack of device tree. New for kernel 3.17:
+	   range.cpu_addr instead of cpu_addr and range.size instead of size
+	   Ch. Zigotzky */
+	if (primary) {
+		__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
+			range.size, pgprot_val(pgprot_noncached(__pgprot(0))));
+		hose->io_base_virt = (void *)_IO_BASE;
+		/* _IO_BASE needs unsigned long long for the kernel 3.17 Ch. Zigotzky */
+		printk("Initialised io_base_virt 0x%lx _IO_BASE 0x%llx\n",
+			(unsigned long)hose->io_base_virt, (unsigned long long)_IO_BASE);
+}
+#endif
+

Cheers,
Christian

On 08 June 2016 at 5:11 PM, Aneesh Kumar K.V wrote:

> Can you check the value of ISA_IO_BASE at the point where you use it?
> If you use it too early, you will see a wrong (uninitialized) value.
> In the latest kernel it is backed by a variable that is initialized in
> hash__early_init_mmu().
>
> -aneesh

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v6 00/11] powerpc/powernv/cpuidle: Add support for POWER ISA v3 idle states
POWER ISA v3 defines a new idle processor core mechanism. In summary, a) new instruction named stop is added. This instruction replaces instructions like nap, sleep, rvwinkle. b) new per thread SPR named PSSCR is added which controls the behavior of stop instruction. PSSCR has following key fields Bits 0:3 - Power-Saving Level Status. This field indicates the lowest power-saving state the thread entered since stop instruction was last executed. Bit 42 - Enable State Loss 0 - No state is lost irrespective of other fields 1 - Allows state loss Bits 44:47 - Power-Saving Level Limit This limits the power-saving level that can be entered into. Bits 60:63 - Requested Level Used to specify which power-saving level must be entered on executing stop instruction Stop idle states and their properties like name, latency, target residency, psscr value are exposed via device tree. This patch series adds support for this new mechanism. Patches 1-7 are cleanups and code movement. Patch 8 adds platform specific support for stop and psscr handling. Patch 9 is a minor cleanup in cpuidle driver. Patch 10 adds cpuidle driver support. Patch 11 makes offlined cpu use deepest stop state. Note: Documentation for the device tree bindings is posted here- http://patchwork.ozlabs.org/patch/629125/ Changes in v6 = - Restore new POWER ISA v3 SPRS when waking up from deep idle Changes in v5 = - Use generic cpuidle constant CPUIDLE_NAME_LEN - Fix return code handling for of_property_read_string_array - Use DT flags to determine if are using stop instruction, instead of cpu_has_feature - Removed uncessary cast with names - &stop_loop -> stop_loop - Added POWERNV_THRESHOLD_LATENCY_NS to filter out idle states with high latency Changes in v4 = - Added a patch to use PNV_THREAD_WINKLE macro while requesting for winkle - Moved power7_powersave_common rename to more appropriate patch - renaming power7_enter_nap_mode to pnv_enter_arch207_idle_mode - Added PSSCR layout to Patch 7's commit message - Improved / Fixed comments - Fixed whitespace error in paca.h - Using MAX_POSSIBLE_STOP_STATE macro instead of hardcoding 0xF has max possible stop state Changes in v3 = - Rebased on powerpc-next - Dropping patch 1 since we are not adding a new file for P9 idle support - Improved comments in multiple places - Moved GET_PACA from power7_restore_hyp_resource to System Reset - Instead of moving few functions from idle_power7 to idle_power_common, renaming idle_power7.S to idle_power_common.S - Moved HSTATE_HWTHREAD_STATE updation to power_powersave_common - Dropped earlier patch 5 which moved few macros from idle_power_common to asm/cpuidle.h. - Added a patch to rename reusable power7_* idle functions to pnv_* - Added new patch that creates abstraction for saving SPRs before entering deep idle states - Instead of introducing new file idle_power_stop.S, P9 idle support is added to idle_power_common.S using CPU_FTR sections. - Fixed r4 reg clobbering in power_stop0 Changes in v2 = - Rebased on v4.6-rc6 - Using CPU_FTR_ARCH_300 bit instead of CPU_FTR_STOP_INST Cc: Rafael J. Wysocki Cc: Daniel Lezcano Cc: linux...@vger.kernel.org Cc: Benjamin Herrenschmidt Cc: Michael Ellerman Cc: Paul Mackerras Cc: Michael Neuling Cc: linuxppc-dev@lists.ozlabs.org Cc: Rob Herring Cc: Lorenzo Pieralisi Shreyas B. 
Prabhu (11): powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for winkle powerpc/kvm: make hypervisor state restore a function powerpc/powernv: Rename idle_power7.S to idle_power_common.S powerpc/powernv: Rename reusable idle functions to hardware agnostic names powerpc/powernv: Make pnv_powersave_common more generic powerpc/powernv: abstraction for saving SPRs before entering deep idle states powerpc/powernv: set power_save func after the idle states are initialized powerpc/powernv: Add platform support for stop instruction cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES cpuidle/powernv: Add support for POWER ISA v3 idle states powerpc/powernv: Use deepest stop state when cpu is offlined arch/powerpc/include/asm/cpuidle.h| 2 + arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +- arch/powerpc/include/asm/machdep.h| 1 + arch/powerpc/include/asm/opal-api.h | 11 +- arch/powerpc/include/asm/paca.h | 2 + arch/powerpc/include/asm/ppc-opcode.h | 4 + arch/powerpc/include/asm/processor.h | 1 + arch/powerpc/include/asm/reg.h| 14 + arch/powerpc/kernel/Makefile | 2 +- arch/powerpc/kernel/asm-offsets.c | 2 + arch/powerpc/kernel/exceptions-64s.
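For readers working through the PSSCR description in the cover letter above, the stated layout can be written out as masks. Illustration only, derived purely from the quoted bit positions (IBM big-endian numbering, bit 0 = MSB of the 64-bit SPR); these are not the kernel's definitions:

	#define EX_PSSCR_PLS_SHIFT	(63 - 3)		/* bits 0:3   Power-Saving Level Status */
	#define EX_PSSCR_ESL		(1UL << (63 - 42))	/* bit 42     Enable State Loss */
	#define EX_PSSCR_PSLL_SHIFT	(63 - 47)		/* bits 44:47 Power-Saving Level Limit */
	#define EX_PSSCR_RL_MASK	0xfUL			/* bits 60:63 Requested Level */

	/* e.g. a stop request for level 5 with state loss allowed up to level 5: */
	unsigned long psscr = EX_PSSCR_ESL
			    | (5UL << EX_PSSCR_PSLL_SHIFT)
			    | (5UL & EX_PSSCR_RL_MASK);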
[PATCH v6 02/11] powerpc/kvm: make hypervisor state restore a function
In the current code, when the thread wakes up in reset vector, some of the state restore code and check for whether a thread needs to branch to kvm is duplicated. Reorder the code such that this duplication is avoided. At a higher level this is what the change looks like- Before this patch - power7_wakeup_tb_loss: restore hypervisor state if (thread needed by kvm) goto kvm_start_guest restore nvgprs, cr, pc rfid to process context power7_wakeup_loss: restore nvgprs, cr, pc rfid to process context reset vector: if (waking from deep idle states) goto power7_wakeup_tb_loss else if (thread needed by kvm) goto kvm_start_guest goto power7_wakeup_loss After this patch - power7_wakeup_tb_loss: restore hypervisor state return power7_restore_hyp_resource(): if (waking from deep idle states) goto power7_wakeup_tb_loss return power7_wakeup_loss: restore nvgprs, cr, pc rfid to process context reset vector: power7_restore_hyp_resource() if (thread needed by kvm) goto kvm_start_guest goto power7_wakeup_loss Reviewed-by: Paul Mackerras Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v3 Changes in v3: = - Retaining GET_PACA(r13) in System Reset vector instead of moving it to power7_restore_hyp_resource - Added comments indicating entry conditions for power7_restore_hyp_resource - Improved comments around return statements arch/powerpc/kernel/exceptions-64s.S | 28 ++ arch/powerpc/kernel/idle_power7.S| 72 +--- 2 files changed, 46 insertions(+), 54 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 4c94406..4a74d6a 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -107,25 +107,9 @@ BEGIN_FTR_SECTION beq 9f cmpwi cr3,r13,2 - - /* -* Check if last bit of HSPGR0 is set. This indicates whether we are -* waking up from winkle. -*/ GET_PACA(r13) - clrldi r5,r13,63 - clrrdi r13,r13,1 - cmpwi cr4,r5,1 - mtspr SPRN_HSPRG0,r13 + bl power7_restore_hyp_resource - lbz r0,PACA_THREAD_IDLE_STATE(r13) - cmpwi cr2,r0,PNV_THREAD_NAP - bgt cr2,8f /* Either sleep or Winkle */ - - /* Waking up from nap should not cause hypervisor state loss */ - bgt cr3,. - - /* Waking up from nap */ li r0,PNV_THREAD_RUNNING stb r0,PACA_THREAD_IDLE_STATE(r13) /* Clear thread state */ @@ -143,13 +127,9 @@ BEGIN_FTR_SECTION /* Return SRR1 from power7_nap() */ mfspr r3,SPRN_SRR1 - beq cr3,2f - b power7_wakeup_noloss -2: b power7_wakeup_loss - - /* Fast Sleep wakeup on PowerNV */ -8: GET_PACA(r13) - b power7_wakeup_tb_loss + blt cr3,2f + b power7_wakeup_loss +2: b power7_wakeup_noloss 9: END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S index 705c867..d5def06 100644 --- a/arch/powerpc/kernel/idle_power7.S +++ b/arch/powerpc/kernel/idle_power7.S @@ -276,6 +276,39 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \ 20:nop; +/* + * Called from reset vector. Check whether we have woken up with + * hypervisor state loss. If yes, restore hypervisor state and return + * back to reset vector. + * + * r13 - Contents of HSPRG0 + * cr3 - set to gt if waking up with partial/complete hypervisor state loss + */ +_GLOBAL(power7_restore_hyp_resource) + /* +* Check if last bit of HSPGR0 is set. This indicates whether we are +* waking up from winkle. 
+*/ + clrldi r5,r13,63 + clrrdi r13,r13,1 + cmpwi cr4,r5,1 + mtspr SPRN_HSPRG0,r13 + + lbz r0,PACA_THREAD_IDLE_STATE(r13) + cmpwi cr2,r0,PNV_THREAD_NAP + bgt cr2,power7_wakeup_tb_loss /* Either sleep or Winkle */ + + /* +* We fall through here if PACA_THREAD_IDLE_STATE shows we are waking +* up from nap. At this stage CR3 shouldn't contains 'gt' since that +* indicates we are waking with hypervisor state loss from nap. +*/ + bgt cr3,. + + blr /* Return back to System Reset vector from where + power7_restore_hyp_resource was invoked */ + + _GLOBAL(power7_wakeup_tb_loss) ld r2,PACATOC(r13); ld r1,PACAR1(r13) @@ -284,11 +317,13 @@ _GLOBAL(power7_wakeup_tb_loss) * and they are restored before switching to the process context. Hence
[PATCH v6 04/11] powerpc/powernv: Rename reusable idle functions to hardware agnostic names
Functions like power7_wakeup_loss, power7_wakeup_noloss, power7_wakeup_tb_loss are used by POWER7 and POWER8 hardware. They can also be used by POWER9. Hence rename these functions hardware agnostic names. Suggested-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v4 Changes in v4: == - renaming power7_powersave_common to pnv_powersave_common - renaming power7_enter_nap_mode to pnv_enter_arch207_idle_mode arch/powerpc/kernel/exceptions-64s.S| 8 arch/powerpc/kernel/idle_power_common.S | 33 + arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 ++-- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 4a74d6a..2a123cd 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -108,7 +108,7 @@ BEGIN_FTR_SECTION cmpwi cr3,r13,2 GET_PACA(r13) - bl power7_restore_hyp_resource + bl pnv_restore_hyp_resource li r0,PNV_THREAD_RUNNING stb r0,PACA_THREAD_IDLE_STATE(r13) /* Clear thread state */ @@ -128,8 +128,8 @@ BEGIN_FTR_SECTION /* Return SRR1 from power7_nap() */ mfspr r3,SPRN_SRR1 blt cr3,2f - b power7_wakeup_loss -2: b power7_wakeup_noloss + b pnv_wakeup_loss +2: b pnv_wakeup_noloss 9: END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) @@ -1269,7 +1269,7 @@ machine_check_handle_early: GET_PACA(r13) ld r1,PACAR1(r13) li r3,PNV_THREAD_NAP - b power7_enter_nap_mode + b pnv_enter_arch207_idle_mode 4: #endif /* diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S index d5def06..34dbfc9 100644 --- a/arch/powerpc/kernel/idle_power_common.S +++ b/arch/powerpc/kernel/idle_power_common.S @@ -1,5 +1,6 @@ /* - * This file contains the power_save function for Power7 CPUs. + * This file contains idle entry/exit functions for POWER7 and + * POWER8 CPUs. * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -75,7 +76,7 @@ core_idle_lock_held: * 0 - don't check * 1 - check */ -_GLOBAL(power7_powersave_common) +_GLOBAL(pnv_powersave_common) /* Use r3 to pass state nap/sleep/winkle */ /* NAP is a state loss, we create a regs frame on the * stack, fill it up with the state we care about and @@ -135,14 +136,14 @@ _GLOBAL(power7_powersave_common) LOAD_REG_IMMEDIATE(r5, MSR_IDLE) li r6, MSR_RI andcr6, r9, r6 - LOAD_REG_ADDR(r7, power7_enter_nap_mode) + LOAD_REG_ADDR(r7, pnv_enter_arch207_idle_mode) mtmsrd r6, 1 /* clear RI before setting SRR0/1 */ mtspr SPRN_SRR0, r7 mtspr SPRN_SRR1, r5 rfid - .globl power7_enter_nap_mode -power7_enter_nap_mode: + .globl pnv_enter_arch207_idle_mode +pnv_enter_arch207_idle_mode: #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE /* Tell KVM we're napping */ li r4,KVM_HWTHREAD_IN_NAP @@ -242,19 +243,19 @@ _GLOBAL(power7_idle) _GLOBAL(power7_nap) mr r4,r3 li r3,PNV_THREAD_NAP - b power7_powersave_common + b pnv_powersave_common /* No return */ _GLOBAL(power7_sleep) li r3,PNV_THREAD_SLEEP li r4,1 - b power7_powersave_common + b pnv_powersave_common /* No return */ _GLOBAL(power7_winkle) li r3,PNV_THREAD_WINKLE li r4,1 - b power7_powersave_common + b pnv_powersave_common /* No return */ #define CHECK_HMI_INTERRUPT\ @@ -284,7 +285,7 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \ * r13 - Contents of HSPRG0 * cr3 - set to gt if waking up with partial/complete hypervisor state loss */ -_GLOBAL(power7_restore_hyp_resource) +_GLOBAL(pnv_restore_hyp_resource) /* * Check if last bit of HSPGR0 is set. 
This indicates whether we are * waking up from winkle. @@ -296,7 +297,7 @@ _GLOBAL(power7_restore_hyp_resource) lbz r0,PACA_THREAD_IDLE_STATE(r13) cmpwi cr2,r0,PNV_THREAD_NAP - bgt cr2,power7_wakeup_tb_loss /* Either sleep or Winkle */ + bgt cr2,pnv_wakeup_tb_loss /* Either sleep or Winkle */ /* * We fall through here if PACA_THREAD_IDLE_STATE shows we are waking @@ -306,10 +307,10 @@ _GLOBAL(power7_restore_hyp_resource) bgt cr3,. blr /* Return back to System Reset vector from where - power7_restore_hyp_resource was invoked */ + pnv_restore_hyp_resource was invoked */ -_GLOBAL(power7_wake
[PATCH v6 03/11] powerpc/powernv: Rename idle_power7.S to idle_power_common.S
idle_power7.S handles idle entry/exit for POWER7, POWER8 and in next patch for POWER9. Rename the file to a non-hardware specific name. Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v3 Changes in v3: == - Instead of moving few common functions from idle_power7.S to idle_power_common.S, renaming idle_power7.S to idle_power_common.S arch/powerpc/kernel/Makefile| 2 +- arch/powerpc/kernel/idle_power7.S | 527 arch/powerpc/kernel/idle_power_common.S | 527 3 files changed, 528 insertions(+), 528 deletions(-) delete mode 100644 arch/powerpc/kernel/idle_power7.S create mode 100644 arch/powerpc/kernel/idle_power_common.S diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 2da380f..99116da 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -47,7 +47,7 @@ obj-$(CONFIG_PPC_BOOK3E_64) += exceptions-64e.o idle_book3e.o obj-$(CONFIG_PPC64)+= vdso64/ obj-$(CONFIG_ALTIVEC) += vecemu.o obj-$(CONFIG_PPC_970_NAP) += idle_power4.o -obj-$(CONFIG_PPC_P7_NAP) += idle_power7.o +obj-$(CONFIG_PPC_P7_NAP) += idle_power_common.o procfs-y := proc_powerpc.o obj-$(CONFIG_PROC_FS) += $(procfs-y) rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI) := rtas_pci.o diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S deleted file mode 100644 index d5def06..000 --- a/arch/powerpc/kernel/idle_power7.S +++ /dev/null @@ -1,527 +0,0 @@ -/* - * This file contains the power_save function for Power7 CPUs. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#undef DEBUG - -/* - * Use unused space in the interrupt stack to save and restore - * registers for winkle support. - */ -#define _SDR1 GPR3 -#define _RPR GPR4 -#define _SPURR GPR5 -#define _PURR GPR6 -#define _TSCR GPR7 -#define _DSCR GPR8 -#define _AMOR GPR9 -#define _WORT GPR10 -#define _WORC GPR11 - -/* Idle state entry routines */ - -#defineIDLE_STATE_ENTER_SEQ(IDLE_INST) \ - /* Magic NAP/SLEEP/WINKLE mode enter sequence */\ - std r0,0(r1); \ - ptesync;\ - ld r0,0(r1); \ -1: cmp cr0,r0,r0; \ - bne 1b; \ - IDLE_INST; \ - b . - - .text - -/* - * Used by threads when the lock bit of core_idle_state is set. - * Threads will spin in HMT_LOW until the lock bit is cleared. - * r14 - pointer to core_idle_state - * r15 - used to load contents of core_idle_state - */ - -core_idle_lock_held: - HMT_LOW -3: lwz r15,0(r14) - andi. r15,r15,PNV_CORE_IDLE_LOCK_BIT - bne 3b - HMT_MEDIUM - lwarx r15,0,r14 - blr - -/* - * Pass requested state in r3: - * r3 - PNV_THREAD_NAP/SLEEP/WINKLE - * - * To check IRQ_HAPPENED in r4 - * 0 - don't check - * 1 - check - */ -_GLOBAL(power7_powersave_common) - /* Use r3 to pass state nap/sleep/winkle */ - /* NAP is a state loss, we create a regs frame on the -* stack, fill it up with the state we care about and -* stick a pointer to it in PACAR1. We really only -* need to save PC, some CR bits and the NV GPRs, -* but for now an interrupt frame will do. 
-*/ - mflrr0 - std r0,16(r1) - stdur1,-INT_FRAME_SIZE(r1) - std r0,_LINK(r1) - std r0,_NIP(r1) - - /* Hard disable interrupts */ - mfmsr r9 - rldicl r9,r9,48,1 - rotldi r9,r9,16 - mtmsrd r9,1/* hard-disable interrupts */ - - /* Check if something happened while soft-disabled */ - lbz r0,PACAIRQHAPPENED(r13) - andi. r0,r0,~PACA_IRQ_HARD_DIS@l - beq 1f - cmpwi cr0,r4,0 - beq 1f - addir1,r1,INT_FRAME_SIZE - ld r0,16(r1) - li r3,0/* Return 0 (no nap) */ - mtlrr0 - blr - -1: /* We mark irqs hard disabled as this is the state we'll -* be in when returning and we need to tell arch_local_irq_restore() -* about it -*/ - li r0,PACA_IRQ_HARD_DIS - stb r0,PACAIRQHAPPENED(r13) - - /* We haven't lost state ... yet */ -
[PATCH v6 01/11] powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for winkle
Signed-off-by: Shreyas B. Prabhu --- -No changes since v4 Changes in v4 = - New in v4 arch/powerpc/kernel/idle_power7.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S index 470ceeb..705c867 100644 --- a/arch/powerpc/kernel/idle_power7.S +++ b/arch/powerpc/kernel/idle_power7.S @@ -252,7 +252,7 @@ _GLOBAL(power7_sleep) /* No return */ _GLOBAL(power7_winkle) - li r3,3 + li r3,PNV_THREAD_WINKLE li r4,1 b power7_powersave_common /* No return */ -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
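For context, the PNV_THREAD_* encodings this patch substitutes for the bare constant live in arch/powerpc/include/asm/cpuidle.h; a minimal sketch of the relevant definitions (values as assumed from the surrounding series, so treat them as illustrative):

	/* Sketch of arch/powerpc/include/asm/cpuidle.h (assumed values) */
	#define PNV_THREAD_RUNNING	0
	#define PNV_THREAD_NAP		1
	#define PNV_THREAD_SLEEP	2
	#define PNV_THREAD_WINKLE	3

Using the named constant keeps the state requested in r3 consistent with the PNV_THREAD_* values the powersave path later stores into the paca (PACA_THREAD_IDLE_STATE).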
[PATCH v6 07/11] powerpc/powernv: set power_save func after the idle states are initialized
pnv_init_idle_states discovers supported idle states from the device tree and does the required initialization. Set power_save function pointer only after this initialization is done Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v1 arch/powerpc/platforms/powernv/idle.c | 3 +++ arch/powerpc/platforms/powernv/setup.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c index fcc8b68..fbb09fb 100644 --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -285,6 +285,9 @@ static int __init pnv_init_idle_states(void) } pnv_alloc_idle_core_states(); + + if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED) + ppc_md.power_save = power7_idle; out_free: kfree(flags); out: diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index ee6430b..8492bbb 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -315,7 +315,7 @@ define_machine(powernv) { .get_proc_freq = pnv_get_proc_freq, .progress = pnv_progress, .machine_shutdown = pnv_shutdown, - .power_save = power7_idle, + .power_save = NULL, .calibrate_decr = generic_calibrate_decr, #ifdef CONFIG_KEXEC .kexec_cpu_down = pnv_kexec_cpu_down, -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
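To see why the ordering matters, here is a sketch of the generic idle path (illustrative only, not the literal arch_cpu_idle()): the power_save hook is only called once it is non-NULL, so deferring the assignment until pnv_init_idle_states() has run guarantees no CPU naps before the idle-state bookkeeping exists.

	/*
	 * Sketch only: the idle loop spins until ppc_md.power_save has
	 * been installed by pnv_init_idle_states().
	 */
	void arch_cpu_idle_sketch(void)
	{
		if (ppc_md.power_save)
			ppc_md.power_save();	/* e.g. power7_idle */
		else
			cpu_relax();		/* snooze until init completes */
	}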
[PATCH v6 06/11] powerpc/powernv: abstraction for saving SPRs before entering deep idle states
Create a function for saving SPRs before entering deep idle states. This function can be reused for POWER9 deep idle states. Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v3 Changes in v3: = - Newly added in v3 arch/powerpc/kernel/idle_power_common.S | 54 +++-- 1 file changed, 32 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S index a8397e3..2f909a1 100644 --- a/arch/powerpc/kernel/idle_power_common.S +++ b/arch/powerpc/kernel/idle_power_common.S @@ -53,6 +53,36 @@ .text /* + * Used by threads before entering deep idle states. Saves SPRs + * in interrupt stack frame + */ +save_sprs_to_stack: + /* +* Note all register i.e per-core, per-subcore or per-thread is saved +* here since any thread in the core might wake up first +*/ + mfspr r3,SPRN_SDR1 + std r3,_SDR1(r1) + mfspr r3,SPRN_RPR + std r3,_RPR(r1) + mfspr r3,SPRN_SPURR + std r3,_SPURR(r1) + mfspr r3,SPRN_PURR + std r3,_PURR(r1) + mfspr r3,SPRN_TSCR + std r3,_TSCR(r1) + mfspr r3,SPRN_DSCR + std r3,_DSCR(r1) + mfspr r3,SPRN_AMOR + std r3,_AMOR(r1) + mfspr r3,SPRN_WORT + std r3,_WORT(r1) + mfspr r3,SPRN_WORC + std r3,_WORC(r1) + + blr + +/* * Used by threads when the lock bit of core_idle_state is set. * Threads will spin in HMT_LOW until the lock bit is cleared. * r14 - pointer to core_idle_state @@ -209,28 +239,8 @@ fastsleep_workaround_at_entry: b common_enter enter_winkle: - /* -* Note all register i.e per-core, per-subcore or per-thread is saved -* here since any thread in the core might wake up first -*/ - mfspr r3,SPRN_SDR1 - std r3,_SDR1(r1) - mfspr r3,SPRN_RPR - std r3,_RPR(r1) - mfspr r3,SPRN_SPURR - std r3,_SPURR(r1) - mfspr r3,SPRN_PURR - std r3,_PURR(r1) - mfspr r3,SPRN_TSCR - std r3,_TSCR(r1) - mfspr r3,SPRN_DSCR - std r3,_DSCR(r1) - mfspr r3,SPRN_AMOR - std r3,_AMOR(r1) - mfspr r3,SPRN_WORT - std r3,_WORT(r1) - mfspr r3,SPRN_WORC - std r3,_WORC(r1) + bl save_sprs_to_stack + IDLE_STATE_ENTER_SEQ(PPC_WINKLE) _GLOBAL(power7_idle) -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v6 08/11] powerpc/powernv: Add platform support for stop instruction
POWER ISA v3 defines a new idle processor core mechanism. In summary, a) new instruction named stop is added. This instruction replaces instructions like nap, sleep, rvwinkle. b) new per thread SPR named Processor Stop Status and Control Register (PSSCR) is added which controls the behavior of stop instruction. PSSCR layout:

----------------------------------------------------------
| PLS | /// | SD | ESL | EC | PSLL | /// | TR | MTL | RL |
----------------------------------------------------------
0      4    41   42    43   44     48    54   56    60

PSSCR key fields:
Bits 0:3 - Power-Saving Level Status. This field indicates the lowest power-saving state the thread entered since stop instruction was last executed.
Bit 42 - Enable State Loss: 0 - No state is lost irrespective of other fields; 1 - Allows state loss.
Bits 44:47 - Power-Saving Level Limit. This limits the power-saving level that can be entered into.
Bits 60:63 - Requested Level. Used to specify which power-saving level must be entered on executing stop instruction.

This patch adds support for stop instruction and PSSCR handling. Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- Changes in v6 = - Save/restore new P9 SPRs when using deep idle states Changes in v4: == - Added PSSCR layout to commit message - Improved / Fixed comments - Fixed whitespace error in paca.h - Using MAX_POSSIBLE_STOP_STATE macro instead of hardcoding 0xF as max possible stop state Changes in v3: == - Instead of introducing new file idle_power_stop.S, P9 idle support is added to idle_power_common.S using CPU_FTR sections. - Fixed r4 reg clobbering in power_stop0 - Improved comments Changes in v2: == - Using CPU_FTR_ARCH_300 bit instead of CPU_FTR_STOP_INST arch/powerpc/include/asm/cpuidle.h| 2 + arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +- arch/powerpc/include/asm/machdep.h| 1 + arch/powerpc/include/asm/opal-api.h | 11 +- arch/powerpc/include/asm/paca.h | 2 + arch/powerpc/include/asm/ppc-opcode.h | 4 + arch/powerpc/include/asm/processor.h | 1 + arch/powerpc/include/asm/reg.h| 14 +++ arch/powerpc/kernel/asm-offsets.c | 2 + arch/powerpc/kernel/idle_power_common.S | 175 +++--- arch/powerpc/platforms/powernv/idle.c | 84 -- 11 files changed, 265 insertions(+), 33 deletions(-) diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h index d2f99ca..3d7fc06 100644 --- a/arch/powerpc/include/asm/cpuidle.h +++ b/arch/powerpc/include/asm/cpuidle.h @@ -13,6 +13,8 @@ #ifndef __ASSEMBLY__ extern u32 pnv_fastsleep_workaround_at_entry[]; extern u32 pnv_fastsleep_workaround_at_exit[]; + +extern u64 pnv_first_deep_stop_state; #endif #endif diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h index 72b6225..d318d43 100644 --- a/arch/powerpc/include/asm/kvm_book3s_asm.h +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h @@ -162,7 +162,7 @@ struct kvmppc_book3s_shadow_vcpu { /* Values for kvm_state */ #define KVM_HWTHREAD_IN_KERNEL 0 -#define KVM_HWTHREAD_IN_NAP1 +#define KVM_HWTHREAD_IN_IDLE 1 #define KVM_HWTHREAD_IN_KVM2 #endif /* __ASM_KVM_BOOK3S_ASM_H__ */ diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index 6bdcd0d..ae3b155 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -262,6 +262,7 @@ struct machdep_calls { extern void e500_idle(void); extern void power4_idle(void); extern void power7_idle(void); +extern void power_stop0(void); extern void ppc6xx_idle(void); extern void book3e_idle(void); diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h index 9bb8ddf..7f3f8c6 100644 ---
a/arch/powerpc/include/asm/opal-api.h +++ b/arch/powerpc/include/asm/opal-api.h @@ -162,13 +162,20 @@ /* Device tree flags */ -/* Flags set in power-mgmt nodes in device tree if - * respective idle states are supported in the platform. +/* + * Flags set in power-mgmt nodes in device tree describing + * idle states that are supported in the platform. */ + +#define OPAL_PM_TIMEBASE_STOP 0x0002 +#define OPAL_PM_LOSE_HYP_CONTEXT 0x2000 +#define OPAL_PM_LOSE_FULL_CONTEXT 0x4000 #define OPAL_PM_NAP_ENABLED0x0001 #define OPAL_PM_SLEEP_ENABLED 0x0002 #define OPAL_PM_WINKLE_ENABLED 0x0004 #define OPAL_PM_SLEEP_ENABLED_ER1 0x0008 /* with workaround */ +#define OPAL_PM_STOP_INST_FAST 0x0010 +#define OPAL_PM_STOP_INST_DEEP 0x0020 /* * OPAL_CONFIG_CPU_IDLE_STATE parameters diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index 5
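The rest of the diff is cut off in the archive. As a reading aid for the layout above, a hedged sketch of how a PSSCR value is composed (field positions derived purely from the commit message; IBM bit numbering, bit 0 being the MSB of the 64-bit register):

	/* Illustrative only: positions taken from the PSSCR layout above */
	#define PSSCR_BIT(b)	(1ull << (63 - (b)))
	#define PSSCR_ESL	PSSCR_BIT(42)		/* Enable State Loss */
	#define PSSCR_EC	PSSCR_BIT(43)		/* EC, bit 43 */
	#define PSSCR_PSLL(x)	((unsigned long long)(x) << (63 - 47))	/* Power-Saving Level Limit, bits 44:47 */
	#define PSSCR_RL(x)	((unsigned long long)(x))	/* Requested Level, bits 60:63 */

	/* e.g. a state-losing stop request at level 3, limited to level 3: */
	unsigned long long psscr = PSSCR_ESL | PSSCR_EC | PSSCR_PSLL(3) | PSSCR_RL(3);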
[PATCH v6 05/11] powerpc/powernv: Make pnv_powersave_common more generic
pnv_powersave_common performs the common steps needed before entering an idle state, eventually changes the MSR to MSR_IDLE and does an rfid to pnv_enter_arch207_idle_mode. Move the update of HSTATE_HWTHREAD_STATE to pnv_powersave_common from pnv_enter_arch207_idle_mode and make it more generic by passing the rfid address as a function parameter. Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v4 Changes in v4: == - Moved renaming of power7_powersave_common to earlier patch Changes in v3: == - Moved HSTATE_HWTHREAD_STATE update to power_powersave_common arch/powerpc/kernel/idle_power_common.S | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S index 34dbfc9..a8397e3 100644 --- a/arch/powerpc/kernel/idle_power_common.S +++ b/arch/powerpc/kernel/idle_power_common.S @@ -75,6 +75,8 @@ core_idle_lock_held: * To check IRQ_HAPPENED in r4 * 0 - don't check * 1 - check + * + * Address to 'rfid' to in r5 */ _GLOBAL(pnv_powersave_common) /* Use r3 to pass state nap/sleep/winkle */ @@ -127,28 +129,28 @@ _GLOBAL(pnv_powersave_common) std r9,_MSR(r1) std r1,PACAR1(r13) +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + /* Tell KVM we're entering idle */ + li r4,KVM_HWTHREAD_IN_NAP + stb r4,HSTATE_HWTHREAD_STATE(r13) +#endif + /* * Go to real mode to do the nap, as required by the architecture. * Also, we need to be in real mode before setting hwthread_state, * because as soon as we do that, another thread can switch * the MMU context to the guest. */ - LOAD_REG_IMMEDIATE(r5, MSR_IDLE) + LOAD_REG_IMMEDIATE(r7, MSR_IDLE) li r6, MSR_RI andcr6, r9, r6 - LOAD_REG_ADDR(r7, pnv_enter_arch207_idle_mode) mtmsrd r6, 1 /* clear RI before setting SRR0/1 */ - mtspr SPRN_SRR0, r7 - mtspr SPRN_SRR1, r5 + mtspr SPRN_SRR0, r5 + mtspr SPRN_SRR1, r7 rfid .globl pnv_enter_arch207_idle_mode pnv_enter_arch207_idle_mode: -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE - /* Tell KVM we're napping */ - li r4,KVM_HWTHREAD_IN_NAP - stb r4,HSTATE_HWTHREAD_STATE(r13) -#endif stb r3,PACA_THREAD_IDLE_STATE(r13) cmpwi cr3,r3,PNV_THREAD_SLEEP bge cr3,2f @@ -243,18 +245,21 @@ _GLOBAL(power7_idle) _GLOBAL(power7_nap) mr r4,r3 li r3,PNV_THREAD_NAP + LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode) b pnv_powersave_common /* No return */ _GLOBAL(power7_sleep) li r3,PNV_THREAD_SLEEP li r4,1 + LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode) b pnv_powersave_common /* No return */ _GLOBAL(power7_winkle) li r3,PNV_THREAD_WINKLE li r4,1 + LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode) b pnv_powersave_common /* No return */ -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v6 09/11] cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
Use cpuidle's CPUIDLE_STATE_MAX macro instead of powernv specific MAX_POWERNV_IDLE_STATES. Cc: Rafael J. Wysocki Cc: Daniel Lezcano Cc: linux...@vger.kernel.org Suggested-by: Daniel Lezcano Signed-off-by: Shreyas B. Prabhu --- - No changes after v5 Changes in v5 = - New in v5 drivers/cpuidle/cpuidle-powernv.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index e12dc30..3a763a8 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -20,8 +20,6 @@ #include #include -#define MAX_POWERNV_IDLE_STATES8 - struct cpuidle_driver powernv_idle_driver = { .name = "powernv_idle", .owner= THIS_MODULE, @@ -96,7 +94,7 @@ static int fastsleep_loop(struct cpuidle_device *dev, /* * States for dedicated partition case. */ -static struct cpuidle_state powernv_states[MAX_POWERNV_IDLE_STATES] = { +static struct cpuidle_state powernv_states[CPUIDLE_STATE_MAX] = { { /* Snooze */ .name = "snooze", .desc = "snooze", -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
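For comparison, the generic macro this switches to is defined in the cpuidle core header; at the time of this series it reads (assumed value):

	/* include/linux/cpuidle.h (assumed) */
	#define CPUIDLE_STATE_MAX	10

so the state table gains a little headroom over the old powernv-specific limit of 8 while staying within what the cpuidle core can register.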
[PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states
POWER ISA v3 defines a new idle processor core mechanism. In summary, a) new instruction named stop is added. b) new per thread SPR named PSSCR is added which controls the behavior of stop instruction. Supported idle states and value to be written to PSSCR register to enter any idle state is exposed via ibm,cpu-idle-state-names and ibm,cpu-idle-state-psscr respectively. To enter an idle state, platform provided power_stop() needs to be invoked with the appropriate PSSCR value. This patch adds support for this new mechanism in cpuidle powernv driver. Cc: Rafael J. Wysocki Cc: Daniel Lezcano Cc: Rob Herring Cc: Lorenzo Pieralisi Cc: linux...@vger.kernel.org Cc: Michael Ellerman Cc: Paul Mackerras Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- Note: Documentation for the device tree bindings is posted here- http://patchwork.ozlabs.org/patch/629125/ - No changes in v6 Changes in v5 = - Use generic cpuidle constant CPUIDLE_NAME_LEN - Fix return code handling for of_property_read_string_array - Use DT flags to determine if are using stop instruction, instead of cpu_has_feature - Removed uncessary cast with names - &stop_loop -> stop_loop - Added POWERNV_THRESHOLD_LATENCY_NS to filter out idle states with high latency drivers/cpuidle/cpuidle-powernv.c | 71 ++- 1 file changed, 70 insertions(+), 1 deletion(-) diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index 3a763a8..c74a020 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -20,6 +20,8 @@ #include #include +#define POWERNV_THRESHOLD_LATENCY_NS 20 + struct cpuidle_driver powernv_idle_driver = { .name = "powernv_idle", .owner= THIS_MODULE, @@ -27,6 +29,9 @@ struct cpuidle_driver powernv_idle_driver = { static int max_idle_state; static struct cpuidle_state *cpuidle_state_table; + +static u64 stop_psscr_table[CPUIDLE_STATE_MAX]; + static u64 snooze_timeout; static bool snooze_timeout_en; @@ -91,6 +96,17 @@ static int fastsleep_loop(struct cpuidle_device *dev, return index; } #endif + +static int stop_loop(struct cpuidle_device *dev, +struct cpuidle_driver *drv, +int index) +{ + ppc64_runlatch_off(); + power_stop(stop_psscr_table[index]); + ppc64_runlatch_on(); + return index; +} + /* * States for dedicated partition case. */ @@ -167,6 +183,8 @@ static int powernv_add_idle_states(void) int nr_idle_states = 1; /* Snooze */ int dt_idle_states; u32 *latency_ns, *residency_ns, *flags; + u64 *psscr_val = NULL; + const char *names[CPUIDLE_STATE_MAX]; int i, rc; /* Currently we have snooze statically defined */ @@ -199,12 +217,41 @@ static int powernv_add_idle_states(void) goto out_free_latency; } + rc = of_property_read_string_array(power_mgt, + "ibm,cpu-idle-state-names", names, + dt_idle_states); + if (rc < 0) { + pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-names in DT\n"); + goto out_free_latency; + } + + /* +* If the idle states use stop instruction, probe for psscr values +* which are necessary to specify required stop level. 
+*/ + if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP)) { + psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val), + GFP_KERNEL); + rc = of_property_read_u64_array(power_mgt, + "ibm,cpu-idle-state-psscr", + psscr_val, dt_idle_states); + if (rc) { + pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n"); + goto out_free_psscr; + } + } residency_ns = kzalloc(sizeof(*residency_ns) * dt_idle_states, GFP_KERNEL); rc = of_property_read_u32_array(power_mgt, "ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states); for (i = 0; i < dt_idle_states; i++) { - + /* +* If an idle state has exit latency beyond +* POWERNV_THRESHOLD_LATENCY_NS then don't use it +* in cpu-idle. +*/ + if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS) + continue; /* * Cpuidle accepts exit_latency and target_residency in us. * Use default target_residency values if f/w does not expose it. @@ -216,6 +263,16 @@ static int powernv_add_idle_states(void) powernv_states[nr_idle_states].flags =
[PATCH v6 11/11] powerpc/powernv: Use deepest stop state when cpu is offlined
If hardware supports stop state, use the deepest stop state when the cpu is offlined. Reviewed-by: Gautham R. Shenoy Signed-off-by: Shreyas B. Prabhu --- - No changes since v1 arch/powerpc/platforms/powernv/idle.c| 15 +-- arch/powerpc/platforms/powernv/powernv.h | 1 + arch/powerpc/platforms/powernv/smp.c | 4 +++- 3 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c index bfbd359..b38cb33 100644 --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -242,6 +242,11 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600, */ u64 pnv_first_deep_stop_state; +/* + * Deepest stop idle state. Used when a cpu is offlined + */ +u64 pnv_deepest_stop_state; + static int __init pnv_init_idle_states(void) { struct device_node *power_mgt; @@ -290,8 +295,11 @@ static int __init pnv_init_idle_states(void) } /* -* Set pnv_first_deep_stop_state to the first stop level -* to cause hypervisor state loss +* Set pnv_first_deep_stop_state and pnv_deepest_stop_state. +* pnv_first_deep_stop_state should be set to the first stop +* level to cause hypervisor state loss. +* pnv_deepest_stop_state should be set to the deepest stop +* stop state. */ pnv_first_deep_stop_state = MAX_STOP_STATE; for (i = 0; i < dt_idle_states; i++) { @@ -300,6 +308,9 @@ static int __init pnv_init_idle_states(void) if ((flags[i] & OPAL_PM_LOSE_FULL_CONTEXT) && (pnv_first_deep_stop_state > psscr_rl)) pnv_first_deep_stop_state = psscr_rl; + + if (pnv_deepest_stop_state < psscr_rl) + pnv_deepest_stop_state = psscr_rl; } } diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h index 6dbc0a1..da7c843 100644 --- a/arch/powerpc/platforms/powernv/powernv.h +++ b/arch/powerpc/platforms/powernv/powernv.h @@ -18,6 +18,7 @@ static inline void pnv_pci_shutdown(void) { } #endif extern u32 pnv_get_supported_cpuidle_states(void); +extern u64 pnv_deepest_stop_state; extern void pnv_lpc_init(void); diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index ad7b1a3..f69ceb6 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -182,7 +182,9 @@ static void pnv_smp_cpu_kill_self(void) ppc64_runlatch_off(); - if (idle_states & OPAL_PM_WINKLE_ENABLED) + if (cpu_has_feature(CPU_FTR_ARCH_300)) + srr1 = power_stop(pnv_deepest_stop_state); + else if (idle_states & OPAL_PM_WINKLE_ENABLED) srr1 = power7_winkle(); else if ((idle_states & OPAL_PM_SLEEP_ENABLED) || (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction
Hi Ben, Sorry for the delayed response. On 06/06/2016 03:58 AM, Benjamin Herrenschmidt wrote: > On Thu, 2016-06-02 at 07:38 -0500, Shreyas B. Prabhu wrote: >> @@ -61,8 +72,13 @@ save_sprs_to_stack: >> * Note all register i.e per-core, per-subcore or per-thread is saved >> * here since any thread in the core might wake up first >> */ >> +BEGIN_FTR_SECTION >> + mfspr r3,SPRN_PTCR >> + std r3,_PTCR(r1) >> +FTR_SECTION_ELSE >> mfspr r3,SPRN_SDR1 >> std r3,_SDR1(r1) >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) > > This is the only new SPR we care about in P9 ? > After reviewing ISA again, I've identified LMRR, LMSER and ASDR also need to be restored. I've fixed this in v6. Thanks, Shreyas ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF
On 2016/06/07 03:56PM, Alexei Starovoitov wrote: > On Tue, Jun 07, 2016 at 07:02:23PM +0530, Naveen N. Rao wrote: > > PPC64 eBPF JIT compiler. > > > > Enable with: > > echo 1 > /proc/sys/net/core/bpf_jit_enable > > or > > echo 2 > /proc/sys/net/core/bpf_jit_enable > > > > ... to see the generated JIT code. This can further be processed with > > tools/net/bpf_jit_disasm. > > > > With CONFIG_TEST_BPF=m and 'modprobe test_bpf': > > test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed] > > > > ... on both ppc64 BE and LE. > > Nice. That's even better than on x64 which cannot jit one test: > test_bpf: #262 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 168 PASS > which was designed specifically to hit x64 jit pass limit. > ppc jit has predicatble number of passes and doesn't have this problem > as expected. Great. Yes, that's thanks to the clever handling of conditional branches by Matt -- we always emit 2 instructions for this reason (encoded in PPC_BCC() macro). > > > The details of the approach are documented through various comments in > > the code. > > > > Cc: Matt Evans > > Cc: Denis Kirjanov > > Cc: Michael Ellerman > > Cc: Paul Mackerras > > Cc: Alexei Starovoitov > > Cc: Daniel Borkmann > > Cc: "David S. Miller" > > Cc: Ananth N Mavinakayanahalli > > Signed-off-by: Naveen N. Rao > > --- > > arch/powerpc/Kconfig | 3 +- > > arch/powerpc/include/asm/asm-compat.h | 2 + > > arch/powerpc/include/asm/ppc-opcode.h | 20 +- > > arch/powerpc/net/Makefile | 4 + > > arch/powerpc/net/bpf_jit.h| 53 +- > > arch/powerpc/net/bpf_jit64.h | 102 > > arch/powerpc/net/bpf_jit_asm64.S | 180 +++ > > arch/powerpc/net/bpf_jit_comp64.c | 956 > > ++ > > 8 files changed, 1317 insertions(+), 3 deletions(-) > > create mode 100644 arch/powerpc/net/bpf_jit64.h > > create mode 100644 arch/powerpc/net/bpf_jit_asm64.S > > create mode 100644 arch/powerpc/net/bpf_jit_comp64.c > > don't see any issues with the code. > Thank you for working on this. > > Acked-by: Alexei Starovoitov Thanks, Alexei! Regards, Naveen ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
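The two-instruction scheme Naveen refers to looks roughly like this (a sketch of the PPC_BCC() idea from arch/powerpc/net/bpf_jit.h; details abbreviated, so treat it as illustrative rather than the exact macro):

	/*
	 * Sketch: a conditional branch always costs exactly two emitted
	 * instructions, whether or not the target fits in the 16-bit
	 * 'bc' displacement.
	 */
	#define PPC_BCC(cond, dest)					\
		do {							\
			if (is_nearbranch((dest) - (ctx->idx * 4))) {	\
				PPC_BCC_SHORT(cond, dest); /* bc to target */ \
				PPC_NOP();		   /* pad to 2 insns */ \
			} else {					\
				/* invert the condition, hop over a long jump */ \
				PPC_BCC_SHORT(cond ^ COND_CMP_TRUE, (ctx->idx + 2) * 4); \
				PPC_JMP(dest);				\
			}						\
		} while (0)

Because every conditional branch occupies the same number of instructions on every pass, branch targets stop moving between passes and the number of JIT passes stays predictable.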
Re: [PATCH] drivers/net/fsl_ucc: Do not prefix header guard with CONFIG_
From: Andreas Ziegler Date: Wed, 8 Jun 2016 11:40:28 +0200 > The CONFIG_ prefix should only be used for options which > can be configured through Kconfig and not for guarding headers. > > Signed-off-by: Andreas Ziegler Applied. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V2 1/7] dt-bindings: Update QorIQ TMU thermal bindings
On Tue, Jun 07, 2016 at 11:27:34AM +0800, Jia Hongtao wrote: > For different types of SoC the sensor id and endianness may vary. > "#thermal-sensor-cells" is used to provide sensor id information. > "little-endian" property is to tell the endianness of TMU. > > Signed-off-by: Jia Hongtao > --- > Changes for V2: > * Remove formatting chnages. > > Documentation/devicetree/bindings/thermal/qoriq-thermal.txt | 7 +++ > 1 file changed, 7 insertions(+) Acked-by: Rob Herring ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote: > This one is preparing patch for next one: > PCI: Let pci_mmap_page_range() take resource addr > > We need to pass extra resource pointer to avoid searching that again > for powerpc and microblaze prot set operation. I'm not convinced yet that the extra resource pointer is necessary. Microblaze does look up the resource in pci_mmap_page_range(), but it never actually uses it. It *looks* like it uses it, but that code is actually dead and I think we should apply the first patch below. That leaves powerpc as the only arch that would use this extra resource pointer. It uses it in __pci_mmap_set_pgprot() to help decide whether to make a normal uncacheable mapping or a write- combining one. There's nothing here that's specific to the powerpc architecture, and I don't think we should add this parameter just to cater to powerpc. There are two cases where __pci_mmap_set_pgprot() on powerpc does something based on the resource: 1) We're using procfs to mmap I/O port space after we requested write-combining, e.g., we did this: ioctl(fd, PCIIOC_MMAP_IS_IO); # request I/O port space ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining mmap(fd, ...) On powerpc, we ignore the write-combining request in this case. I think we can handle this case by applying the second patch below to ignore write-combining on I/O space for all arches, not just powerpc. 2) We're using sysfs to mmap resourceN (not resourceN_wc), and the resource is prefetchable. On powerpc, we turn *on* write-combining, even though the user didn't ask for it. I'm not sure this case is actually safe, because it changes the ordering properties. If it *is* safe, we could enable write- combining in pci_mmap_resource(), where we already have the resource and it could be done for all arches. This case is not strictly necessary, except to avoid a performance regression, because the user could have mapped resourceN_wc to explicitly request write-combining. > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index d319a9c..5bbe20c 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, > struct bin_attribute *attr, > pci_resource_to_user(pdev, i, res, &start, &end); > vma->vm_pgoff += start >> PAGE_SHIFT; > mmap_type = res->flags & IORESOURCE_MEM ? 
pci_mmap_mem : pci_mmap_io; > - return pci_mmap_page_range(pdev, vma, mmap_type, write_combine); > + return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine); > } > > static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj, > diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c > index 3f155e7..f19ee2a 100644 > --- a/drivers/pci/proc.c > +++ b/drivers/pci/proc.c > @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct > vm_area_struct *vma) > if (i >= PCI_ROM_RESOURCE) > return -ENODEV; > > - ret = pci_mmap_page_range(dev, vma, > + ret = pci_mmap_page_range(dev, &dev->resource[i], vma, > fpriv->mmap_state, > fpriv->write_combine); > if (ret < 0) > diff --git a/include/linux/pci.h b/include/linux/pci.h > index b67e4df..3c1a0f4 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -70,6 +70,12 @@ enum pci_mmap_state { > pci_mmap_mem > }; > > +struct vm_area_struct; > +/* Map a range of PCI memory or I/O space for a device into user space */ > +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res, > + struct vm_area_struct *vma, > + enum pci_mmap_state mmap_state, int write_combine); > + > /* > * For PCI devices, the region numbers are assigned this way: > */ commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb Author: Bjorn Helgaas Date: Wed Jun 8 14:00:14 2016 -0500 microblaze/PCI: Remove useless __pci_mmap_set_pgprot() The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc, where it computes either an uncacheable pgprot_t or a write-combining one. But on microblaze, we always use the regular uncacheable pgprot_t. Remove the useless code in __pci_mmap_set_pgprot() and inline the pgprot_noncached() at the only caller. Signed-off-by: Bjorn Helgaas diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index 14cba60..1974567 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev, } /* - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci - * device mapping. - */ -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *
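To make case 2 concrete, the explicit route is a plain sysfs mmap of the _wc file. A minimal user-space sketch (device address, BAR number and length are illustrative):

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		/* resource0_wc requests a write-combining mapping of BAR 0 */
		const char *path = "/sys/bus/pci/devices/0000:01:00.0/resource0_wc";
		int fd = open(path, O_RDWR);
		if (fd < 0) { perror("open"); return 1; }

		void *bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				 MAP_SHARED, fd, 0);
		if (bar == MAP_FAILED) { perror("mmap"); return 1; }

		/* ... MMIO stores through 'bar' may now be combined ... */
		munmap(bar, 4096);
		close(fd);
		return 0;
	}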
Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction
On Wed, 2016-06-08 at 22:31 +0530, Shreyas B Prabhu wrote: > Hi Ben, > > Sorry for the delayed response. > > On 06/06/2016 03:58 AM, Benjamin Herrenschmidt wrote: > > > > On Thu, 2016-06-02 at 07:38 -0500, Shreyas B. Prabhu wrote: > > > > > > @@ -61,8 +72,13 @@ save_sprs_to_stack: > > > * Note all register i.e per-core, per-subcore or per-thread > > > is saved > > > * here since any thread in the core might wake up first > > > */ > > > +BEGIN_FTR_SECTION > > > + mfspr r3,SPRN_PTCR > > > + std r3,_PTCR(r1) > > > +FTR_SECTION_ELSE > > > mfspr r3,SPRN_SDR1 > > > std r3,_SDR1(r1) > > > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) > > This is the only new SPR we care about in P9 ? > > > After reviewing ISA again, I've identified LMRR, LMSER and ASDR also > need to be restored. I've fixed this in v6. LMRR and LMSER are used by the load monitor patch set. There they will get restored when we context switch back to userspace. It probably doesn't hurt that much but you don't need to restore them here. They are not used in the kernel. It escapes me what ASDR is right now. Mikey ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas wrote: > > Microblaze does look up the resource in pci_mmap_page_range(), but it > never actually uses it. It *looks* like it uses it, but that code is > actually dead and I think we should apply the first patch below. Good one. > > That leaves powerpc as the only arch that would use this extra > resource pointer. It uses it in __pci_mmap_set_pgprot() to help > decide whether to make a normal uncacheable mapping or a write- > combining one. There's nothing here that's specific to the powerpc > architecture, and I don't think we should add this parameter just to > cater to powerpc. > > There are two cases where __pci_mmap_set_pgprot() on powerpc does > something based on the resource: > > 1) We're using procfs to mmap I/O port space after we requested > write-combining, e.g., we did this: > >ioctl(fd, PCIIOC_MMAP_IS_IO); # request I/O port space >ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining >mmap(fd, ...) > > On powerpc, we ignore the write-combining request in this case. > > I think we can handle this case by applying the second patch > below to ignore write-combining on I/O space for all arches, not > just powerpc. > > 2) We're using sysfs to mmap resourceN (not resourceN_wc), and > the resource is prefetchable. On powerpc, we turn *on* > write-combining, even though the user didn't ask for it. > > I'm not sure this case is actually safe, because it changes the > ordering properties. If it *is* safe, we could enable write- > combining in pci_mmap_resource(), where we already have the > resource and it could be done for all arches. > > This case is not strictly necessary, except to avoid a > performance regression, because the user could have mapped > resourceN_wc to explicitly request write-combining. > Agreed. > > commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb > Author: Bjorn Helgaas > Date: Wed Jun 8 14:00:14 2016 -0500 > > microblaze/PCI: Remove useless __pci_mmap_set_pgprot() > > The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc, > where it computes either an uncacheable pgprot_t or a write-combining one. > But on microblaze, we always use the regular uncacheable pgprot_t. > > Remove the useless code in __pci_mmap_set_pgprot() and inline the > pgprot_noncached() at the only caller. > > Signed-off-by: Bjorn Helgaas > > diff --git a/arch/microblaze/pci/pci-common.c > b/arch/microblaze/pci/pci-common.c > index 14cba60..1974567 100644 > --- a/arch/microblaze/pci/pci-common.c > +++ b/arch/microblaze/pci/pci-common.c > @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct > pci_dev *dev, > } > > /* > - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci > - * device mapping. > - */ > -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource > *rp, > - pgprot_t protection, > - enum pci_mmap_state mmap_state, > - int write_combine) > -{ > - pgprot_t prot = protection; > - > - /* Write combine is always 0 on non-memory space mappings. On > -* memory space, if the user didn't pass 1, we check for a > -* "prefetchable" resource. 
This is a bit hackish, but we use > -* this to workaround the inability of /sysfs to provide a write > -* combine bit > -*/ > - if (mmap_state != pci_mmap_mem) > - write_combine = 0; > - else if (write_combine == 0) { > - if (rp->flags & IORESOURCE_PREFETCH) > - write_combine = 1; > - } > - > - return pgprot_noncached(prot); > -} > - > -/* > * This one is used by /dev/mem and fbdev who have no clue about the > * PCI device, it tries to find the PCI device first and calls the > * above routine > @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct > vm_area_struct *vma, > return -EINVAL; > > vma->vm_pgoff = offset >> PAGE_SHIFT; > - vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp, > - vma->vm_page_prot, > - mmap_state, write_combine); > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, >vma->vm_end - vma->vm_start, > vma->vm_page_prot); > Acked-by: Yinghai Lu > > > commit 962972ee5e0ba6ceb680cb182bad65f8886586a6 > Author: Bjorn Helgaas > Date: Wed Jun 8 14:46:54 2016 -0500 > > PCI: Ignore write-combining when mapping I/O port space > > PCI exposes files like /proc/bus/pci/00/00.0 in procfs. These files > support operations like this: > > ioctl(fd, PCIIOC_MMAP_IS_IO)
[PATCH v3 3/7] powerpc: use the new LED disk activity trigger
- dts: rename 'ide-disk' to 'disk-activity' - defconfig: rename 'ADB_PMU_LED_IDE' to 'ADB_PMU_LED_DISK' Cc: Joseph Jezak Cc: Nico Macrionitis Cc: Jörg Sommer Signed-off-by: Stephan Linz --- arch/powerpc/boot/dts/mpc8315erdb.dts | 2 +- arch/powerpc/boot/dts/mpc8377_rdb.dts | 2 +- arch/powerpc/boot/dts/mpc8378_rdb.dts | 2 +- arch/powerpc/boot/dts/mpc8379_rdb.dts | 2 +- arch/powerpc/configs/pmac32_defconfig | 2 +- arch/powerpc/configs/ppc6xx_defconfig | 2 +- drivers/macintosh/Kconfig | 13 ++--- drivers/macintosh/via-pmu-led.c | 4 ++-- 8 files changed, 14 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/boot/dts/mpc8315erdb.dts b/arch/powerpc/boot/dts/mpc8315erdb.dts index 4354684..ca5139e 100644 --- a/arch/powerpc/boot/dts/mpc8315erdb.dts +++ b/arch/powerpc/boot/dts/mpc8315erdb.dts @@ -472,7 +472,7 @@ hdd { gpios = <&mcu_pio 1 0>; - linux,default-trigger = "ide-disk"; + linux,default-trigger = "disk-activity"; }; }; }; diff --git a/arch/powerpc/boot/dts/mpc8377_rdb.dts b/arch/powerpc/boot/dts/mpc8377_rdb.dts index 2b4b653..e326139 100644 --- a/arch/powerpc/boot/dts/mpc8377_rdb.dts +++ b/arch/powerpc/boot/dts/mpc8377_rdb.dts @@ -496,7 +496,7 @@ hdd { gpios = <&mcu_pio 1 0>; - linux,default-trigger = "ide-disk"; + linux,default-trigger = "disk-activity"; }; }; }; diff --git a/arch/powerpc/boot/dts/mpc8378_rdb.dts b/arch/powerpc/boot/dts/mpc8378_rdb.dts index 74b6a53..71842fc 100644 --- a/arch/powerpc/boot/dts/mpc8378_rdb.dts +++ b/arch/powerpc/boot/dts/mpc8378_rdb.dts @@ -480,7 +480,7 @@ hdd { gpios = <&mcu_pio 1 0>; - linux,default-trigger = "ide-disk"; + linux,default-trigger = "disk-activity"; }; }; }; diff --git a/arch/powerpc/boot/dts/mpc8379_rdb.dts b/arch/powerpc/boot/dts/mpc8379_rdb.dts index 3b5cbac..e442a29 100644 --- a/arch/powerpc/boot/dts/mpc8379_rdb.dts +++ b/arch/powerpc/boot/dts/mpc8379_rdb.dts @@ -446,7 +446,7 @@ hdd { gpios = <&mcu_pio 1 0>; - linux,default-trigger = "ide-disk"; + linux,default-trigger = "disk-activity"; }; }; }; diff --git a/arch/powerpc/configs/pmac32_defconfig b/arch/powerpc/configs/pmac32_defconfig index ea8705f..3f6c9a6 100644 --- a/arch/powerpc/configs/pmac32_defconfig +++ b/arch/powerpc/configs/pmac32_defconfig @@ -158,7 +158,7 @@ CONFIG_ADB=y CONFIG_ADB_CUDA=y CONFIG_ADB_PMU=y CONFIG_ADB_PMU_LED=y -CONFIG_ADB_PMU_LED_IDE=y +CONFIG_ADB_PMU_LED_DISK=y CONFIG_PMAC_APM_EMU=m CONFIG_PMAC_MEDIABAY=y CONFIG_PMAC_BACKLIGHT=y diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig index 99ccbeba..1dde0be 100644 --- a/arch/powerpc/configs/ppc6xx_defconfig +++ b/arch/powerpc/configs/ppc6xx_defconfig @@ -442,7 +442,7 @@ CONFIG_ADB=y CONFIG_ADB_CUDA=y CONFIG_ADB_PMU=y CONFIG_ADB_PMU_LED=y -CONFIG_ADB_PMU_LED_IDE=y +CONFIG_ADB_PMU_LED_DISK=y CONFIG_PMAC_APM_EMU=y CONFIG_PMAC_MEDIABAY=y CONFIG_PMAC_BACKLIGHT=y diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig index 3e8b29e..d28690f 100644 --- a/drivers/macintosh/Kconfig +++ b/drivers/macintosh/Kconfig @@ -96,19 +96,18 @@ config ADB_PMU_LED Support the front LED on Power/iBooks as a generic LED that can be triggered by any of the supported triggers. To get the behaviour of the old CONFIG_BLK_DEV_IDE_PMAC_BLINK, select this - and the ide-disk LED trigger and configure appropriately through - sysfs. + and the disk LED trigger and configure appropriately through sysfs. 
-config ADB_PMU_LED_IDE - bool "Use front LED as IDE LED by default" +config ADB_PMU_LED_DISK + bool "Use front LED as DISK LED by default" depends on ADB_PMU_LED depends on LEDS_CLASS depends on IDE_GD_ATA select LEDS_TRIGGERS - select LEDS_TRIGGER_IDE_DISK + select LEDS_TRIGGER_DISK help - This option makes the front LED default to the IDE trigger - so that it blinks on IDE activity. + This option makes the front LED default to the disk trigger + so that it blinks on disk activity. config PMAC_SMU bool "Support for SMU based PowerMacs" diff --git a/drivers/macintosh/via-pmu-led.c b/drivers/macintosh/via-pmu-led.c index 19c3718..ae067ab 100644 --- a/drivers/macintosh/via-pmu-led.c +++ b/drivers/macintosh/via-pmu-led.c @@ -73,8 +73,8 @@ static void pmu_led_set(struct led_classdev *led_cdev, static struct led_classdev pmu_led = { .name = "pmu-led::front", -#ifdef CONFIG_ADB_PMU_LED_IDE - .default_trigger = "ide-disk", +#ifdef CONFIG_ADB_
Re: [RFC] Implementing HUGEPAGE on MPC 8xx
Hello Christophe. I’m surprised there is still any interest in this processor family :) On Jun 8, 2016, at 12:03 AM, Christophe Leroy wrote: > MPC 8xx has several page sizes: 4k, 16k, 512k and 8M. > Today, 4k and 16k sizes are implemented as normal page sizes and 8M is used > for mapping linear memory space in kernel. > > I'd like to implement HUGE PAGE to reduce TLB misses from user apps. My original plan was to implement the TLB miss handler in three lines of code. I haven’t investigated recently, but I know the amount of code has grown substantially :) > In 4k mode, PAGE offset is 12 bits, PTE offset is 10 bits and PGD offset is > 10 bits > In 16k mode, PAGE offset is 14 bits, PTE offset is 12 bits and PGD offset is > 6 bits Since the 8xx systems typically have rather small real memory, I was considering a combination of 4k and 512k pages as an attempt to maximize real memory utilization. The 4k pages in the PTE tables as today, and the 512k flagged in the PGD and just loaded from there. I don’t know if 16k is a big enough win (unless it’s the “standard” page size to keep TLBmiss as simple as possible), or if 8M is terribly useful from user space. > From your point of view, what would be the best approach to extend support of > HUGE PAGES to PPC_8xx ? > Would the good starting point be to implement a hugepagetlb-8xx.c from > hugepagetlb-book3e.c ? I guess that is the place to start. When I first thought about this many years ago, I was hoping to map shared libraries and properly behaving programs. The mechanism I considered to do this was either inspection of the section headers, using some section flags, or maybe Aux Vector to set up mmap() to hugetlb at run-time. Good Luck. — Dan ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
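The run-time mmap() route Dan mentions already has a generic user-space interface today; whether the 8xx port could back it with 512k or 8M pages is exactly the open question. A sketch of the call (MAP_HUGETLB is standard Linux API; the size class comes from the configured hugetlb pool unless encoded via MAP_HUGE_SHIFT on kernels that support it):

	#include <stdio.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 8 << 20;	/* illustrative: one 8M huge page */

		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
		if (p == MAP_FAILED) { perror("mmap"); return 1; }

		((char *)p)[0] = 1;	/* touch it to fault in the huge page */
		munmap(p, len);
		return 0;
	}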
Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
On Wed, Jun 8, 2016 at 3:35 PM, Yinghai Lu wrote: > At the same time, can you kill __pci_mmap_set_pgprot() for powerpc. Can you please put your two patches and this attached one into pci/next? Then I could send updated PCI: Let pci_mmap_page_range() take resource address. Thanks Yinghai From: Bjorn Helgaas Subject: [PATCH] powerpc/PCI: Remove __pci_mmap_set_pgprot() The patch "PCI: Ignore write-combining when mapping I/O port space" already handles the I/O port mmap path. For the MMIO mmap path, the caller should explicitly request write-combining if it is really needed. Via the proc path it should look like: mmap(fd, ...) # default is I/O, non-combining ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining ioctl(fd, PCIIOC_MMAP_IS_MEM); # request memory space mmap(fd, ...) Via the sysfs path, it should use resourceN_wc. Signed-off-by: Bjorn Helgaas --- arch/powerpc/kernel/pci-common.c | 37 - 1 file changed, 4 insertions(+), 33 deletions(-) Index: linux-2.6/arch/powerpc/kernel/pci-common.c === --- linux-2.6.orig/arch/powerpc/kernel/pci-common.c +++ linux-2.6/arch/powerpc/kernel/pci-common.c @@ -356,36 +356,6 @@ static struct resource *__pci_mmap_make_ } /* - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci - * device mapping. - */ -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp, - pgprot_t protection, - enum pci_mmap_state mmap_state, - int write_combine) -{ - - /* Write combine is always 0 on non-memory space mappings. On - * memory space, if the user didn't pass 1, we check for a - * "prefetchable" resource. This is a bit hackish, but we use - * this to workaround the inability of /sysfs to provide a write - * combine bit - */ - if (mmap_state != pci_mmap_mem) - write_combine = 0; - else if (write_combine == 0) { - if (rp->flags & IORESOURCE_PREFETCH) - write_combine = 1; - } - - /* XXX would be nice to have a way to ask for write-through */ - if (write_combine) - return pgprot_noncached_wc(protection); - else - return pgprot_noncached(protection); -} - -/* * This one is used by /dev/mem and fbdev who have no clue about the * PCI device, it tries to find the PCI device first and calls the * above routine @@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev * return -EINVAL; vma->vm_pgoff = offset >> PAGE_SHIFT; - vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp, - vma->vm_page_prot, - mmap_state, write_combine); + if (write_combine) + vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot); + else + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, vma->vm_end - vma->vm_start, vma->vm_page_prot); ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
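The proc-path sequence in the commit message, written out as a runnable user-space sketch (the device path is illustrative; the PCIIOC_* ioctls come from linux/pci.h):

	#include <fcntl.h>
	#include <linux/pci.h>	/* PCIIOC_MMAP_IS_MEM, PCIIOC_WRITE_COMBINE */
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>

	int main(void)
	{
		int fd = open("/proc/bus/pci/01/00.0", O_RDWR);	/* illustrative */
		if (fd < 0) { perror("open"); return 1; }

		ioctl(fd, PCIIOC_MMAP_IS_MEM);		/* map memory space, not I/O */
		ioctl(fd, PCIIOC_WRITE_COMBINE, 1);	/* ask for write-combining */

		void *bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				 MAP_SHARED, fd, 0);
		if (bar == MAP_FAILED) { perror("mmap"); return 1; }
		/* ... */
		return 0;
	}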
Re: [PATCH] powerpc/nohash: Fix build break with 4K pages
On Wed, 2016-06-08 at 20:19 +0530, Aneesh Kumar K.V wrote: > Michael Ellerman writes: > > > Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are > > allocating fragments" renamed page_table_free() to pte_fragment_free(). > > One occurrence was mistyped as pte_fragment_fre(). > > > > This only breaks the nohash 4K page build, which is not the default or > > enabled in any defconfig. > > Can you share the .config. I will add it to the build test. It was a randconfig, it still doesn't build even with this patch: http://kisskb.ellerman.id.au/kisskb/buildresult/12705111/ cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] Implementing HUGEPAGE on MPC 8xx
On Wed, 2016-06-08 at 09:03 +0200, Christophe Leroy wrote: > In see in the current ppc kernel that for PPC32, SYS_SUPPORTS_HUGETLBFS > is selected only if we have PHYS_64BIT. > What is the reason for only implementing HUGETLBFS with 64 bits phys > addresses ? That's not for PPC32 in general -- it's for 32-bit FSL Book E. The reason for the limitation is that there are separate TLB miss handlers depending on whether PHYS_64BIT is enabled, and we didn't want to have to implement hugetlb support in both of them unless there was actual demand for it. > From your point of view, what would be the best approach to extend > support of HUGE PAGES to PPC_8xx ? > Would the good starting point be to implement a hugepagetlb-8xx.c from > hugepagetlb-book3e.c ? Yes. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/5] selftests/powerpc: Check for VSX preservation across userspace preemption
Yay for tests! I have a few minor nits, and one more major one (rc == 2 below). > +/* > + * Copyright 2015, Cyril Bur, IBM Corp. > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * as published by the Free Software Foundation; either version > + * 2 of the License, or (at your option) any later version. > + */ I realise this is well past a lost cause by now, but isn't the idea to be version 2, not version 2 or later? > + > +#include "../basic_asm.h" > +#include "../vsx_asm.h" > + Some of your other functions start with a comment. That would be super helpful here - I'm still not super comfortable I understand the calling convention. > +FUNC_START(check_vsx) > + PUSH_BASIC_STACK(32) > + std r3,STACK_FRAME_PARAM(0)(sp) > + addi r3, r3, 16 * 12 #Second half of array > + bl store_vsx > + ld r3,STACK_FRAME_PARAM(0)(sp) > + bl vsx_memcmp > + POP_BASIC_STACK(32) > + blr > +FUNC_END(check_vsx) > + > +long vsx_memcmp(vector int *a) { > + vector int zero = {0,0,0,0}; > + int i; > + > + FAIL_IF(a != varray); > + > + for(i = 0; i < 12; i++) { > + if (memcmp(&a[i + 12], &zero, 16) == 0) { > + fprintf(stderr, "Detected zero from the VSX reg %d\n", > i + 12); > + return 1; > + } > + } > + > + if (memcmp(a, &a[12], 12 * 16)) { I'm somewhat confused as to how this comparison works. You're comparing the new saved ones to the old saved ones, yes? > + long *p = (long *)a; > + fprintf(stderr, "VSX mismatch\n"); > + for (i = 0; i < 24; i=i+2) > + fprintf(stderr, "%d: 0x%08lx%08lx | 0x%08lx%08lx\n", > + i/2 + i%2 + 20, p[i], p[i + 1], p[i + > 24], p[i + 25]); > + return 1; > + } > + return 0; > +} > + > +void *preempt_vsx_c(void *p) > +{ > + int i, j; > + long rc; > + srand(pthread_self()); > + for (i = 0; i < 12; i++) > + for (j = 0; j < 4; j++) { > + varray[i][j] = rand(); > + /* Don't want zero because it hides kernel problems */ > + if (varray[i][j] == 0) > + j--; > + } > + rc = preempt_vsx(varray, &threads_starting, &running); > + if (rc == 2) How would rc == 2? AIUI, preempt_vsx returns the value of check_vsx, which in turn returns the value of vsx_memcmp, which returns 1 or 0. > + fprintf(stderr, "Caught zeros in VSX compares\n"); Isn't it zeros or a mismatched value? > + return (void *)rc; > +} > + > +int test_preempt_vsx(void) > +{ > + int i, rc, threads; > + pthread_t *tids; > + > + threads = sysconf(_SC_NPROCESSORS_ONLN) * THREAD_FACTOR; > + tids = malloc(threads * sizeof(pthread_t)); > + FAIL_IF(!tids); > + > + running = true; > + threads_starting = threads; > + for (i = 0; i < threads; i++) { > + rc = pthread_create(&tids[i], NULL, preempt_vsx_c, NULL); > + FAIL_IF(rc); > + } > + > + setbuf(stdout, NULL); > + /* Not really nessesary but nice to wait for every thread to start */ > + printf("\tWaiting for %d workers to start...", threads_starting); > + while(threads_starting) > + asm volatile("": : :"memory"); I think __sync_synchronise() might be ... more idiomatic or something? Not super fussy. > + printf("done\n"); > + > + printf("\tWaiting for %d seconds to let some workers get preempted...", > PREEMPT_TIME); > + sleep(PREEMPT_TIME); > + printf("done\n"); > + > + printf("\tStopping workers..."); > + /* > + * Working are checking this value every loop. In preempt_vsx 'cmpwi > r5,0; bne 2b'. > + * r5 will have loaded the value of running. > + */ > + running = 0; Do you need some sort of synchronisation here? 
You're assuming it eventually gets to the threads, which is of course true, but maybe it would be a good idea to synchronise it more explicitly? Again, not super fussy. > + for (i = 0; i < threads; i++) { > + void *rc_p; > + pthread_join(tids[i], &rc_p); > + > + /* > + * Harness will say the fail was here, look at why preempt_vsx > + * returned > + */ > + if ((long) rc_p) > + printf("oops\n"); > + FAIL_IF((long) rc_p); > + } > + printf("done\n"); > + > + return 0; > +} > + Regards, Daniel ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
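The more explicit synchronisation Daniel is hinting at could look like this (a sketch using GCC's __atomic builtins; the variable names follow the test, the logic is illustrative):

	#include <stdbool.h>

	static int threads_starting;	/* counted down by workers */
	static bool running;		/* cleared by the harness to stop them */

	/* worker: announce start, then loop until told to stop */
	static void worker_loop_sketch(void)
	{
		__atomic_sub_fetch(&threads_starting, 1, __ATOMIC_RELEASE);
		while (__atomic_load_n(&running, __ATOMIC_ACQUIRE))
			;	/* preempt_vsx work goes here */
	}

	/* harness: publish the stop request with release semantics */
	static void stop_workers_sketch(void)
	{
		__atomic_store_n(&running, false, __ATOMIC_RELEASE);
	}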
[PATCH v7 0/3] POWER9 Load Monitor Support
This patch series adds support for the POWER9 Load Monitor instruction (ldmx) based on work from Jack Miller. The first patch is a cleanup of the FSCR handling. The second patch adds the actual ldmx support to the kernel. The third patch is a couple of ldmx selftests. v7: - Suggestions from the "prestigious" mpe. - PATCH 1/3: - Use current->thread.fscr rather than what the hardware gives us. - PATCH 2/3: - Use current->thread.fscr rather than what the hardware gives us. - PATCH 3/3: - no change. v6: - PATCH 1/3: - Suggestions from mpe. - Init the FSCR using existing INIT_THREAD macro rather than init_fscr() function. - Set fscr when taking DSCR exception in facility_unavailable_exception(). - PATCH 2/3: - Remove erroneous semicolons in restore_sprs(). - PATCH 3/3: - no change. v5: - PATCH 1/3: - Made the FSCR cleanup more extensive. - PATCH 2/3: - Moves FSCR_LM clearing to new init_fscr(). - PATCH 3/3: - Added test cases to .gitignore. - Removed test against PPC_FEATURE2_EBB since it's not needed. - Added parentheses on input parameter usage for LDMX() macro. Jack Miller (2): powerpc: Load Monitor Register Support powerpc: Load Monitor Register Tests Michael Neuling (1): powerpc: Improve FSCR init and context switching ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v7 1/3] powerpc: Improve FSCR init and context switching
This fixes a few issues with FSCR init and switching. In this patch: powerpc: Create context switch helpers save_sprs() and restore_sprs() Author: Anton Blanchard commit 152d523e6307c7152f9986a542f873b5c5863937 We moved the setting of the FSCR register from inside an CPU_FTR_ARCH_207S section to inside just a CPU_FTR_ARCH_DSCR section. Hence we are setting FSCR on POWER6/7 where the FSCR doesn't exist. This is harmless but we shouldn't do it. Also, we can simplify the FSCR context switch. We don't need to go through the calculation involving dscr_inherit. We can just restore what we saved last time. Also, we currently don't explicitly init the FSCR for userspace applications. Currently we init FSCR on boot in __init_fscr: and then the first task inherits based on that. Currently it works but is delicate. This adds the initial fscr value to INIT_THREAD to explicitly set the FSCR for userspace applications and removes __init_fscr: boot time init. Based on patch by Jack Miller. Signed-off-by: Michael Neuling --- arch/powerpc/include/asm/processor.h | 1 + arch/powerpc/kernel/cpu_setup_power.S | 10 -- arch/powerpc/kernel/process.c | 12 arch/powerpc/kernel/traps.c | 3 ++- 4 files changed, 7 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index 009fab1..1833fe9 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -347,6 +347,7 @@ struct thread_struct { .fs = KERNEL_DS, \ .fpexc_mode = 0, \ .ppr = INIT_PPR, \ + .fscr = FSCR_TAR | FSCR_EBB \ } #endif diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S index 584e119..75f98c8 100644 --- a/arch/powerpc/kernel/cpu_setup_power.S +++ b/arch/powerpc/kernel/cpu_setup_power.S @@ -49,7 +49,6 @@ _GLOBAL(__restore_cpu_power7) _GLOBAL(__setup_cpu_power8) mflrr11 - bl __init_FSCR bl __init_PMU bl __init_hvmode_206 mtlrr11 @@ -67,7 +66,6 @@ _GLOBAL(__setup_cpu_power8) _GLOBAL(__restore_cpu_power8) mflrr11 - bl __init_FSCR bl __init_PMU mfmsr r3 rldicl. r0,r3,4,63 @@ -86,7 +84,6 @@ _GLOBAL(__restore_cpu_power8) _GLOBAL(__setup_cpu_power9) mflrr11 - bl __init_FSCR bl __init_hvmode_206 mtlrr11 beqlr @@ -102,7 +99,6 @@ _GLOBAL(__setup_cpu_power9) _GLOBAL(__restore_cpu_power9) mflrr11 - bl __init_FSCR mfmsr r3 rldicl. 
r0,r3,4,63 mtlrr11 @@ -155,12 +151,6 @@ __init_LPCR: isync blr -__init_FSCR: - mfspr r3,SPRN_FSCR - ori r3,r3,FSCR_TAR|FSCR_DSCR|FSCR_EBB - mtspr SPRN_FSCR,r3 - blr - __init_HFSCR: mfspr r3,SPRN_HFSCR ori r3,r3,HFSCR_TAR|HFSCR_TM|HFSCR_BHRB|HFSCR_PM|\ diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index e2f12cb..74ea8db 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1023,18 +1023,11 @@ static inline void restore_sprs(struct thread_struct *old_thread, #ifdef CONFIG_PPC_BOOK3S_64 if (cpu_has_feature(CPU_FTR_DSCR)) { u64 dscr = get_paca()->dscr_default; - u64 fscr = old_thread->fscr & ~FSCR_DSCR; - - if (new_thread->dscr_inherit) { + if (new_thread->dscr_inherit) dscr = new_thread->dscr; - fscr |= FSCR_DSCR; - } if (old_thread->dscr != dscr) mtspr(SPRN_DSCR, dscr); - - if (old_thread->fscr != fscr) - mtspr(SPRN_FSCR, fscr); } if (cpu_has_feature(CPU_FTR_ARCH_207S)) { @@ -1045,6 +1038,9 @@ static inline void restore_sprs(struct thread_struct *old_thread, if (old_thread->ebbrr != new_thread->ebbrr) mtspr(SPRN_EBBRR, new_thread->ebbrr); + if (old_thread->fscr != new_thread->fscr) + mtspr(SPRN_FSCR, new_thread->fscr); + if (old_thread->tar != new_thread->tar) mtspr(SPRN_TAR, new_thread->tar); } diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 9229ba6..667cf78 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -1418,7 +1418,8 @@ void facility_unavailable_exception(struct pt_regs *regs) rd = (instword >> 21) & 0x1f; current->thread.dscr = regs->gpr[rd]; current->thread.dscr_inherit = 1; - mtspr(SPRN_FSCR, value | FSCR_DSCR); + current->thread.fscr |= FSCR_DSCR; + mtspr(SPRN_FSCR, current->thread.fscr); }
[PATCH v7 2/3] powerpc: Load Monitor Register Support
From: Jack Miller

This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.

This patch disables the FSCR LM control bit on task init and enables
that bit when a load monitor facility unavailable exception is taken
for using it. On context switch, this bit is then used to determine
whether the two relevant registers are saved and restored. This is done
lazily for performance reasons.

Signed-off-by: Jack Miller
Signed-off-by: Michael Neuling
---
 arch/powerpc/include/asm/processor.h |  2 ++
 arch/powerpc/include/asm/reg.h       |  5 +++++
 arch/powerpc/kernel/process.c        | 18 ++++++++++++++++++
 arch/powerpc/kernel/traps.c          |  9 +++++++++
 4 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 1833fe9..ac7670d 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -314,6 +314,8 @@ struct thread_struct {
 	unsigned long	mmcr2;
 	unsigned	mmcr0;
 	unsigned	used_ebb;
+	unsigned long	lmrr;
+	unsigned long	lmser;
 #endif
 };
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f4..ce44fe2 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -282,6 +282,8 @@
 #define SPRN_HRMOR	0x139	/* Real mode offset register */
 #define SPRN_HSRR0	0x13A	/* Hypervisor Save/Restore 0 */
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
+#define SPRN_LMRR	0x32D	/* Load Monitor Region Register */
+#define SPRN_LMSER	0x32E	/* Load Monitor Section Enable Register */
 #define SPRN_IC		0x350	/* Virtual Instruction Count */
 #define SPRN_VTB	0x351	/* Virtual Time Base */
 #define SPRN_LDBAR	0x352	/* LD Base Address Register */
@@ -291,6 +293,7 @@
 #define SPRN_PMCR	0x374	/* Power Management Control Register */

 /* HFSCR and FSCR bit numbers are the same */
+#define FSCR_LM_LG	11	/* Enable Load Monitor Registers */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
 #define FSCR_EBB_LG	7	/* Enable Event Based Branching */
 #define FSCR_TM_LG	5	/* Enable Transactional Memory */
@@ -300,10 +303,12 @@
 #define FSCR_VECVSX_LG	1	/* Enable VMX/VSX */
 #define FSCR_FP_LG	0	/* Enable Floating Point */
 #define SPRN_FSCR	0x099	/* Facility Status & Control Register */
+#define FSCR_LM		__MASK(FSCR_LM_LG)
 #define FSCR_TAR	__MASK(FSCR_TAR_LG)
 #define FSCR_EBB	__MASK(FSCR_EBB_LG)
 #define FSCR_DSCR	__MASK(FSCR_DSCR_LG)
 #define SPRN_HFSCR	0xbe	/* HV=1 Facility Status & Control Register */
+#define HFSCR_LM	__MASK(FSCR_LM_LG)
 #define HFSCR_TAR	__MASK(FSCR_TAR_LG)
 #define HFSCR_EBB	__MASK(FSCR_EBB_LG)
 #define HFSCR_TM	__MASK(FSCR_TM_LG)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 74ea8db..2e22f60 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1009,6 +1009,14 @@ static inline void save_sprs(struct thread_struct *t)
 		 */
 		t->tar = mfspr(SPRN_TAR);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		/* Conditionally save Load Monitor registers, if enabled */
+		if (t->fscr & FSCR_LM) {
+			t->lmrr = mfspr(SPRN_LMRR);
+			t->lmser = mfspr(SPRN_LMSER);
+		}
+	}
 #endif
 }
@@ -1044,6 +1052,16 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 		if (old_thread->tar != new_thread->tar)
 			mtspr(SPRN_TAR, new_thread->tar);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		/* Conditionally restore Load Monitor registers, if enabled */
+		if (new_thread->fscr & FSCR_LM) {
+			if (old_thread->lmrr != new_thread->lmrr)
+				mtspr(SPRN_LMRR, new_thread->lmrr);
+			if (old_thread->lmser != new_thread->lmser)
+				mtspr(SPRN_LMSER, new_thread->lmser);
+		}
+	}
 #endif
 }
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 667cf78..b2e434b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1376,6 +1376,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
 		[FSCR_TM_LG] = "TM",
 		[FSCR_EBB_LG] = "EBB",
 		[FSCR_TAR_LG] = "TAR",
+		[FSCR_LM_LG] = "LM",
 	};
 	char *facility = "unknown";
 	u64 value;
@@ -1433,6 +1434,14 @@ void facility_unavailable_exception(struct pt_regs *regs)
 			emulate_single_s
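The traps.c hunk above is cut off in this copy. Going by the commit
message, the lazy-enable path it adds amounts to something like the
following sketch; the exact guard and control flow here are assumptions
reconstructed for illustration, not the literal patch text:

	/*
	 * Sketch of the lazy-enable path in
	 * facility_unavailable_exception(), reconstructed from the
	 * commit message; not the literal patch hunk.
	 */
	if (status == FSCR_LM_LG) {
		/*
		 * First use of the load monitor facility by this task:
		 * turn FSCR[LM] on for it permanently, so context
		 * switch will save/restore LMRR and LMSER from now on.
		 */
		current->thread.fscr |= FSCR_LM;
		mtspr(SPRN_FSCR, current->thread.fscr);
		return;
	}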
[PATCH v7 3/3] powerpc: Load Monitor Register Tests
From: Jack Miller

Adds two tests. One is a simple test to ensure that the new registers
LMRR and LMSER are properly maintained. The other actually uses the
existing EBB test infrastructure to test that LMRR and LMSER behave as
documented.

Signed-off-by: Jack Miller
Signed-off-by: Michael Neuling
---
 tools/testing/selftests/powerpc/pmu/ebb/.gitignore |   2 +
 tools/testing/selftests/powerpc/pmu/ebb/Makefile   |   2 +-
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c  | 143 +++++++++++++++++
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h  |  39 +++++
 .../selftests/powerpc/pmu/ebb/ebb_lmr_regs.c       |  37 +++++
 tools/testing/selftests/powerpc/reg.h              |   5 +
 6 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr_regs.c

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
index 42bddbe..44b7df1 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
+++ b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
@@ -20,3 +20,5 @@ back_to_back_ebbs_test
 lost_exception_test
 no_handler_test
 cycles_with_mmcr2_test
+ebb_lmr
+ebb_lmr_regs
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index 8d2279c4..6b0453e 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -14,7 +14,7 @@ TEST_PROGS := reg_access_test event_attributes_test cycles_test	\
 	 fork_cleanup_test ebb_on_child_test			\
 	 ebb_on_willing_child_test back_to_back_ebbs_test	\
 	 lost_exception_test no_handler_test			\
-	 cycles_with_mmcr2_test
+	 cycles_with_mmcr2_test ebb_lmr ebb_lmr_regs

 all: $(TEST_PROGS)
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
new file mode 100644
index 0000000..c47ebd5
--- /dev/null
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
@@ -0,0 +1,143 @@
+/*
+ * Copyright 2016, Jack Miller, IBM Corp.
+ * Licensed under GPLv2.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "ebb.h"
+#include "ebb_lmr.h"
+
+#define SIZE		(32 * 1024 * 1024)	/* 32M */
+#define LM_SIZE		0	/* Smallest encoding, 32M */
+
+#define SECTIONS	64	/* 1 per bit in LMSER */
+#define SECTION_SIZE	(SIZE / SECTIONS)
+#define SECTION_LONGS	(SECTION_SIZE / sizeof(long))
+
+static unsigned long *test_mem;
+
+static int lmr_count = 0;
+
+void ebb_lmr_handler(void)
+{
+	lmr_count++;
+}
+
+void ldmx_full_section(unsigned long *mem, int section)
+{
+	unsigned long *ptr;
+	int i;
+
+	for (i = 0; i < SECTION_LONGS; i++) {
+		ptr = &mem[(SECTION_LONGS * section) + i];
+		ldmx((unsigned long) &ptr);
+		ebb_lmr_reset();
+	}
+}
+
+unsigned long section_masks[] = {
+	0x8000000000000000,
+	0xFF00000000000000,
+	0x0000000F70000000,
+	0x8000000000000001,
+	0xF0F0F0F0F0F0F0F0,
+	0x0F0F0F0F0F0F0F0F,
+	0x0
+};
+
+int ebb_lmr_section_test(unsigned long *mem)
+{
+	unsigned long *mask = section_masks;
+	int i;
+
+	for (; *mask; mask++) {
+		mtspr(SPRN_LMSER, *mask);
+		printf("Testing mask 0x%016lx\n", mfspr(SPRN_LMSER));
+
+		for (i = 0; i < 64; i++) {
+			lmr_count = 0;
+			ldmx_full_section(mem, i);
+			if (*mask & (1UL << (63 - i)))
+				FAIL_IF(lmr_count != SECTION_LONGS);
+			else
+				FAIL_IF(lmr_count);
+		}
+	}
+
+	return 0;
+}
+
+int ebb_lmr(void)
+{
+	int i;
+
+	SKIP_IF(!lmr_is_supported());
+
+	setup_ebb_handler(ebb_lmr_handler);
+
+	ebb_global_enable();
+
+	FAIL_IF(posix_memalign((void **)&test_mem, SIZE, SIZE) != 0);
+
+	mtspr(SPRN_LMSER, 0);
+
+	FAIL_IF(mfspr(SPRN_LMSER) != 0);
+
+	mtspr(SPRN_LMRR, ((unsigned long)test_mem | LM_SIZE));
+
+	FAIL_IF(mfspr(SPRN_LMRR) != ((unsigned long)test_mem | LM_SIZE));
+
+	/* Read every single byte to ensure we get no false positives */
+	for (i = 0; i < SECTIONS; i++)
+		ldmx_full_section(test_mem, i);
+
+	FAIL_IF(lmr_count != 0);
+
+	/* Turn on the first section */
+
+	mtspr(SPRN_LMSER, (1UL << 63));
+	FAIL_IF(mfspr(SPRN_LMSER) != (1UL << 63));
+
+	/* Enable LM (BESCR) */
+
+	mtspr(SPRN_BESCR, mfspr(SPRN_BESCR) | BESCR_LME);
+	FAIL_IF(!(mfspr(SPRN_BESCR) & BESCR_LME));
+
+	ldmx((unsigned long)&test_mem);
+
+	FAIL_IF(lmr_count != 1);
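The rest of the test is cut off above. As a quick illustration of the
section arithmetic the test relies on, here is a small helper, a
hypothetical addition rather than part of the patch, mapping an address
inside the monitored region to its LMSER bit, assuming the 32M region /
64-section layout defined above:

	#include <stdint.h>

	#define LMR_SIZE	(32UL * 1024 * 1024)	/* 32M region, LM_SIZE 0 */
	#define LMR_SECTIONS	64			/* one LMSER bit each */

	/*
	 * Hypothetical helper: return the LMSER mask bit covering
	 * 'addr', given the base of the monitored region. LMSER uses
	 * IBM bit numbering, so bit 0 (the MSB) covers the first 512K
	 * section -- matching the (1UL << (63 - i)) test in
	 * ebb_lmr_section_test() above.
	 */
	static inline uint64_t lmser_bit_for(uintptr_t base, uintptr_t addr)
	{
		uintptr_t section = (addr - base) / (LMR_SIZE / LMR_SECTIONS);

		return 1ULL << (63 - section);
	}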
Re: [PATCH 1/5] selftests/powerpc: Check for VSX preservation across userspace preemption
On Thu, 2016-06-09 at 11:35 +1000, Daniel Axtens wrote:
> > +/*
> > + * Copyright 2015, Cyril Bur, IBM Corp.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> I realise this is well past a lost cause by now, but isn't the idea to
> be version 2, not version 2 or later?

No. I asked the powers that be and apparently for new code we're
supposed to use v2 or later.

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF
Naveen, can you point out where in the patch you update the variable
idx, a member of the codegen_context structure? Somehow I am unable to
figure it out. I can only see that we set it to 0 in the
bpf_int_jit_compile function. Since all your test cases pass, I am
clearly overlooking something.

Thanks
Nilay

On 7 June 2016 at 08:32, Naveen N. Rao wrote:
> PPC64 eBPF JIT compiler.
>
> Enable with:
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> or
> echo 2 > /proc/sys/net/core/bpf_jit_enable
>
> ... to see the generated JIT code. This can further be processed with
> tools/net/bpf_jit_disasm.
>
> With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
>
> ... on both ppc64 BE and LE.
>
> The details of the approach are documented through various comments in
> the code.
>
> Cc: Matt Evans
> Cc: Denis Kirjanov
> Cc: Michael Ellerman
> Cc: Paul Mackerras
> Cc: Alexei Starovoitov
> Cc: Daniel Borkmann
> Cc: "David S. Miller"
> Cc: Ananth N Mavinakayanahalli
> Signed-off-by: Naveen N. Rao
> ---
>  arch/powerpc/Kconfig                  |   3 +-
>  arch/powerpc/include/asm/asm-compat.h |   2 +
>  arch/powerpc/include/asm/ppc-opcode.h |  20 +-
>  arch/powerpc/net/Makefile             |   4 +
>  arch/powerpc/net/bpf_jit.h            |  53 +-
>  arch/powerpc/net/bpf_jit64.h          | 102 ++++
>  arch/powerpc/net/bpf_jit_asm64.S      | 180 +++++++
>  arch/powerpc/net/bpf_jit_comp64.c     | 956 ++++++++++++++++++++++
>  8 files changed, 1317 insertions(+), 3 deletions(-)
>  create mode 100644 arch/powerpc/net/bpf_jit64.h
>  create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
>  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 01f7464..ee82f9a 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -128,7 +128,8 @@ config PPC
>  	select IRQ_FORCED_THREADING
>  	select HAVE_RCU_TABLE_FREE if SMP
>  	select HAVE_SYSCALL_TRACEPOINTS
> -	select HAVE_CBPF_JIT
> +	select HAVE_CBPF_JIT if !PPC64
> +	select HAVE_EBPF_JIT if PPC64
>  	select HAVE_ARCH_JUMP_LABEL
>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>  	select ARCH_HAS_GCOV_PROFILE_ALL
> diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
> index dc85dcb..cee3aa0 100644
> --- a/arch/powerpc/include/asm/asm-compat.h
> +++ b/arch/powerpc/include/asm/asm-compat.h
> @@ -36,11 +36,13 @@
>  #define PPC_MIN_STKFRM	112
>
>  #ifdef __BIG_ENDIAN__
> +#define LHZX_BE	stringify_in_c(lhzx)
>  #define LWZX_BE	stringify_in_c(lwzx)
>  #define LDX_BE	stringify_in_c(ldx)
>  #define STWX_BE	stringify_in_c(stwx)
>  #define STDX_BE	stringify_in_c(stdx)
>  #else
> +#define LHZX_BE	stringify_in_c(lhbrx)
>  #define LWZX_BE	stringify_in_c(lwbrx)
>  #define LDX_BE	stringify_in_c(ldbrx)
>  #define STWX_BE	stringify_in_c(stwbrx)
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index fd8d640..6a77d130 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -142,9 +142,11 @@
>  #define PPC_INST_ISEL			0x7c00001e
>  #define PPC_INST_ISEL_MASK		0xfc00003e
>  #define PPC_INST_LDARX			0x7c0000a8
> +#define PPC_INST_STDCX			0x7c0001ad
>  #define PPC_INST_LSWI			0x7c0004aa
>  #define PPC_INST_LSWX			0x7c00042a
>  #define PPC_INST_LWARX			0x7c000028
> +#define PPC_INST_STWCX			0x7c00012d
>  #define PPC_INST_LWSYNC			0x7c2004ac
>  #define PPC_INST_SYNC			0x7c0004ac
>  #define PPC_INST_SYNC_MASK		0xfc0007fe
> @@ -211,8 +213,11 @@
>  #define PPC_INST_LBZ			0x88000000
>  #define PPC_INST_LD			0xe8000000
>  #define PPC_INST_LHZ			0xa0000000
> -#define PPC_INST_LHBRX			0x7c00062c
>  #define PPC_INST_LWZ			0x80000000
> +#define PPC_INST_LHBRX			0x7c00062c
> +#define PPC_INST_LDBRX			0x7c000428
> +#define PPC_INST_STB			0x98000000
> +#define PPC_INST_STH			0xb0000000
>  #define PPC_INST_STD			0xf8000000
>  #define PPC_INST_STDU			0xf8000001
>  #define PPC_INST_STW			0x90000000
> @@ -221,22 +226,34 @@
>  #define PPC_INST_MTLR			0x7c0803a6
>  #define PPC_INST_CMPWI			0x2c000000
>  #define PPC_INST_CMPDI			0x2c200000
> +#define PPC_INST_CMPW			0x7c000000
> +#define PPC_INST_CMPD			0x7c200000
>  #define PPC_INST_CMPLW			0x7c000040
> +#define PPC_INST_CMPLD			0x7c200040
>  #define PPC_INST_CMPLWI			0x28000000
> +#define PPC_INST_CMPLDI			0x28200000
> #
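On Nilay's question about idx: in this JIT the increment is typically
hidden inside the instruction-emission macros rather than written out
in the compile loop. A sketch of that pattern, along the lines of the
pre-existing arch/powerpc/net/bpf_jit.h (quoted from memory, so treat
the exact form as an assumption):

	/*
	 * Every generated instruction goes through PLANT_INSTR(),
	 * which bumps ctx->idx as a side effect. On the sizing pass
	 * 'image' is NULL, so nothing is written but idx still
	 * advances -- that is how the JIT measures the code size
	 * before allocating and filling the real image on the next
	 * pass.
	 */
	#define PLANT_INSTR(d, idx, instr)				\
		do { if (d) { (d)[idx] = instr; } idx++; } while (0)
	#define EMIT(instr)	PLANT_INSTR(image, ctx->idx, instr)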
Re: [PATCH] powerpc/mm: Use jump label to speed up radix_enabled check
On Wed, 2016-04-27 at 12:30 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt writes:
> >
> > On Wed, 2016-04-27 at 11:00 +1000, Balbir Singh wrote:
> > >
> > > Just basic testing across CPUs with various mm features
> > > enabled/disabled. Just for sanity
> > I still don't think it's worth scattering the change. Either the
> > jump label works or it doesn't ... The only problem is making sure
> > we identify all the pre-boot ones, but that's about it.
>
> There are two ways to do this. One is to follow the approach listed
> below done by Kevin, which is to do the jump_label_init early during
> boot and switch both cpu and mmu feature checks to plain jump labels.
>
> http://mid.gmane.org/1440415228-8006-1-git-send-email-haoke...@gmail.com
>
> I already found one use case of cpu_has_feature before that
> jump_label_init. In this approach we need to carefully audit all the
> cpu/mmu_has_feature calls to make sure they don't get called before
> jump_label_init. A missed conversion means we miss a cpu/mmu feature
> check.
>
> The other option is to follow the patch I posted above, with the
> simple change of renaming mmu_feature_enabled to mmu_has_feature. So
> we can use it in early boot without really worrying about when we
> init jump labels.
>
> What do you suggest we follow ?

So I really don't like your patch, sorry :-( It adds a whole new
section "_in_c", duplicates a lot of infrastructure somewhat
differently etc... ugh.

I'd rather we follow Kevin's approach and convert all the CPU/MMU/...
feature things to static keys in C. There aren't that many that need to
be done really early on, we can audit them.

I would suggest doing:

 1- Add __mmu_has_feature/__cpu_has_feature/... which initially is
    identical to the current one (or just make the current one use the
    __ variant).

 2- Convert selectively the early boot stuff to use __. There aren't
    *that* many, I can help you audit them.

 3- Add the static key version for all the non __.

Do you have time or should I look into this ?

Cheers,
Ben.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
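As a rough illustration of the static-key direction Ben is advocating,
a minimal sketch of what a converted mmu_has_feature() could look like.
The key array name, its size constant and the MMU_FTRS_POSSIBLE guard
are assumptions modeled on the existing feature-mask code, not anything
agreed in this thread:

	#include <linux/jump_label.h>

	/* One static key per possible MMU feature bit (assumed name/size). */
	extern struct static_key_true mmu_feature_keys[MAX_MMU_FEATURES];

	static __always_inline bool mmu_has_feature(unsigned long feature)
	{
		int i;

		/*
		 * Let the compiler discard checks for features this
		 * kernel configuration can never have.
		 */
		if (!(MMU_FTRS_POSSIBLE & feature))
			return false;

		i = __builtin_ctzl(feature);

		/*
		 * Patched into a straight-line branch once the keys are
		 * initialized; a non-static __mmu_has_feature() doing
		 * the plain mask test would still serve callers that
		 * run before jump_label_init().
		 */
		return static_branch_likely(&mmu_feature_keys[i]);
	}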
Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction
On Thu, Jun 02, 2016 at 07:38:58AM -0500, Shreyas B. Prabhu wrote:
...
> +/* Power Management - PSSCR Fields */

It might be nice to give the full name of the register, as below with
the FPSCR.

> +#define PSSCR_RL_MASK		0x0000000F
> +#define PSSCR_MTL_MASK		0x000000F0
> +#define PSSCR_TR_MASK		0x00000300
> +#define PSSCR_PSLL_MASK		0x000F0000
> +#define PSSCR_EC		0x00100000
> +#define PSSCR_ESL		0x00200000
> +#define PSSCR_SD		0x00400000
> +
> +
>  /* Floating Point Status and Control Register (FPSCR) Fields */

Cheers,
Sam.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
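For context on how these fields get used, a hedged sketch of composing
a PSSCR value to request a given stop level. The field placement and
the EC/ESL policy here are illustrative assumptions, not taken from the
patch under review; the actual platform code decides MTL/TR/PSLL
policy:

	/*
	 * Sketch only: request entry to stop level 'rl'. EC and ESL
	 * usage is an assumption for illustration.
	 */
	static inline void enter_stop(unsigned long rl)
	{
		unsigned long psscr;

		psscr = PSSCR_EC | PSSCR_ESL |	/* exit criterion, enable state loss */
			(rl & PSSCR_RL_MASK);	/* requested level in the low nibble */

		mtspr(SPRN_PSSCR, psscr);
		/* ... then execute the stop instruction */
	}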