[RFC] Implementing HUGEPAGE on MPC 8xx

2016-06-08 Thread Christophe Leroy

Hello,

MPC 8xx has several page sizes: 4k, 16k, 512k and 8M.
Today, the 4k and 16k sizes are implemented as normal page sizes, and 8M is
used for mapping the linear memory space in the kernel.


I'd like to implement HUGE PAGE to reduce TLB misses from user apps.

In 4k mode, the PAGE offset is 12 bits, the PTE offset is 10 bits and the
PGD offset is 10 bits.
In 16k mode, the PAGE offset is 14 bits, the PTE offset is 12 bits and the
PGD offset is 6 bits.


In 4k mode, we could use 512k HUGE PAGEs and have an HPAGE offset of 19
bits, hence an HPTE offset of 3 bits and a PGD offset of 10 bits.

In 16k mode, we could use both 512k and 8M HUGE PAGEs and have:
* For 512k: an HPAGE offset of 19 bits, hence an HPTE offset of 7 bits and
a PGD offset of 6 bits
* For 8M: an HPAGE offset of 23 bits, hence an HPTE offset of 3 bits and
a PGD offset of 6 bits
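The bit accounting above can be sketched as follows (illustration only; the struct and function names are invented for this sketch and are not kernel code — the point is simply that on the 8xx's 32-bit virtual address space, whatever is not PGD index or page offset is left over for the (huge) page table index):

```c
#include <assert.h>

/* Derive per-level index widths for a two-level 32-bit page table,
 * given the PGD index width and the (huge) page shift. */
struct pgtable_geom {
	int pgd_bits;   /* index bits in the page directory */
	int pte_bits;   /* index bits in the (huge) page table */
	int page_bits;  /* offset bits within a (huge) page */
};

static struct pgtable_geom geom(int pgd_bits, int page_shift)
{
	struct pgtable_geom g = {
		.pgd_bits  = pgd_bits,
		.page_bits = page_shift,
		/* 32-bit virtual addresses on the 8xx */
		.pte_bits  = 32 - pgd_bits - page_shift,
	};
	return g;
}
```

Plugging in the numbers quoted above reproduces each case, e.g. `geom(10, 19)` gives the 3-bit HPTE offset for 512k pages in 4k mode.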


I see in the current ppc kernel that for PPC32, SYS_SUPPORTS_HUGETLBFS 
is selected only if we have PHYS_64BIT.
What is the reason for only implementing HUGETLBFS with 64-bit physical
addresses?


From your point of view, what would be the best approach to extend 
support of HUGE PAGES to PPC_8xx?
Would a good starting point be to implement a hugetlbpage-8xx.c based on 
hugetlbpage-book3e.c?


Christophe

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH, RFC] cxl: Add support for CAPP DMA mode

2016-06-08 Thread Stewart Smith
Ian Munsie  writes:
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 3a5ea82..5a42e98 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2793,7 +2793,9 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
>   pe_info(pe, "Switching PHB to CXL\n");
>  
>   rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
> - if (rc)
> + if (rc == OPAL_UNSUPPORTED)
> + dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
> + else if (rc)
>   dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);

Could mention version required, which would be skiboot 5.3.x or higher.

This could be something we start doing - there are enough random bits of
functionality that we could tell the user exactly what they have to
upgrade to in order to make things work.
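The suggested wording could look like this (sketch only, written buffer-based so it runs in userspace; the real `OPAL_UNSUPPORTED` value comes from the OPAL headers and is stubbed here, and the exact message text is illustrative):

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the OPAL return code; not the real header value. */
#define OPAL_UNSUPPORTED_STANDIN (-7)

/* Format the error with the minimum firmware version spelled out,
 * so the user knows exactly what to upgrade to. */
static int format_cxl_mode_error(char *buf, size_t len, long rc)
{
	if (rc == OPAL_UNSUPPORTED_STANDIN)
		return snprintf(buf, len, "Required cxl mode not supported by "
				"firmware - update to skiboot 5.3.x or later");
	return snprintf(buf, len, "opal_pci_set_phb_cxl_mode failed: %ld", rc);
}
```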

-- 
Stewart Smith
OPAL Architect, IBM.


Re: [RESEND PATCH v2 0/6] vfio-pci: Add support for mmapping MSI-X table

2016-06-08 Thread Auger Eric
Hi Yongji,

On 02/06/2016 at 08:09, Yongji Xie wrote:
> The current vfio-pci implementation disallows mmapping the page
> containing the MSI-X table, because users could otherwise write
> directly to the MSI-X table and generate incorrect MSIs.
> 
> However, this causes a performance issue when there are
> critical device registers in the same page as the 
> MSI-X table: we have to handle the MMIO access to these
> registers in QEMU emulation rather than in the guest.
> 
> To solve this issue, this series allows the MSI-X table to be exposed
> to userspace when the hardware supports interrupt remapping,
> which ensures that a given PCI device can only trigger the
> MSIs assigned to it. We introduce a new bus flag,
> PCI_BUS_FLAGS_MSI_REMAP, to test this capability on the PCI side
> across different archs.
> 
> Patch 3 is based on the proposed patchset [1].
You may have noticed I sent a respin of [1] yesterday:
http://www.gossamer-threads.com/lists/linux/kernel/2455187.

Unfortunately you will see I removed the patch defining the new
msi_domain_info MSI_FLAG_IRQ_REMAPPING flag you rely on in this series.
I did so because I was not using it anymore. At the beginning it was
used to detect whether MSI assignment was safe, but this method was
also covering cases where the MSI controller was upstream of the
IOMMU. So now I rely on a mechanism where MSI controllers are supposed
to register their MSI doorbells and tag whether they are safe.

I don't know yet how this change will be welcomed though. Depending
on reviews/discussions, it might happen that we revert to the previous
flag.

If you need the feature you can embed the patches you need in your
series and follow the review process separately. Sorry for the setback.

Best Regards

Eric
> 
> Changelog v2: 
> - Make the commit log clearer
> - Replace pci_bus_check_msi_remapping() with pci_bus_msi_isolated()
>   so that the function's purpose is clear from its name
> - Set PCI_BUS_FLAGS_MSI_REMAP in pci_create_root_bus() instead
>   of iommu_bus_notifier()
> - Reserve VFIO_REGION_INFO_FLAG_CAPS when we allow mmapping the MSI-X
>   table so that QEMU can know whether mmapping the MSI-X table
>   is allowed
> 
> [1] 
> https://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg1138820.html
> 
> Yongji Xie (6):
>   PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
>   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller enables IRQ remapping
>   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping
>   iommu: Set PCI_BUS_FLAGS_MSI_REMAP on iommu driver initialization
>   pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
>   vfio-pci: Allow to expose MSI-X table to userspace if interrupt remapping 
> is enabled
> 
>  arch/powerpc/platforms/powernv/pci-ioda.c |8 
>  drivers/iommu/iommu.c |8 
>  drivers/pci/msi.c |   15 +++
>  drivers/pci/probe.c   |7 +++
>  drivers/vfio/pci/vfio_pci.c   |   17 ++---
>  drivers/vfio/pci/vfio_pci_rdwr.c  |3 ++-
>  include/linux/msi.h   |5 -
>  include/linux/pci.h   |1 +
>  8 files changed, 59 insertions(+), 5 deletions(-)
> 

Re: [PATCH v2 0/7] crypto: talitos - implementation of AEAD for SEC1

2016-06-08 Thread Herbert Xu
On Mon, Jun 06, 2016 at 01:20:31PM +0200, Christophe Leroy wrote:
> This set of patches provides the implementation of AEAD for
> talitos SEC1.

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RESEND PATCH v2 0/6] vfio-pci: Add support for mmapping MSI-X table

2016-06-08 Thread Yongji Xie

Hi, Eric

On 2016/6/8 15:41, Auger Eric wrote:


Hi Yongji,

On 02/06/2016 at 08:09, Yongji Xie wrote:

The current vfio-pci implementation disallows mmapping the page
containing the MSI-X table, because users could otherwise write
directly to the MSI-X table and generate incorrect MSIs.

However, this causes a performance issue when there are
critical device registers in the same page as the
MSI-X table: we have to handle the MMIO access to these
registers in QEMU emulation rather than in the guest.

To solve this issue, this series allows the MSI-X table to be exposed
to userspace when the hardware supports interrupt remapping,
which ensures that a given PCI device can only trigger the
MSIs assigned to it. We introduce a new bus flag,
PCI_BUS_FLAGS_MSI_REMAP, to test this capability on the PCI side
across different archs.
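The gate this flag provides can be modelled in a few lines (userspace model only — the flag's value and the `bus_model` struct are invented for the sketch, not the kernel's definitions): the MSI-X table page is only exposed when the bus guarantees MSI isolation, so a stray write cannot forge another device's interrupts.

```c
#include <stdbool.h>

#define PCI_BUS_FLAGS_MSI_REMAP 0x4 /* stand-in value */

struct bus_model {
	unsigned int bus_flags;
};

/* Allow mmapping the MSI-X table page only when the bus has
 * interrupt remapping (MSI isolation). */
static bool msix_mmap_allowed(const struct bus_model *bus)
{
	return (bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP) != 0;
}
```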

Patch 3 is based on the proposed patchset [1].

You may have noticed I sent a respin of [1] yesterday:
http://www.gossamer-threads.com/lists/linux/kernel/2455187.

Unfortunately you will see I removed the patch defining the new
msi_domain_info MSI_FLAG_IRQ_REMAPPING flag you rely on in this series.
I did so because I was not using it anymore. At the beginning it was
used to detect whether MSI assignment was safe, but this method was
also covering cases where the MSI controller was upstream of the
IOMMU. So now I rely on a mechanism where MSI controllers are supposed
to register their MSI doorbells and tag whether they are safe.

I don't know yet how this change will be welcomed though. Depending
on reviews/discussions, it might happen that we revert to the previous
flag.

If you need the feature you can embed the patches you need in your
series and follow the review process separately. Sorry for the setback.


Thanks for your notification. I'd better wait until your patches get
settled. Then I will know exactly which mechanism to use to test the
capability of interrupt remapping on ARM in my series.

Thanks,
Yongji


[PATCH v6 0/3] POWER9 Load Monitor Support

2016-06-08 Thread Michael Neuling
This patch series adds support for the POWER9 Load Monitor
instruction (ldmx), based on work from Jack Miller.

The first patch is a clean up of the FSCR handling. The second patch
adds the actual ldmx support to the kernel. The third patch is a
couple of ldmx selftests.

v6:
  - PATCH 1/3:
- Suggestions from mpe.
- Init the FSCR using the existing INIT_THREAD macro rather than an
  init_fscr() function.
- Set the FSCR when taking a DSCR exception in
  facility_unavailable_exception().
  - PATCH 2/3:
- Remove erroneous semicolons in restore_sprs().
  - PATCH 3/3:
- no change.

v5:
  - PATCH 1/3:
- Made the FSCR cleanup more extensive.
  - PATCH 2/3:
- Moves FSCR_LM clearing to new init_fscr().
  - PATCH 3/3:
- Added test cases to .gitignore.
- Removed test against PPC_FEATURE2_EBB since it's not needed.
- Added parentheses around the input parameter usage in the LDMX() macro.

Jack Miller (2):
  powerpc: Load Monitor Register Support
  powerpc: Load Monitor Register Tests

Michael Neuling (1):
  powerpc: Improve FSCR init and context switching

[PATCH v6 1/3] powerpc: Improve FSCR init and context switching

2016-06-08 Thread Michael Neuling
This fixes a few issues with FSCR init and switching.

In this patch:
powerpc: Create context switch helpers save_sprs() and restore_sprs()
Author: Anton Blanchard 
commit 152d523e6307c7152f9986a542f873b5c5863937
We moved the setting of the FSCR register from inside a
CPU_FTR_ARCH_207S section to inside just a CPU_FTR_DSCR section.
Hence we are setting the FSCR on POWER6/7, where the FSCR doesn't
exist. This is harmless, but we shouldn't do it.

Also, we can simplify the FSCR context switch. We don't need to go
through the calculation involving dscr_inherit. We can just restore
what we saved last time.
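The simplification can be modelled like this (toy userspace model; the struct, globals and function names are invented for the sketch): instead of rebuilding the FSCR from dscr_inherit on every switch, just compare the saved per-thread values and write the SPR only when they differ.

```c
struct thread_model {
	unsigned long fscr;
};

static unsigned long spr_fscr;  /* stands in for the real SPR */
static int spr_writes;          /* counts mtspr-equivalent writes */

/* Restore the new thread's saved FSCR, skipping the (expensive)
 * SPR write when old and new values already match. */
static void restore_fscr(const struct thread_model *old_t,
			 const struct thread_model *new_t)
{
	if (old_t->fscr != new_t->fscr) {
		spr_fscr = new_t->fscr;
		spr_writes++;
	}
}
```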

Also, we currently don't explicitly init the FSCR for userspace
applications. We init the FSCR on boot in __init_fscr: and the first
task then inherits based on that. This works, but is delicate. This
patch adds the initial fscr value to INIT_THREAD to explicitly set the
FSCR for userspace applications and removes the __init_fscr: boot-time
init.

Based on patch by Jack Miller.

Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h  |  1 +
 arch/powerpc/kernel/cpu_setup_power.S | 10 --
 arch/powerpc/kernel/process.c | 12 
 arch/powerpc/kernel/traps.c   |  3 ++-
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 009fab1..1833fe9 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -347,6 +347,7 @@ struct thread_struct {
.fs = KERNEL_DS, \
.fpexc_mode = 0, \
.ppr = INIT_PPR, \
+   .fscr = FSCR_TAR | FSCR_EBB \
 }
 #endif
 
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 584e119..75f98c8 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -49,7 +49,6 @@ _GLOBAL(__restore_cpu_power7)
 
 _GLOBAL(__setup_cpu_power8)
	mflr	r11
-   bl  __init_FSCR
bl  __init_PMU
bl  __init_hvmode_206
	mtlr	r11
@@ -67,7 +66,6 @@ _GLOBAL(__setup_cpu_power8)
 
 _GLOBAL(__restore_cpu_power8)
	mflr	r11
-   bl  __init_FSCR
bl  __init_PMU
mfmsr   r3
rldicl. r0,r3,4,63
@@ -86,7 +84,6 @@ _GLOBAL(__restore_cpu_power8)
 
 _GLOBAL(__setup_cpu_power9)
	mflr	r11
-   bl  __init_FSCR
bl  __init_hvmode_206
	mtlr	r11
beqlr
@@ -102,7 +99,6 @@ _GLOBAL(__setup_cpu_power9)
 
 _GLOBAL(__restore_cpu_power9)
	mflr	r11
-   bl  __init_FSCR
mfmsr   r3
rldicl. r0,r3,4,63
	mtlr	r11
@@ -155,12 +151,6 @@ __init_LPCR:
isync
blr
 
-__init_FSCR:
-   mfspr   r3,SPRN_FSCR
-   ori r3,r3,FSCR_TAR|FSCR_DSCR|FSCR_EBB
-   mtspr   SPRN_FSCR,r3
-   blr
-
 __init_HFSCR:
mfspr   r3,SPRN_HFSCR
ori r3,r3,HFSCR_TAR|HFSCR_TM|HFSCR_BHRB|HFSCR_PM|\
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index e2f12cb..74ea8db 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1023,18 +1023,11 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 #ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_DSCR)) {
u64 dscr = get_paca()->dscr_default;
-   u64 fscr = old_thread->fscr & ~FSCR_DSCR;
-
-   if (new_thread->dscr_inherit) {
+   if (new_thread->dscr_inherit)
dscr = new_thread->dscr;
-   fscr |= FSCR_DSCR;
-   }
 
if (old_thread->dscr != dscr)
mtspr(SPRN_DSCR, dscr);
-
-   if (old_thread->fscr != fscr)
-   mtspr(SPRN_FSCR, fscr);
}
 
if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
@@ -1045,6 +1038,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
if (old_thread->ebbrr != new_thread->ebbrr)
mtspr(SPRN_EBBRR, new_thread->ebbrr);
 
+   if (old_thread->fscr != new_thread->fscr)
+   mtspr(SPRN_FSCR, new_thread->fscr);
+
if (old_thread->tar != new_thread->tar)
mtspr(SPRN_TAR, new_thread->tar);
}
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 9229ba6..a4b00ee 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1418,7 +1418,8 @@ void facility_unavailable_exception(struct pt_regs *regs)
rd = (instword >> 21) & 0x1f;
current->thread.dscr = regs->gpr[rd];
current->thread.dscr_inherit = 1;
-   mtspr(SPRN_FSCR, value | FSCR_DSCR);
+   current->thread.fscr = value | FSCR_DSCR;
+   mtspr(SPRN_FSCR, current->thread.fscr);
}
 

[PATCH v6 2/3] powerpc: Load Monitor Register Support

2016-06-08 Thread Michael Neuling
From: Jack Miller 

This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.
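For reference, the bit numbers this series uses translate to masks the same way the kernel's reg.h does it — a `_LG` bit number plus a `__MASK()` built from it, so FSCR and HFSCR can share the numbers. This restates the defines from the patch below as a standalone snippet (`__MASK` is reproduced here so it compiles on its own):

```c
#define __MASK(X)	(1UL << (X))

#define FSCR_LM_LG	11	/* Enable Load Monitor Registers */
#define FSCR_TAR_LG	8	/* Enable Target Address Register */
#define FSCR_EBB_LG	7	/* Enable Event Based Branching */

#define FSCR_LM		__MASK(FSCR_LM_LG)
#define FSCR_TAR	__MASK(FSCR_TAR_LG)
#define FSCR_EBB	__MASK(FSCR_EBB_LG)
```

Note how `FSCR_TAR | FSCR_EBB` (without `FSCR_LM`) is exactly the INIT_THREAD value patch 1/3 installs, which is what leaves LM disabled at task init.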

This patch disables the FSCR LM control bit on task init and enables
the bit when a load-monitor facility-unavailable exception is taken on
first use. On context switch, the bit is then used to determine
whether the two relevant registers are saved and restored. This is
done lazily for performance reasons.
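The lazy save side of this can be modelled as follows (userspace model only; the struct and globals are invented for the sketch): the LMRR/LMSER reads on context switch are skipped entirely unless the task's FSCR shows it has used the load-monitor facility.

```c
#include <stdbool.h>

#define FSCR_LM (1UL << 11)

struct lm_thread {
	unsigned long fscr;
	unsigned long lmrr;
	unsigned long lmser;
};

static unsigned long hw_lmrr, hw_lmser;	/* stand in for the SPRs */

/* Save LMRR/LMSER only if this task has the LM facility enabled;
 * returns whether anything was saved. */
static bool save_lm_regs(struct lm_thread *t)
{
	if (!(t->fscr & FSCR_LM))
		return false;	/* facility never used: skip the reads */
	t->lmrr = hw_lmrr;
	t->lmser = hw_lmser;
	return true;
}
```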

Signed-off-by: Jack Miller 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h |  2 ++
 arch/powerpc/include/asm/reg.h   |  5 +
 arch/powerpc/kernel/process.c| 18 ++
 arch/powerpc/kernel/traps.c  |  4 
 4 files changed, 29 insertions(+)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 1833fe9..ac7670d 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -314,6 +314,8 @@ struct thread_struct {
unsigned long   mmcr2;
unsignedmmcr0;
unsignedused_ebb;
+   unsigned long   lmrr;
+   unsigned long   lmser;
 #endif
 };
 
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f4..ce44fe2 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -282,6 +282,8 @@
 #define SPRN_HRMOR 0x139   /* Real mode offset register */
 #define SPRN_HSRR0 0x13A   /* Hypervisor Save/Restore 0 */
 #define SPRN_HSRR1 0x13B   /* Hypervisor Save/Restore 1 */
+#define SPRN_LMRR  0x32D   /* Load Monitor Region Register */
+#define SPRN_LMSER 0x32E   /* Load Monitor Section Enable Register */
 #define SPRN_IC0x350   /* Virtual Instruction Count */
 #define SPRN_VTB   0x351   /* Virtual Time Base */
 #define SPRN_LDBAR 0x352   /* LD Base Address Register */
@@ -291,6 +293,7 @@
 #define SPRN_PMCR  0x374   /* Power Management Control Register */
 
 /* HFSCR and FSCR bit numbers are the same */
+#define FSCR_LM_LG 11  /* Enable Load Monitor Registers */
 #define FSCR_TAR_LG8   /* Enable Target Address Register */
 #define FSCR_EBB_LG7   /* Enable Event Based Branching */
 #define FSCR_TM_LG 5   /* Enable Transactional Memory */
@@ -300,10 +303,12 @@
 #define FSCR_VECVSX_LG 1   /* Enable VMX/VSX  */
 #define FSCR_FP_LG 0   /* Enable Floating Point */
 #define SPRN_FSCR  0x099   /* Facility Status & Control Register */
+#define   FSCR_LM  __MASK(FSCR_LM_LG)
 #define   FSCR_TAR __MASK(FSCR_TAR_LG)
 #define   FSCR_EBB __MASK(FSCR_EBB_LG)
 #define   FSCR_DSCR__MASK(FSCR_DSCR_LG)
 #define SPRN_HFSCR 0xbe/* HV=1 Facility Status & Control Register */
+#define   HFSCR_LM __MASK(FSCR_LM_LG)
 #define   HFSCR_TAR__MASK(FSCR_TAR_LG)
 #define   HFSCR_EBB__MASK(FSCR_EBB_LG)
 #define   HFSCR_TM __MASK(FSCR_TM_LG)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 74ea8db..2e22f60 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1009,6 +1009,14 @@ static inline void save_sprs(struct thread_struct *t)
 */
t->tar = mfspr(SPRN_TAR);
}
+
+   if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+   /* Conditionally save Load Monitor registers, if enabled */
+   if (t->fscr & FSCR_LM) {
+   t->lmrr = mfspr(SPRN_LMRR);
+   t->lmser = mfspr(SPRN_LMSER);
+   }
+   }
 #endif
 }
 
@@ -1044,6 +1052,16 @@ static inline void restore_sprs(struct thread_struct *old_thread,
if (old_thread->tar != new_thread->tar)
mtspr(SPRN_TAR, new_thread->tar);
}
+
+   if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+   /* Conditionally restore Load Monitor registers, if enabled */
+   if (new_thread->fscr & FSCR_LM) {
+   if (old_thread->lmrr != new_thread->lmrr)
+   mtspr(SPRN_LMRR, new_thread->lmrr);
+   if (old_thread->lmser != new_thread->lmser)
+   mtspr(SPRN_LMSER, new_thread->lmser);
+   }
+   }
 #endif
 }
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a4b00ee..aabdeac 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1376,6 +1376,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
[FSCR_TM_LG] = "TM",
[FSCR_EBB_LG] = "EBB",
[FSCR_TAR_LG] = "TAR",
+   [FSCR_LM_LG] = "LM",
};
char *facility = "unknown";
u64 value;
@@ -1433,6 +1434,9 @@ void facility_unavailable_exception(struct pt_regs *regs)
emulate_single_step(re

[PATCH v6 3/3] powerpc: Load Monitor Register Tests

2016-06-08 Thread Michael Neuling
From: Jack Miller 

Adds two tests. One is a simple test to ensure that the new registers
LMRR and LMSER are properly maintained. The other actually uses the
existing EBB test infrastructure to test that LMRR and LMSER behave as
documented.

Signed-off-by: Jack Miller 
Signed-off-by: Michael Neuling 
---
 tools/testing/selftests/powerpc/pmu/ebb/.gitignore |   2 +
 tools/testing/selftests/powerpc/pmu/ebb/Makefile   |   2 +-
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c  | 143 +
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h  |  39 ++
 .../selftests/powerpc/pmu/ebb/ebb_lmr_regs.c   |  37 ++
 tools/testing/selftests/powerpc/reg.h  |   5 +
 6 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr_regs.c

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
index 42bddbe..44b7df1 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
+++ b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
@@ -20,3 +20,5 @@ back_to_back_ebbs_test
 lost_exception_test
 no_handler_test
 cycles_with_mmcr2_test
+ebb_lmr
+ebb_lmr_regs
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index 8d2279c4..6b0453e 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -14,7 +14,7 @@ TEST_PROGS := reg_access_test event_attributes_test 
cycles_test   \
 fork_cleanup_test ebb_on_child_test\
 ebb_on_willing_child_test back_to_back_ebbs_test   \
 lost_exception_test no_handler_test\
-cycles_with_mmcr2_test
+cycles_with_mmcr2_test ebb_lmr ebb_lmr_regs
 
 all: $(TEST_PROGS)
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
new file mode 100644
index 000..c47ebd5
--- /dev/null
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
@@ -0,0 +1,143 @@
+/*
+ * Copyright 2016, Jack Miller, IBM Corp.
+ * Licensed under GPLv2.
+ */
+
+#include 
+#include 
+
+#include "ebb.h"
+#include "ebb_lmr.h"
+
+#define SIZE   (32 * 1024 * 1024)  /* 32M */
+#define LM_SIZE0   /* Smallest encoding, 32M */
+
+#define SECTIONS   64  /* 1 per bit in LMSER */
+#define SECTION_SIZE   (SIZE / SECTIONS)
+#define SECTION_LONGS   (SECTION_SIZE / sizeof(long))
+
+static unsigned long *test_mem;
+
+static int lmr_count = 0;
+
+void ebb_lmr_handler(void)
+{
+   lmr_count++;
+}
+
+void ldmx_full_section(unsigned long *mem, int section)
+{
+   unsigned long *ptr;
+   int i;
+
+   for (i = 0; i < SECTION_LONGS; i++) {
+   ptr = &mem[(SECTION_LONGS * section) + i];
+   ldmx((unsigned long) &ptr);
+   ebb_lmr_reset();
+   }
+}
+
+unsigned long section_masks[] = {
+   0x8000,
+   0xFF00,
+   0x000F7000,
+   0x8001,
+   0xF0F0F0F0F0F0F0F0,
+   0x0F0F0F0F0F0F0F0F,
+   0x0
+};
+
+int ebb_lmr_section_test(unsigned long *mem)
+{
+   unsigned long *mask = section_masks;
+   int i;
+
+   for (; *mask; mask++) {
+   mtspr(SPRN_LMSER, *mask);
+   printf("Testing mask 0x%016lx\n", mfspr(SPRN_LMSER));
+
+   for (i = 0; i < 64; i++) {
+   lmr_count = 0;
+   ldmx_full_section(mem, i);
+   if (*mask & (1UL << (63 - i)))
+   FAIL_IF(lmr_count != SECTION_LONGS);
+   else
+   FAIL_IF(lmr_count);
+   }
+   }
+
+   return 0;
+}
+
+int ebb_lmr(void)
+{
+   int i;
+
+   SKIP_IF(!lmr_is_supported());
+
+   setup_ebb_handler(ebb_lmr_handler);
+
+   ebb_global_enable();
+
+   FAIL_IF(posix_memalign((void **)&test_mem, SIZE, SIZE) != 0);
+
+   mtspr(SPRN_LMSER, 0);
+
+   FAIL_IF(mfspr(SPRN_LMSER) != 0);
+
+   mtspr(SPRN_LMRR, ((unsigned long)test_mem | LM_SIZE));
+
+   FAIL_IF(mfspr(SPRN_LMRR) != ((unsigned long)test_mem | LM_SIZE));
+
+   /* Read every single byte to ensure we get no false positives */
+   for (i = 0; i < SECTIONS; i++)
+   ldmx_full_section(test_mem, i);
+
+   FAIL_IF(lmr_count != 0);
+
+   /* Turn on the first section */
+
+   mtspr(SPRN_LMSER, (1UL << 63));
+   FAIL_IF(mfspr(SPRN_LMSER) != (1UL << 63));
+
+   /* Enable LM (BESCR) */
+
+   mtspr(SPRN_BESCR, mfspr(SPRN_BESCR) | BESCR_LME);
+   FAIL_IF(!(mfspr(SPRN_BESCR) & BESCR_LME));
+
+   ldmx((unsigned long)&test_mem);
+
+   FAIL_IF(lmr_count != 1);

Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
> On 31/05/16 20:32, Michael Ellerman wrote:
> > On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
> > > On 31.05.2016 12:04, Michael Ellerman wrote:
> > > > On Tue, 2016-05-31 at 07:51 +0200, Thomas Huth wrote:
> > > > > If we do not provide the PVR for POWER8NVL, a guest on this
> > > > > system currently ends up in PowerISA 2.06 compatibility mode on
> > > > > KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet.
> > > > > So some new instructions from POWER8 (like "mtvsrd") get disabled
> > > > > for the guest, resulting in crashes when using code compiled
> > > > > explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC).
> > > > > 
> > > > > Signed-off-by: Thomas Huth 
> > > > 
> > > > So this should say:
> > > > 
> > > >   Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")
> > > > 
> > > > And therefore:
> > > > 
> > > >   Cc: sta...@vger.kernel.org # v4.0+
> > > > 
> > > > Am I right?
> > > 
> > > Right. (At least for virtualized systems ... for bare-metal systems,
> > > that original patch was enough). So shall I resubmit my patch with these
> > > two lines, or could you add them when you pick this patch up?
> > 
> > Thanks, I'll add them here.
> 
> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?

Yep, patch sent this morning.

cheers


Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call

2016-06-08 Thread Thomas Huth
On 08.06.2016 03:14, Balbir Singh wrote:
> 
> On 31/05/16 20:32, Michael Ellerman wrote:
>> On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
>>> On 31.05.2016 12:04, Michael Ellerman wrote:
 On Tue, 2016-05-31 at 07:51 +0200, Thomas Huth wrote:
> If we do not provide the PVR for POWER8NVL, a guest on this
> system currently ends up in PowerISA 2.06 compatibility mode on
> KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet.
> So some new instructions from POWER8 (like "mtvsrd") get disabled
> for the guest, resulting in crashes when using code compiled
> explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC).
>
> Signed-off-by: Thomas Huth 

 So this should say:

   Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")

 And therefore:

   Cc: sta...@vger.kernel.org # v4.0+

 Am I right?
>>>
>>> Right. (At least for virtualized systems ... for bare-metal systems,
>>> that original patch was enough). So shall I resubmit my patch with these
>>> two lines, or could you add them when you pick this patch up?
>>
>> Thanks, I'll add them here.
> 
> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?

D'oh! You're right, that needs to be changed, too! I'll send a fixup
patch once I've tested it...

By the way, there already seems to be a check for
ibm_architecture_vec[IBM_ARCH_VEC_NRCORES_OFFSET] != NR_CPUS in
prom_send_capabilities(), but it only prints a warning which easily
gets lost in the kernel log ... I wonder whether we should rather stop
the boot there instead, to catch this problem more easily?

 Thomas


Re: [PATCH] powerpc: Fix IBM_ARCH_VEC_NRCORES_OFFSET value

2016-06-08 Thread Thomas Huth
On 08.06.2016 00:51, Benjamin Herrenschmidt wrote:
> Commit 7cc851039d643a2ee7df4d18177150f2c3a484f5 ("powerpc/pseries: Add
> POWER8NVL support to ibm,client-architecture-support call") introduced
> a regression by adding fields to the beginning of the ibm_architecture_vec
> structure without updating IBM_ARCH_VEC_NRCORES_OFFSET.
> 
> This causes the kernel to print a warning at boot and to fail to adjust
> the number of cores based on the number of threads before doing the CAS
> call to firmware.
> 
> This is quite a fragile piece of code, sadly; we should try to find a way
> to avoid that hard-coded offset at some point, but for now this fixes it.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> ---
> 
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index ccd2037..6ee4b72 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -719,7 +719,7 @@ unsigned char ibm_architecture_vec[] = {
>* must match by the macro below. Update the definition if
>* the structure layout changes.
>*/
> -#define IBM_ARCH_VEC_NRCORES_OFFSET  125
> +#define IBM_ARCH_VEC_NRCORES_OFFSET  133
>   W(NR_CPUS), /* number of cores supported */
>   0,
>   0,
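As an aside on avoiding the hard-coded offset: one direction (a sketch only — the layout and field names below are invented, with sizes chosen so the offset lands at 133, the value this fix installs; it is not the fix that was merged) is to describe the vector as a packed struct so the compiler computes the core-count offset and adding fields can never leave a stale constant behind.

```c
#include <stddef.h>
#include <stdint.h>

struct __attribute__((packed)) fake_arch_vec {
	uint8_t  header[5];
	uint8_t  option_vectors[128];	/* grows as new entries are added */
	uint32_t nr_cores;		/* the W(NR_CPUS) word */
};

/* Computed by the compiler instead of maintained by hand. */
#define NRCORES_OFFSET offsetof(struct fake_arch_vec, nr_cores)
```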

Yes, that should be the right offset now!

Please also add "Cc: sta...@vger.kernel.org # v4.0+" to the patch since
the commit 7cc851039d64 did have that as well.

And sorry for breaking this!

Reviewed-by: Thomas Huth 


Re: [PATCH] of: fix autoloading due to broken modalias with no 'compatible'

2016-06-08 Thread Michael Ellerman
On Mon, 2016-06-06 at 18:48 +0200, Wolfram Sang wrote:
> Because of an improper dereference, a stray 'C' character was output to
> the modalias when no 'compatible' was specified. This is the case for
> some old PowerMac drivers which only set the 'name' property. Fix it to
> let them match again.
> 
> Reported-by: Mathieu Malaterre 
> Signed-off-by: Wolfram Sang 
> Tested-by: Mathieu Malaterre 
> Cc: Philipp Zabel 
> Cc: Andreas Schwab 
> Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for 
> cross compiling")
> ---
> 
> I think it makes sense if this goes in via ppc (with a stable tag added).
> Agreed?

Sure, I've grabbed it.

I added:

Cc: sta...@vger.kernel.org # v3.9+

cheers


Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call

2016-06-08 Thread Thomas Huth
On 08.06.2016 12:44, Michael Ellerman wrote:
> On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
>> On 31/05/16 20:32, Michael Ellerman wrote:
>>> On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
 On 31.05.2016 12:04, Michael Ellerman wrote:
> On Tue, 2016-05-31 at 07:51 +0200, Thomas Huth wrote:
>> If we do not provide the PVR for POWER8NVL, a guest on this
>> system currently ends up in PowerISA 2.06 compatibility mode on
>> KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet.
>> So some new instructions from POWER8 (like "mtvsrd") get disabled
>> for the guest, resulting in crashes when using code compiled
>> explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC).
>>
>> Signed-off-by: Thomas Huth 
>
> So this should say:
>
>   Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")
>
> And therefore:
>
>   Cc: sta...@vger.kernel.org # v4.0+
>
> Am I right?

 Right. (At least for virtualized systems ... for bare-metal systems,
 that original patch was enough). So shall I resubmit my patch with these
 two lines, or could you add them when you pick this patch up?
>>>
>>> Thanks, I'll add them here.
>>
>> Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?
> 
> Yep, patch sent this morning.

Ok, looks like BenH already posted a patch ... anyway, what do you think
about aborting the boot process here in case cores != NR_CPUS, rather
than just printing out a small warning which can easily get lost in the
kernel log?

 Thomas


Re: [PATCH V10 00/28] Add new powerpc specific ELF core notes

2016-06-08 Thread Michael Ellerman
On Mon, 2016-06-06 at 14:27 +0530, Anshuman Khandual wrote:
> On 06/03/2016 03:56 AM, Cyril Bur wrote:
> > 
> > At the moment it is rather confusing, since pt_regs is always the 'live'
> > state and there's a ckpt_regs that is the pt_regs for the checkpointed
> > state. FPU/VMX/VSX is done differently, which really only creates
> > confusion, so I'm changing it to do the same as for pt_regs/ckpt_regs.
> > Ultimately this is part of more work from me
> 
> But that changes the basic semantics on which this ptrace series is written.
> With this change, a significant part of the ptrace series has to be changed.

Yes, that's the whole point.

In fact half of the code should vanish, because the only difference between
copying the live or checkpointed state out to userspace should be which regs
struct you pass to the function.

> It's just an improvement on how we store running and checkpointed values for
> FP/VSX/VMX registers inside the kernel. How does it improve the ptrace
> interface from the user's point of view? If not, then why is this change
> necessary for the acceptance of this patch series?

Because the clean-ups never happen once a series is merged, and I'm left to deal
with it.

cheers


Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()

2016-06-08 Thread Michael Ellerman
On Mon, 2016-06-06 at 16:46 +0200, Peter Zijlstra wrote:
> On Mon, Jun 06, 2016 at 10:17:25PM +1000, Michael Ellerman wrote:
> > On Mon, 2016-06-06 at 13:56 +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 06, 2016 at 09:42:20PM +1000, Michael Ellerman wrote:
> > > 
> > > Why the move to in-line this implementation? It looks like a fairly big
> > > function.
> > 
> > I agree it's not pretty.
> 
> > I'm not beholden to v3 though if you hate it.
> 
> I don't mind; its just that I am in a similar boat with qspinlock and
> chose the other option. So I just figured I'd ask :-)

OK. I'll go with inline and we'll see which version gets "cleaned-up" by a
janitor first ;)

cheers


Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 13:17 +0200, Thomas Huth wrote:
> On 08.06.2016 12:44, Michael Ellerman wrote:
> > On Wed, 2016-06-08 at 11:14 +1000, Balbir Singh wrote:
> > > Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?
> > 
> > Yep, patch sent this morning.
> 
> Ok, looks like BenH already posted a patch ...

And me before him :)

To be clear I'm not blaming you in any way for this, the existing code is
terrible and incredibly fragile.

> anyway, what do you think about aborting the boot process here in case cores
> != NR_CPUS, rather than just printing out a small warning which can easily get
> lost in the kernel log?

Yeah I agree it's easy to miss. And it's not part of dmesg (because it's from
prom_init()), so you *only* see it if you're actually staring at the console as
it boots (which is why my boot tests missed it).

I actually have plans to rewrite the whole thing to make it robust, so that
should avoid it ever being a problem again.

cheers


Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 12:58 +0200, Christian Zigotzky wrote:
> On 08 June 2016 at 04:52 AM, Michael Ellerman wrote:
> > On Tue, 2016-06-07 at 22:17 +0200, Christian Zigotzky wrote:
> > > 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> > > commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> > > Author: Aneesh Kumar K.V 
> > > Date:   Fri Apr 29 23:26:09 2016 +1000
> > > 
> > >  powerpc/mm/radix: Add checks in slice code to catch radix usage
> > > 
> > >  Radix doesn't need slice support. Catch incorrect usage of slice code
> > >  when radix is enabled.
> > > 
> > >  Signed-off-by: Aneesh Kumar K.V 
> > >  Signed-off-by: Michael Ellerman 
> > > 
> > Hmm, I find that hard to believe. But maybe I'm missing something.
> > 
> > Can you checkout Linus' master and then revert that commit?
> > 
> $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> linux-git
> $ git checkout
> Your branch is up-to-date with 'origin/master'.
> 
> $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764 -m 1
> error: Mainline was specified but commit 
> 764041e0f43cc7846f6d8eb246d65b53cc06c764 is not a merge.
> fatal: revert failed
> 
> How can I checkout Linus' master and then revert that commit?

It's not a merge, so just plain git revert:

  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
linux-git
  $ cd linux-git
  $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764
  [master 5dd9737a173e] Revert "powerpc/mm/radix: Add checks in slice code to 
catch radix usage"
   1 file changed, 16 deletions(-)

cheers


Re: [PATCH 8/8] dmaengine: Remove site specific OOM error messages on kzalloc

2016-06-08 Thread Linus Walleij
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin  wrote:

> If kzalloc() fails it will issue its own error message, including
> a dump_stack(). So remove the site specific error messages.
>
> Signed-off-by: Peter Griffin 

Acked-by: Linus Walleij 

A few subsystems may use a cleanup like this...
I wonder how many unnecessary prints I've introduced
myself :P

Yours,
Linus Walleij

Re: [PATCH 3/8] dmaengine: coh901318: Only calculate residue if txstate exists.

2016-06-08 Thread Linus Walleij
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin  wrote:

> There is no point in calculating the residue if there is no
> txstate to store the value.
>
> Signed-off-by: Peter Griffin 

Acked-by: Linus Walleij 

Yours,
Linus Walleij

Re: [PATCH 8/8] dmaengine: Remove site specific OOM error messages on kzalloc

2016-06-08 Thread Jon Hunter

On 07/06/16 18:38, Peter Griffin wrote:
> If kzalloc() fails it will issue its own error message, including
> a dump_stack(). So remove the site specific error messages.
> 
> Signed-off-by: Peter Griffin 
> ---
>  drivers/dma/amba-pl08x.c| 10 +-
>  drivers/dma/bestcomm/bestcomm.c |  2 --
>  drivers/dma/edma.c  | 16 
>  drivers/dma/fsldma.c|  2 --
>  drivers/dma/k3dma.c | 10 --
>  drivers/dma/mmp_tdma.c  |  5 ++---
>  drivers/dma/moxart-dma.c|  4 +---
>  drivers/dma/nbpfaxi.c   |  5 ++---
>  drivers/dma/pl330.c |  5 +
>  drivers/dma/ppc4xx/adma.c   |  2 --
>  drivers/dma/s3c24xx-dma.c   |  5 +
>  drivers/dma/sh/shdmac.c |  9 ++---
>  drivers/dma/sh/sudmac.c |  9 ++---
>  drivers/dma/sirf-dma.c  |  5 ++---
>  drivers/dma/ste_dma40.c |  4 +---
>  drivers/dma/tegra20-apb-dma.c   | 11 +++
>  drivers/dma/timb_dma.c  |  8 ++--
>  17 files changed, 28 insertions(+), 84 deletions(-)

[snip]

> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
> index 7f4af8c..032884f 100644
> --- a/drivers/dma/tegra20-apb-dma.c
> +++ b/drivers/dma/tegra20-apb-dma.c
> @@ -300,10 +300,8 @@ static struct tegra_dma_desc *tegra_dma_desc_get(
>  
>   /* Allocate DMA desc */
>   dma_desc = kzalloc(sizeof(*dma_desc), GFP_NOWAIT);
> - if (!dma_desc) {
> - dev_err(tdc2dev(tdc), "dma_desc alloc failed\n");
> + if (!dma_desc)
>   return NULL;
> - }
>  
>   dma_async_tx_descriptor_init(&dma_desc->txd, &tdc->dma_chan);
>   dma_desc->txd.tx_submit = tegra_dma_tx_submit;
> @@ -340,8 +338,7 @@ static struct tegra_dma_sg_req *tegra_dma_sg_req_get(
>   spin_unlock_irqrestore(&tdc->lock, flags);
>  
>   sg_req = kzalloc(sizeof(struct tegra_dma_sg_req), GFP_NOWAIT);
> - if (!sg_req)
> - dev_err(tdc2dev(tdc), "sg_req alloc failed\n");
> +
>   return sg_req;
>  }
>  
> @@ -1319,10 +1316,8 @@ static int tegra_dma_probe(struct platform_device *pdev)
>  
>   tdma = devm_kzalloc(&pdev->dev, sizeof(*tdma) + cdata->nr_channels *
>   sizeof(struct tegra_dma_channel), GFP_KERNEL);
> - if (!tdma) {
> - dev_err(&pdev->dev, "Error: memory allocation failed\n");
> + if (!tdma)
>   return -ENOMEM;
> - }
>  
>   tdma->dev = &pdev->dev;
>   tdma->chip_data = cdata;

For the tegra portion ...

Acked-by: Jon Hunter 

Cheers
Jon

-- 
nvpublic

Re: [PATCH 7/8] dmaengine: tegra20-apb-dma: Only calculate residue if txstate exists.

2016-06-08 Thread Jon Hunter
Hi Peter,

On 07/06/16 18:38, Peter Griffin wrote:
> There is no point calculating the residue if there is
> no txstate to store the value.
> 
> Signed-off-by: Peter Griffin 
> ---
>  drivers/dma/tegra20-apb-dma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
> index 01e316f..7f4af8c 100644
> --- a/drivers/dma/tegra20-apb-dma.c
> +++ b/drivers/dma/tegra20-apb-dma.c
> @@ -814,7 +814,7 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
>   unsigned int residual;
>  
>   ret = dma_cookie_status(dc, cookie, txstate);
> - if (ret == DMA_COMPLETE)
> + if (ret == DMA_COMPLETE || !txstate)
>   return ret;

Thanks for reporting this. I agree that we should not do this; however,
looking at the code for Tegra, I am wondering if this could change the
actual state that is returned. Looking at dma_cookie_status(), it will
call dma_async_is_complete(), which will return either DMA_COMPLETE or
DMA_IN_PROGRESS. It could be possible that the actual state for the
DMA transfer in the tegra driver is DMA_ERROR, so I am wondering if we
should do something like the following ...

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 01e316f73559..45edab7418d0 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -822,13 +822,8 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
/* Check on wait_ack desc status */
list_for_each_entry(dma_desc, &tdc->free_dma_desc, node) {
if (dma_desc->txd.cookie == cookie) {
-   residual =  dma_desc->bytes_requested -
-   (dma_desc->bytes_transferred %
-   dma_desc->bytes_requested);
-   dma_set_residue(txstate, residual);
ret = dma_desc->dma_status;
-   spin_unlock_irqrestore(&tdc->lock, flags);
-   return ret;
+   goto found;
}
}
 
@@ -836,17 +831,23 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc,
list_for_each_entry(sg_req, &tdc->pending_sg_req, node) {
dma_desc = sg_req->dma_desc;
if (dma_desc->txd.cookie == cookie) {
-   residual =  dma_desc->bytes_requested -
-   (dma_desc->bytes_transferred %
-   dma_desc->bytes_requested);
-   dma_set_residue(txstate, residual);
ret = dma_desc->dma_status;
-   spin_unlock_irqrestore(&tdc->lock, flags);
-   return ret;
+   goto found;
}
}
 
-   dev_dbg(tdc2dev(tdc), "cookie %d does not found\n", cookie);
+   dev_warn(tdc2dev(tdc), "cookie %d not found\n", cookie);
+   spin_unlock_irqrestore(&tdc->lock, flags);
+   return ret;
+
+found:
+   if (txstate) {
+   residual = dma_desc->bytes_requested -
+  (dma_desc->bytes_transferred %
+   dma_desc->bytes_requested);
+   dma_set_residue(txstate, residual);
+   }
+
spin_unlock_irqrestore(&tdc->lock, flags);
return ret;
 }

Cheers
Jon

-- 
nvpublic
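Jon's two concerns (skip the residue calculation entirely when `txstate` is NULL, yet still report the descriptor's own status, possibly DMA_ERROR, rather than what `dma_cookie_status()` guessed) can be sketched outside the kernel with mocked types. This is an illustrative userspace model, not the tegra driver itself:

```c
/* Sketch of the pattern under discussion, with mocked-up types. */
#include <assert.h>
#include <stddef.h>

enum dma_status { DMA_COMPLETE, DMA_IN_PROGRESS, DMA_ERROR };

struct dma_tx_state { unsigned int residue; };

struct desc {
    enum dma_status status;
    unsigned int bytes_requested;
    unsigned int bytes_transferred;
};

static void dma_set_residue(struct dma_tx_state *st, unsigned int residue)
{
    if (st)
        st->residue = residue;
}

enum dma_status tx_status(const struct desc *d, struct dma_tx_state *txstate)
{
    /* Residue only matters if there is somewhere to store it. */
    if (txstate)
        dma_set_residue(txstate, d->bytes_requested -
                        (d->bytes_transferred % d->bytes_requested));
    /* Report the descriptor's real status, which may be DMA_ERROR. */
    return d->status;
}
```

With a NULL `txstate` the residue arithmetic is skipped, and in both cases the caller sees the descriptor's recorded status, which is what Jon's `found:` restructuring achieves.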

[PATCH] drivers/net/fsl_ucc: Do not prefix header guard with CONFIG_

2016-06-08 Thread Andreas Ziegler
The CONFIG_ prefix should only be used for options which
can be configured through Kconfig and not for guarding headers.

Signed-off-by: Andreas Ziegler 
---
 drivers/net/wan/fsl_ucc_hdlc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.h b/drivers/net/wan/fsl_ucc_hdlc.h
index 525786a..881ecde 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.h
+++ b/drivers/net/wan/fsl_ucc_hdlc.h
@@ -8,8 +8,8 @@
  * option) any later version.
  */
 
-#ifndef CONFIG_UCC_HDLC_H
-#define CONFIG_UCC_HDLC_H
+#ifndef _UCC_HDLC_H_
+#define _UCC_HDLC_H_
 
 #include 
 #include 
-- 
1.9.1


[PATCH] powerpc/nvram: remove unused pstore headers

2016-06-08 Thread Geliang Tang
Since the pstore code has moved away from nvram.c, remove unused
pstore headers pstore.h and kmsg_dump.h.

Signed-off-by: Geliang Tang 
---
 arch/powerpc/platforms/pseries/nvram.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/nvram.c b/arch/powerpc/platforms/pseries/nvram.c
index 9f818417..79aef8c 100644
--- a/arch/powerpc/platforms/pseries/nvram.c
+++ b/arch/powerpc/platforms/pseries/nvram.c
@@ -17,8 +17,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
-- 
1.9.1


Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Christian Zigotzky

Hi Michael,

On 08 June 2016 at 04:52 AM, Michael Ellerman wrote:

On Tue, 2016-06-07 at 22:17 +0200, Christian Zigotzky wrote:

764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
Author: Aneesh Kumar K.V
Date:   Fri Apr 29 23:26:09 2016 +1000

  powerpc/mm/radix: Add checks in slice code to catch radix usage

  Radix doesn't need slice support. Catch incorrect usage of slice code
  when radix is enabled.

  Signed-off-by: Aneesh Kumar K.V
  Signed-off-by: Michael Ellerman


Hmm, I find that hard to believe. But maybe I'm missing something.

Can you checkout Linus' master and then revert that commit?

cheers


$ git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git


$ git checkout
Your branch is up-to-date with 'origin/master'.

$ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764 -m 1
error: Mainline was specified but commit 
764041e0f43cc7846f6d8eb246d65b53cc06c764 is not a merge.

fatal: revert failed

How can I checkout Linus' master and then revert that commit?

Cheers,

Christian


Re: [PATCH 5/8] dmaengine: ste_dma40: Only calculate residue if txstate exists.

2016-06-08 Thread Linus Walleij
On Tue, Jun 7, 2016 at 7:38 PM, Peter Griffin  wrote:

> There is no point calculating the residue if there is
> no txstate to store the value.
>
> Signed-off-by: Peter Griffin 

Acked-by: Linus Walleij 

Yours,
Linus Walleij

Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()

2016-06-08 Thread Peter Zijlstra
On Wed, Jun 08, 2016 at 09:20:45PM +1000, Michael Ellerman wrote:
> On Mon, 2016-06-06 at 16:46 +0200, Peter Zijlstra wrote:
> > On Mon, Jun 06, 2016 at 10:17:25PM +1000, Michael Ellerman wrote:
> > > On Mon, 2016-06-06 at 13:56 +0200, Peter Zijlstra wrote:
> > > > On Mon, Jun 06, 2016 at 09:42:20PM +1000, Michael Ellerman wrote:
> > > > 
> > > > Why the move to in-line this implementation? It looks like a fairly big
> > > > function.
> > > 
> > > I agree it's not pretty.
> > 
> > > I'm not beholden to v3 though if you hate it.
> > 
> > I don't mind; its just that I am in a similar boat with qspinlock and
> > chose the other option. So I just figured I'd ask :-)
> 
> OK. I'll go with inline and we'll see which version gets "cleaned-up" by a
> janitor first ;)

Ok; what tree does this go in? I have this dependent series which I'd
like to get sorted and merged somewhere.

Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Christian Zigotzky

Hi Michael,

Thanks a lot for the hint. I compiled it without the commit below but 
unfortunately it doesn't boot.


Cheers,

Christian

On 08 June 2016 at 1:30 PM, Michael Ellerman wrote:


It's not a merge, so just plain git revert:

   $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
linux-git
   $ cd linux-git
   $ git revert 764041e0f43cc7846f6d8eb246d65b53cc06c764
   [master 5dd9737a173e] Revert "powerpc/mm/radix: Add checks in slice code to catch 
radix usage"
1 file changed, 16 deletions(-)

cheers





[PATCH] powerpc/nohash: Fix build break with 4K pages

2016-06-08 Thread Michael Ellerman
Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are
allocating fragments" renamed page_table_free() to pte_fragment_free().
One occurrence was mistyped as pte_fragment_fre().

This only breaks the nohash 4K page build, which is not the default or
enabled in any defconfig.

Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are allocating fragments")
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index 0c12a3bfe2ab..069369f6414b 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -172,7 +172,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
 
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
-   pte_fragment_fre((unsigned long *)pte, 1);
+   pte_fragment_free((unsigned long *)pte, 1);
 }
 
 static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-- 
2.5.0


Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Christian Zigotzky

Hi Darren,

Many thanks for your help. I started my bisect with the following commits:

git bisect start

git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e

git bisect bad 1a695a905c18548062509178b98bc91e67510864 (Linux 4.7-rc1)

Did you start your bisect with the same bad and good commit?

I will revert your bad commit and compile a new test kernel.

Thanks,

Christian

On 08 June 2016 at 1:33 PM, Darren Stevens wrote:

Hello Christian

That's not where I ended up with my bisect, this commit is about 10 before the
one I found to be bad, which is:

commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:21 2016 +1000

 powerpc/mm: vmalloc abstraction in preparation for radix
 
 The vmalloc range differs between hash and radix config. Hence make

 VMALLOC_START and related constants a variable which will be runtime
 initialized depending on whether hash or radix mode is active.
 
 Signed-off-by: Aneesh Kumar K.V 

 [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
 Signed-off-by: Michael Ellerman 

Not sure how we are getting different results though. I have attached my
bisect log and the suspect commit, which is quite large. I'm not sure which
part of it is at fault. I have some jobs to do now, but hope to get testing
this later today.

Regards
Darren




Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Darren Stevens
Hello Christian

On 07/06/2016, Christian Zigotzky wrote:
> "range.size, pgprot_val(pgprot_noncached(__pgprot(0;" isn't the 
> problem. :-) It works.
>
> 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> Author: Aneesh Kumar K.V 
> Date:   Fri Apr 29 23:26:09 2016 +1000
>
>  powerpc/mm/radix: Add checks in slice code to catch radix usage
>
>  Radix doesn't need slice support. Catch incorrect usage of slice
>  code when radix is enabled.
>
>  Signed-off-by: Aneesh Kumar K.V 
>  Signed-off-by: Michael Ellerman 
>

That's not where I ended up with my bisect, this commit is about 10 before the
one I found to be bad, which is:

commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:21 2016 +1000

powerpc/mm: vmalloc abstraction in preparation for radix

The vmalloc range differs between hash and radix config. Hence make
VMALLOC_START and related constants a variable which will be runtime
initialized depending on whether hash or radix mode is active.

Signed-off-by: Aneesh Kumar K.V 
[mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
Signed-off-by: Michael Ellerman 

Not sure how we are getting different results though. I have attached my
bisect log and the suspect commit, which is quite large. I'm not sure which
part of it is at fault. I have some jobs to do now, but hope to get testing
this later today.

Regards
Darren
git bisect start
# bad: [ed2608faa0f701b1dbc65277a9e5c7ff7118bfd4] Merge branch 'for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect bad ed2608faa0f701b1dbc65277a9e5c7ff7118bfd4
# good: [8ffb4103f5e28d7e7890ed4774d8e009f253f56e] IB/qib: Use cache inhibitted 
and guarded mapping on powerpc
git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e
# good: [801faf0db8947e01877920e848a4d338dd7a99e7] mm/slab: lockless decision 
to grow cache
git bisect good 801faf0db8947e01877920e848a4d338dd7a99e7
# bad: [2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac] Merge tag 'staging-4.7-rc1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac
# bad: [be1332c0994fbf016fa4ef0f0c4acda566fe6cb3] Merge tag 'gfs2-4.7.fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
git bisect bad be1332c0994fbf016fa4ef0f0c4acda566fe6cb3
# good: [f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752] Merge tag 'sound-4.7-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752
# good: [a1c28b75a95808161cacbb3531c418abe248994e] Merge branch 'for-linus' of 
git://git.armlinux.org.uk/~rmk/linux-arm
git bisect good a1c28b75a95808161cacbb3531c418abe248994e
# bad: [6eb59af580dcffc6f6982ac8ef6d27a1a5f26b27] Merge tag 'mfd-for-linus-4.7' 
of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
git bisect bad 6eb59af580dcffc6f6982ac8ef6d27a1a5f26b27
# bad: [4fad494321351f0ac412945c6a464109ad96734a] powerpc/powernv: Simplify 
pnv_eeh_reset()
git bisect bad 4fad494321351f0ac412945c6a464109ad96734a
# bad: [43a5c684270ee9b5b13c91ec048831dd5b7e0cdc] powerpc/mm/radix: Make sure 
swapper pgdir is properly aligned
git bisect bad 43a5c684270ee9b5b13c91ec048831dd5b7e0cdc
# good: [a9252aaefe7e72133e7a37e0eff4e950a4f33af1] powerpc/mm: Move hugetlb and 
THP related pmd accessors to pgtable.h
git bisect good a9252aaefe7e72133e7a37e0eff4e950a4f33af1
# good: [177ba7c647f37bc3f31667192059ee794347d79d] powerpc/mm/radix: Limit paca 
allocation in radix
git bisect good 177ba7c647f37bc3f31667192059ee794347d79d
# good: [934828edfadc43be07e53429ce501741bedf4a5e] powerpc/mm: Make 4K and 64K 
use pte_t for pgtable_t
git bisect good 934828edfadc43be07e53429ce501741bedf4a5e
# bad: [a3dece6d69b0ad21b64104dff508c67a1a1f14dd] powerpc/radix: Update MMU 
cache
git bisect bad a3dece6d69b0ad21b64104dff508c67a1a1f14dd
# good: [4dfb88ca9b66690d21030ccacc1cca73db90655e] powerpc/mm: Update pte 
filter for radix
git bisect good 4dfb88ca9b66690d21030ccacc1cca73db90655e
# bad: [d6a9996e84ac4beb7713e9485f4563e100a9b03e] powerpc/mm: vmalloc 
abstraction in preparation for radix
git bisect bad d6a9996e84ac4beb7713e9485f4563e100a9b03e
commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:21 2016 +1000

powerpc/mm: vmalloc abstraction in preparation for radix

The vmalloc range differs between hash and radix config. Hence make
VMALLOC_START and related constants a variable which will be runtime
initialized depending on whether hash or radix mode is active.

Signed-off-by: Aneesh Kumar K.V 
[mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
Signed-off-by: Michael Ellerman 

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index cd3e915..f61cad3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash

Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 14:35 +0200, Peter Zijlstra wrote:
> On Wed, Jun 08, 2016 at 09:20:45PM +1000, Michael Ellerman wrote:
> > On Mon, 2016-06-06 at 16:46 +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 06, 2016 at 10:17:25PM +1000, Michael Ellerman wrote:
> > > > On Mon, 2016-06-06 at 13:56 +0200, Peter Zijlstra wrote:
> > > > > On Mon, Jun 06, 2016 at 09:42:20PM +1000, Michael Ellerman wrote:
> > > > > 
> > > > > Why the move to in-line this implementation? It looks like a fairly 
> > > > > big
> > > > > function.
> > > > 
> > > > I agree it's not pretty.
> > > 
> > > > I'm not beholden to v3 though if you hate it.
> > > 
> > > I don't mind; its just that I am in a similar boat with qspinlock and
> > > chose the other option. So I just figured I'd ask :-)
> > 
> > OK. I'll go with inline and we'll see which version gets "cleaned-up" by a
> > janitor first ;)
> 
> Ok; what tree does this go in? I have this dependent series which I'd
> like to get sorted and merged somewhere.

Ah sorry, I didn't realise. I was going to put it in my next (which doesn't
exist yet but hopefully will early next week).

I'll make a topic branch with just that commit based on rc2 or rc3?

cheers


Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Christian Zigotzky

Hi All,

I tried to revert this commit but unfortunately it doesn't work:

git revert d6a9996e84ac4beb7713e9485f4563e100a9b03e

error: could not revert d6a9996... powerpc/mm: vmalloc abstraction in 
preparation for radix

hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'


Any hints?

Thanks,

Christian

On 08 June 2016 at 1:33 PM, Darren Stevens wrote:

Hello Christian

That's not where I ended up with my bisect, this commit is about 10 before the
one I found to be bad, which is:

commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:21 2016 +1000

 powerpc/mm: vmalloc abstraction in preparation for radix
 
 The vmalloc range differs between hash and radix config. Hence make

 VMALLOC_START and related constants a variable which will be runtime
 initialized depending on whether hash or radix mode is active.
 
 Signed-off-by: Aneesh Kumar K.V 

 [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
 Signed-off-by: Michael Ellerman 

Not sure how we are getting different results though. I have attached my
bisect log and the suspect commit, which is quite large. I'm not sure which
part of it is at fault. I have some jobs to do now, but hope to get testing
this later today.

Regards
Darren




Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 12:33 +0100, Darren Stevens wrote:
> On 07/06/2016, Christian Zigotzky wrote:
> > 
> > 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> > commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> > Author: Aneesh Kumar K.V 
> > Date:   Fri Apr 29 23:26:09 2016 +1000
> > 
> >  powerpc/mm/radix: Add checks in slice code to catch radix usage
> > 
> 
> That's not where I ended up with my bisect, this commit is about 10 before the
> one I found to be bad, which is:
> 
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V 
> Date:   Fri Apr 29 23:26:21 2016 +1000
> 
> powerpc/mm: vmalloc abstraction in preparation for radix
> 
> The vmalloc range differs between hash and radix config. Hence make
> VMALLOC_START and related constants a variable which will be runtime
> initialized depending on whether hash or radix mode is active.
> 
> Signed-off-by: Aneesh Kumar K.V 
> [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
> Signed-off-by: Michael Ellerman 
> 
> Not sure how we are getting different results though. I have attached my
> bisect log and the suspect commit, which is quite large. I'm not sure which
> part of it is at fault. I have some jobs to do now, but hope to get testing
> this later today.

That one is more likely to be the problem, though I don't see anything glaringly
wrong with it.

Does your patch use any of the constants that are changed in that file? They now
aren't constants, they're initialised at boot, so if you use them too early
you'll get junk.

cheers
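The hazard mpe is describing (a former compile-time constant becoming a variable that is only valid after early MMU setup) can be sketched in a small userspace model. The function name and the addresses below are illustrative assumptions, not the kernel's actual values:

```c
/* Sketch: a "constant" that is now initialised at boot time. Any code
 * that reads it before the init function has run sees an uninitialised
 * value -- the "junk" mpe mentions. Names and values are hypothetical. */
#include <assert.h>
#include <stdint.h>

/* was: #define VMALLOC_START 0xd000000000000000UL (a true constant) */
static uint64_t vmalloc_start;   /* invalid until early_init_mmu() runs */

/* Runs once at boot, picking the layout for hash vs. radix MMU mode. */
void early_init_mmu(int radix_enabled)
{
    vmalloc_start = radix_enabled ? 0xc008000000000000ULL
                                  : 0xd000000000000000ULL;
}

uint64_t read_vmalloc_start(void)
{
    return vmalloc_start;
}
```

Code that previously relied on `VMALLOC_START` being a constant now has an ordering dependency on the init routine, which is exactly the class of bug that would show up as an early-boot hang like Christian's and Darren's.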


Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()

2016-06-08 Thread Peter Zijlstra
On Wed, Jun 08, 2016 at 11:49:20PM +1000, Michael Ellerman wrote:

> > Ok; what tree does this go in? I have this dependent series which I'd
> > like to get sorted and merged somewhere.
> 
> Ah sorry, I didn't realise. I was going to put it in my next (which doesn't
> exist yet but hopefully will early next week).
> 
> I'll make a topic branch with just that commit based on rc2 or rc3?

Works for me; thanks!

[PATCH V2 01/10] Fix .long's in mm/tlb-radix.c to use more meaningful names

2016-06-08 Thread Aneesh Kumar K.V
From: Balbir Singh 

The .longs with the shifts are harder to read; use more
meaningful names for the opcodes instead. PPC_TLBIE_5 is introduced
for the five-operand variation of the instruction because an opcode
macro already exists for the two-operand variant.

Signed-off-by: Balbir Singh 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/ppc-opcode.h | 14 ++
 arch/powerpc/mm/tlb-radix.c   | 13 +
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 1d035c1cc889..c0e9ea44fee3 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -184,6 +184,7 @@
 #define PPC_INST_STSWX 0x7c00052a
 #define PPC_INST_STXVD2X   0x7c000798
 #define PPC_INST_TLBIE 0x7c000264
+#define PPC_INST_TLBIEL0x7c000224
 #define PPC_INST_TLBILX0x7c24
 #define PPC_INST_WAIT  0x7c7c
 #define PPC_INST_TLBIVAX   0x7c000624
@@ -257,6 +258,9 @@
 #define ___PPC_RB(b)   (((b) & 0x1f) << 11)
 #define ___PPC_RS(s)   (((s) & 0x1f) << 21)
 #define ___PPC_RT(t)   ___PPC_RS(t)
+#define ___PPC_R(r)(((r) & 0x1) << 16)
+#define ___PPC_PRS(prs)(((prs) & 0x1) << 17)
+#define ___PPC_RIC(ric)(((ric) & 0x3) << 18)
 #define __PPC_RA(a)___PPC_RA(__REG_##a)
 #define __PPC_RA0(a)   ___PPC_RA(__REGA0_##a)
 #define __PPC_RB(b)___PPC_RB(__REG_##b)
@@ -321,6 +325,16 @@
__PPC_WC(w))
 #define PPC_TLBIE(lp,a)stringify_in_c(.long PPC_INST_TLBIE | \
   ___PPC_RB(a) | ___PPC_RS(lp))
+#define PPC_TLBIE_5(rb,rs,ric,prs,r) \
+   stringify_in_c(.long PPC_INST_TLBIE | \
+   ___PPC_RB(rb) | ___PPC_RS(rs) | \
+   ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
+   ___PPC_R(r))
+#define PPC_TLBIEL(rb,rs,ric,prs,r) \
+   stringify_in_c(.long PPC_INST_TLBIEL | \
+   ___PPC_RB(rb) | ___PPC_RS(rs) | \
+   ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
+   ___PPC_R(r))
 #define PPC_TLBSRX_DOT(a,b)stringify_in_c(.long PPC_INST_TLBSRX_DOT | \
__PPC_RA0(a) | __PPC_RB(b))
 #define PPC_TLBIVAX(a,b)   stringify_in_c(.long PPC_INST_TLBIVAX | \
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 0fdaf93a3e09..e6b7487ad28f 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,8 +31,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set)
ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
 : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
"memory");
asm volatile("ptesync": : :"memory");
 }
@@ -60,8 +60,7 @@ static inline void _tlbie_pid(unsigned long pid)
ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
 : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
"memory");
asm volatile("eieio; tlbsync; ptesync": : :"memory");
 }
@@ -79,8 +78,7 @@ static inline void _tlbiel_va(unsigned long va, unsigned long 
pid,
ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
 : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
"memory");
asm volatile("ptesync": : :"memory");
 }
@@ -98,8 +96,7 @@ static inline void _tlbie_va(unsigned long va, unsigned long 
pid,
ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
 : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
"memory");
asm volatile("eieio; tlbsync; ptesync": : :"memory");
 }
-- 
2.7.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH V2 00/10] Fixes for Radix support

2016-06-08 Thread Aneesh Kumar K.V
Hi Michael,

This series includes patches I had posted before. I collected them in a series
and marked the series V2. This addresses the review feedback I received from
the last post.


Aneesh Kumar K.V (9):
  powerpc/mm/radix: Update to tlb functions ric argument
  powerpc/mm/radix: Flush page walk cache when freeing page table
  powerpc/mm/radix: Update LPCR HR bit as per ISA
  powerpc/mm: use _raw variant of page table accessors
  powerpc/mm: Compile out radix related functions if RADIX_MMU is
disabled
  powerpc/hash: Use the correct ppp mask when updating hpte
  powerpc/mm: Clear top 16 bits of va only on older cpus
  powerpc/mm: Print information regarding the MMU mode
  powerpc/mm/hash: Update SDR1 size encoding as documented in ISA 3.0

Balbir Singh (1):
  Fix .long's in mm/tlb-radix.c to be more meaningful macros

 arch/powerpc/include/asm/book3s/32/pgalloc.h   |  1 -
 arch/powerpc/include/asm/book3s/64/mmu-hash.h  |  1 +
 arch/powerpc/include/asm/book3s/64/mmu.h   |  5 ++
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 16 +++-
 arch/powerpc/include/asm/book3s/64/pgtable-4k.h|  6 +-
 arch/powerpc/include/asm/book3s/64/pgtable-64k.h   |  6 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h   | 99 +++---
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  3 +
 arch/powerpc/include/asm/book3s/64/tlbflush.h  | 15 +++-
 arch/powerpc/include/asm/book3s/pgalloc.h  |  5 --
 arch/powerpc/include/asm/mmu.h | 12 ++-
 arch/powerpc/include/asm/pgtable-be-types.h| 15 
 arch/powerpc/include/asm/ppc-opcode.h  | 14 +++
 arch/powerpc/include/asm/reg.h |  1 +
 arch/powerpc/mm/hash_native_64.c   | 14 +--
 arch/powerpc/mm/hash_utils_64.c| 12 +--
 arch/powerpc/mm/pgtable-radix.c|  7 +-
 arch/powerpc/mm/tlb-radix.c| 94 +---
 18 files changed, 236 insertions(+), 90 deletions(-)

-- 
2.7.4


[PATCH V2 02/10] powerpc/mm/radix: Update to tlb functions ric argument

2016-06-08 Thread Aneesh Kumar K.V
Radix invalidate control (RIC) is used to control which caches to flush
with the tlb instructions. When doing a PID flush, we currently flush
everything, including the page walk cache. For an address range flush, we
flush only the TLB. In a later patch, we add support for flushing only the
page walk cache.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/tlb-radix.c | 43 ++-
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index e6b7487ad28f..b33b7c77cfa3 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -19,16 +19,20 @@
 
 static DEFINE_RAW_SPINLOCK(native_tlbie_lock);
 
-static inline void __tlbiel_pid(unsigned long pid, int set)
+#define RIC_FLUSH_TLB 0
+#define RIC_FLUSH_PWC 1
+#define RIC_FLUSH_ALL 2
+
+static inline void __tlbiel_pid(unsigned long pid, int set,
+   unsigned long ric)
 {
-   unsigned long rb,rs,ric,prs,r;
+   unsigned long rb,rs,prs,r;
 
rb = PPC_BIT(53); /* IS = 1 */
rb |= set << PPC_BITLSHIFT(51);
rs = ((unsigned long)pid) << PPC_BITLSHIFT(31);
prs = 1; /* process scoped */
r = 1;   /* raidx format */
-   ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
@@ -39,25 +43,24 @@ static inline void __tlbiel_pid(unsigned long pid, int set)
 /*
  * We use 128 set in radix mode and 256 set in hpt mode.
  */
-static inline void _tlbiel_pid(unsigned long pid)
+static inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
 {
int set;
 
for (set = 0; set < POWER9_TLB_SETS_RADIX ; set++) {
-   __tlbiel_pid(pid, set);
+   __tlbiel_pid(pid, set, ric);
}
return;
 }
 
-static inline void _tlbie_pid(unsigned long pid)
+static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
 {
-   unsigned long rb,rs,ric,prs,r;
+   unsigned long rb,rs,prs,r;
 
rb = PPC_BIT(53); /* IS = 1 */
rs = pid << PPC_BITLSHIFT(31);
prs = 1; /* process scoped */
r = 1;   /* raidx format */
-   ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
@@ -66,16 +69,15 @@ static inline void _tlbie_pid(unsigned long pid)
 }
 
 static inline void _tlbiel_va(unsigned long va, unsigned long pid,
- unsigned long ap)
+ unsigned long ap, unsigned long ric)
 {
-   unsigned long rb,rs,ric,prs,r;
+   unsigned long rb,rs,prs,r;
 
rb = va & ~(PPC_BITMASK(52, 63));
rb |= ap << PPC_BITLSHIFT(58);
rs = pid << PPC_BITLSHIFT(31);
prs = 1; /* process scoped */
r = 1;   /* raidx format */
-   ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
@@ -84,16 +86,15 @@ static inline void _tlbiel_va(unsigned long va, unsigned 
long pid,
 }
 
 static inline void _tlbie_va(unsigned long va, unsigned long pid,
-unsigned long ap)
+unsigned long ap, unsigned long ric)
 {
-   unsigned long rb,rs,ric,prs,r;
+   unsigned long rb,rs,prs,r;
 
rb = va & ~(PPC_BITMASK(52, 63));
rb |= ap << PPC_BITLSHIFT(58);
rs = pid << PPC_BITLSHIFT(31);
prs = 1; /* process scoped */
r = 1;   /* raidx format */
-   ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
@@ -119,7 +120,7 @@ void radix__local_flush_tlb_mm(struct mm_struct *mm)
preempt_disable();
pid = mm->context.id;
if (pid != MMU_NO_CONTEXT)
-   _tlbiel_pid(pid);
+   _tlbiel_pid(pid, RIC_FLUSH_ALL);
preempt_enable();
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_mm);
@@ -132,7 +133,7 @@ void radix___local_flush_tlb_page(struct mm_struct *mm, 
unsigned long vmaddr,
preempt_disable();
pid = mm ? mm->context.id : 0;
if (pid != MMU_NO_CONTEXT)
-   _tlbiel_va(vmaddr, pid, ap);
+   _tlbiel_va(vmaddr, pid, ap, RIC_FLUSH_TLB);
preempt_enable();
 }
 
@@ -169,11 +170,11 @@ void radix__flush_tlb_mm(struct mm_struct *mm)
 
if (lock_tlbie)
raw_spin_lock(&native_tlbie_lock);
-   _tlbie_pid(pid);
+   _tlbie_pid(pid, RIC_FLUSH_ALL);
if (lock_tlbie)
raw_spin_unlock(&native_tlbie_lock);
} else
-   _tlbiel_pid(pid);
+   _tlbiel_pid(pid, RIC_FLUSH_ALL);
 no_context:
preempt_enable();
 }
@@ -193,11 +194,11 @@ void radix___flush_tlb_page(struct mm_struct *mm, 
unsigned long vmaddr,
 
if (lock_tlbie)
  

[PATCH V2 03/10] powerpc/mm/radix: Flush page walk cache when freeing page table

2016-06-08 Thread Aneesh Kumar K.V
Even though tlb_flush() does a flush that invalidates all caches,
we can end up doing an RCU page table free before calling tlb_flush().
That means we can have page walk cache entries even after we free the
page table pages. This can result in a wrong page table walk.

Avoid this by doing a PWC flush on every page table free. We can't batch
the PWC flush, because the RCU callback function where we free the
page table pages doesn't have the mmu gather information. Thus we
have to do a PWC flush for every page table page freed.

Note: I also removed the dummy tlb_flush_pgtable call functions for
hash 32.
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgalloc.h   |  1 -
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 16 -
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  3 ++
 arch/powerpc/include/asm/book3s/64/tlbflush.h  | 15 -
 arch/powerpc/include/asm/book3s/pgalloc.h  |  5 ---
 arch/powerpc/mm/tlb-radix.c| 38 ++
 6 files changed, 70 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h 
b/arch/powerpc/include/asm/book3s/32/pgalloc.h
index a2350194fc76..8e21bb492dca 100644
--- a/arch/powerpc/include/asm/book3s/32/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
@@ -102,7 +102,6 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
  unsigned long address)
 {
-   tlb_flush_pgtable(tlb, address);
pgtable_page_dtor(table);
pgtable_free_tlb(tlb, page_address(table), 0);
 }
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 488279edb1f0..26eb2cb80c4e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -110,6 +110,11 @@ static inline void pud_populate(struct mm_struct *mm, 
pud_t *pud, pmd_t *pmd)
 static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
   unsigned long address)
 {
+   /*
+* By now all the pud entries should be none entries. So go
+* ahead and flush the page walk cache
+*/
+   flush_tlb_pgtable(tlb, address);
 pgtable_free_tlb(tlb, pud, PUD_INDEX_SIZE);
 }
 
@@ -127,6 +132,11 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t 
*pmd)
 static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
   unsigned long address)
 {
+   /*
+* By now all the pmd entries should be none entries. So go
+* ahead and flush the page walk cache
+*/
+   flush_tlb_pgtable(tlb, address);
 return pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX);
 }
 
@@ -198,7 +208,11 @@ static inline void pte_free(struct mm_struct *mm, 
pgtable_t ptepage)
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
  unsigned long address)
 {
-   tlb_flush_pgtable(tlb, address);
+   /*
+* By now all the pte entries should be none entries. So go
+* ahead and flush the page walk cache
+*/
+   flush_tlb_pgtable(tlb, address);
pgtable_free_tlb(tlb, table, 0);
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 13ef38828dfe..3fa94fcac628 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -18,16 +18,19 @@ extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned 
long vmaddr);
 extern void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long 
vmaddr,
unsigned long ap, int nid);
+extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long 
addr);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr);
 extern void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
  unsigned long ap, int nid);
+extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr);
 #else
 #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr)
radix__local_flush_tlb_page(vma,addr)
 #define radix___flush_tlb_page(mm,addr,p,i)
radix___local_flush_tlb_page(mm,addr,p,i)
+#define radix__flush_tlb_pwc(tlb, addr)radix__local_flush_tlb_pwc(tlb, 
addr)
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index d98424ae356c..541cf809e38e 100644

[PATCH V2 04/10] powerpc/mm/radix: Update LPCR HR bit as per ISA

2016-06-08 Thread Aneesh Kumar K.V
PowerISA 3.0 requires the MMU mode (radix vs. hash) of the hypervisor
to be mirrored in the LPCR register, in addition to the partition table.
This is done to avoid fetching from the table when deciding, among other
things, how to perform transitions to HV mode on some interrupts.
So let's set it up appropriately.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/reg.h  | 1 +
 arch/powerpc/mm/pgtable-radix.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f40bc7b..466816ede138 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -348,6 +348,7 @@
 #define   LPCR_RMI 0x0002  /* real mode is cache inhibit */
 #define   LPCR_HDICE   0x0001  /* Hyp Decr enable (HV,PR,EE) */
 #define   LPCR_UPRT0x0040  /* Use Process Table (ISA 3) */
+#define  LPCR_HR  0x0010
 #ifndef SPRN_LPID
 #define SPRN_LPID  0x13F   /* Logical Partition Identifier */
 #endif
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index c939e6e57a9e..73aa402047ef 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -340,7 +340,7 @@ void __init radix__early_init_mmu(void)
radix_init_page_sizes();
if (!firmware_has_feature(FW_FEATURE_LPAR)) {
lpcr = mfspr(SPRN_LPCR);
-   mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
+   mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
radix_init_partition_table();
}
 
@@ -355,7 +355,7 @@ void radix__early_init_mmu_secondary(void)
 */
if (!firmware_has_feature(FW_FEATURE_LPAR)) {
lpcr = mfspr(SPRN_LPCR);
-   mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
+   mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
 
mtspr(SPRN_PTCR,
  __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
-- 
2.7.4


[PATCH V2 05/10] powerpc/mm: use _raw variant of page table accessors

2016-06-08 Thread Aneesh Kumar K.V
This switches a few of the page table accessors to use the __raw variant
and does the CPU to big-endian conversion on the constants instead. This
helps in generating better code.

For example, a pgd_none(pgd) check with and without the fix is listed below

Without fix:

   2240:20 00 61 eb ld  r27,32(r1)
/* PGD level */
typedef struct { __be64 pgd; } pgd_t;
static inline unsigned long pgd_val(pgd_t x)
{
return be64_to_cpu(x.pgd);

2244:   22 00 66 78 rldicl  r6,r3,32,32
2248:   3e 40 7d 54 rotlwi  r29,r3,8
224c:   0e c0 7d 50 rlwimi  r29,r3,24,0,7
2250:   3e 40 c5 54 rotlwi  r5,r6,8
2254:   2e c4 7d 50 rlwimi  r29,r3,24,16,23
2258:   0e c0 c5 50 rlwimi  r5,r6,24,0,7
225c:   2e c4 c5 50 rlwimi  r5,r6,24,16,23
2260:   c6 07 bd 7b rldicr  r29,r29,32,31
2264:   78 2b bd 7f or  r29,r29,r5
if (pgd_none(pgd))
2268:   00 00 bd 2f cmpdi   cr7,r29,0
226c:   54 03 9e 41 beq cr7,25c0 <__get_user_pages_fast+0x500>

With fix:
-
2370:   20 00 61 eb ld  r27,32(r1)
if (pgd_none(pgd))
2374:   00 00 bd 2f cmpdi   cr7,r29,0
2378:   a8 03 9e 41 beq cr7,2720 <__get_user_pages_fast+0x530>
break;
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable-4k.h  |  6 +-
 arch/powerpc/include/asm/book3s/64/pgtable-64k.h |  6 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 99 +---
 arch/powerpc/include/asm/pgtable-be-types.h  | 15 
 4 files changed, 91 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h 
b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
index 71e9abced493..9db83b4e017d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
@@ -11,7 +11,7 @@ static inline int pmd_huge(pmd_t pmd)
 * leaf pte for huge page
 */
if (radix_enabled())
-   return !!(pmd_val(pmd) & _PAGE_PTE);
+   return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
return 0;
 }
 
@@ -21,7 +21,7 @@ static inline int pud_huge(pud_t pud)
 * leaf pte for huge page
 */
if (radix_enabled())
-   return !!(pud_val(pud) & _PAGE_PTE);
+   return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
return 0;
 }
 
@@ -31,7 +31,7 @@ static inline int pgd_huge(pgd_t pgd)
 * leaf pte for huge page
 */
if (radix_enabled())
-   return !!(pgd_val(pgd) & _PAGE_PTE);
+   return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
return 0;
 }
 #define pgd_huge pgd_huge
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h 
b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
index cb2d0a5fa3f8..0d2845b44763 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
@@ -15,7 +15,7 @@ static inline int pmd_huge(pmd_t pmd)
/*
 * leaf pte for huge page
 */
-   return !!(pmd_val(pmd) & _PAGE_PTE);
+   return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
 }
 
 static inline int pud_huge(pud_t pud)
@@ -23,7 +23,7 @@ static inline int pud_huge(pud_t pud)
/*
 * leaf pte for huge page
 */
-   return !!(pud_val(pud) & _PAGE_PTE);
+   return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
 }
 
 static inline int pgd_huge(pgd_t pgd)
@@ -31,7 +31,7 @@ static inline int pgd_huge(pgd_t pgd)
/*
 * leaf pte for huge page
 */
-   return !!(pgd_val(pgd) & _PAGE_PTE);
+   return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
 }
 #define pgd_huge pgd_huge
 
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 88a5ecaa157b..d3ab97e3c744 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -317,7 +317,7 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 {
unsigned long old;
 
-   if ((pte_val(*ptep) & (_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 0)
+   if ((pte_raw(*ptep) & cpu_to_be64(_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 
0)
return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
@@ -335,8 +335,7 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
 {
-
-   if ((pte_val(*ptep) & _PAGE_WRITE) == 0)
+   if ((pte_raw(*ptep) & cpu_to_be64(_PAGE_WRITE)) == 0)
return;
 
pte_update(mm, addr, ptep, _PAGE_WRITE, 0, 0);
@@ -345,7 +344,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, 
unsigned long

[PATCH V2 06/10] powerpc/mm: Compile out radix related functions if RADIX_MMU is disabled

2016-06-08 Thread Aneesh Kumar K.V
Currently we depend on mmu_has_feature to evaluate to zero based on the
MMU_FTRS_POSSIBLE mask. In a later patch, we want to update
radix_enabled() to runtime-patch the conditional operation into a jump
instruction. This implies we cannot depend on the MMU_FTRS_POSSIBLE mask.
Instead, define radix_enabled() to return 0 if RADIX_MMU is not enabled.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/mmu.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 5854263d4d6e..d4eda6420523 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -23,7 +23,12 @@ struct mmu_psize_def {
 };
 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
+#ifdef CONFIG_PPC_RADIX_MMU
 #define radix_enabled() mmu_has_feature(MMU_FTR_RADIX)
+#else
+#define radix_enabled() (0)
+#endif
+
 
 #endif /* __ASSEMBLY__ */
 
-- 
2.7.4


[PATCH V2 07/10] powerpc/hash: Use the correct ppp mask when updating hpte

2016-06-08 Thread Aneesh Kumar K.V
With commit e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO") we
now use all three PPP bits. The top bit is used to form the
PPP value 0b110, which is mapped to kernel read-only. When
updating the hpte entry, use the right mask so that we update the
63rd bit (the top 'P' bit) too.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 +
 arch/powerpc/mm/hash_native_64.c  | 8 
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 290157e8d5b2..74839f24f412 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -88,6 +88,7 @@
 #define HPTE_R_RPN_SHIFT   12
 #define HPTE_R_RPN ASM_CONST(0x0000)
 #define HPTE_R_PP  ASM_CONST(0x0003)
+#define HPTE_R_PPP ASM_CONST(0x8003)
 #define HPTE_R_N   ASM_CONST(0x0004)
 #define HPTE_R_G   ASM_CONST(0x0008)
 #define HPTE_R_M   ASM_CONST(0x0010)
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index d873f6507f72..e37916cbc18d 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -316,8 +316,8 @@ static long native_hpte_updatepp(unsigned long slot, 
unsigned long newpp,
DBG_LOW(" -> hit\n");
/* Update the HPTE */
hptep->r = cpu_to_be64((be64_to_cpu(hptep->r) &
-   ~(HPTE_R_PP | HPTE_R_N)) |
-  (newpp & (HPTE_R_PP | HPTE_R_N |
+   ~(HPTE_R_PPP | HPTE_R_N)) |
+  (newpp & (HPTE_R_PPP | HPTE_R_N |
 HPTE_R_C)));
}
native_unlock_hpte(hptep);
@@ -385,8 +385,8 @@ static void native_hpte_updateboltedpp(unsigned long newpp, 
unsigned long ea,
 
/* Update the HPTE */
hptep->r = cpu_to_be64((be64_to_cpu(hptep->r) &
-   ~(HPTE_R_PP | HPTE_R_N)) |
-   (newpp & (HPTE_R_PP | HPTE_R_N)));
+   ~(HPTE_R_PPP | HPTE_R_N)) |
+  (newpp & (HPTE_R_PPP | HPTE_R_N)));
/*
 * Ensure it is out of the tlb too. Bolted entries base and
 * actual page size will be same.
-- 
2.7.4


[PATCH V2 08/10] powerpc/mm: Clear top 16 bits of va only on older cpus

2016-06-08 Thread Aneesh Kumar K.V
As per the ISA, we need to do this only for architecture version 2.02 and
earlier. This continued to work even on 2.07, but let's not do it for
anything after 2.02.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/mmu.h   | 12 +---
 arch/powerpc/mm/hash_native_64.c |  6 --
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index e53ebebff474..616575fcbcc7 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -24,6 +24,11 @@
 /*
  * This is individual features
  */
+/*
+ * We need to clear the top 16 bits of va (from the remaining 64 bits) in
+ * tlbie* instructions
+ */
+#define MMU_FTR_TLBIE_CROP_VA  ASM_CONST(0x8000)
 
 /* Enable use of high BAT registers */
 #define MMU_FTR_USE_HIGH_BATS  ASM_CONST(0x0001)
@@ -96,8 +101,9 @@
 /* MMU feature bit sets for various CPUs */
 #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2  \
MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2
-#define MMU_FTRS_POWER4MMU_FTRS_DEFAULT_HPTE_ARCH_V2
-#define MMU_FTRS_PPC970MMU_FTRS_POWER4
+#define MMU_FTRS_POWER4MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
+   MMU_FTR_TLBIE_CROP_VA
+#define MMU_FTRS_PPC970MMU_FTRS_POWER4 | MMU_FTR_TLBIE_CROP_VA
 #define MMU_FTRS_POWER5MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
 #define MMU_FTRS_POWER6MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
 #define MMU_FTRS_POWER7MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
@@ -124,7 +130,7 @@ enum {
MMU_FTR_USE_TLBRSRV | MMU_FTR_USE_PAIRED_MAS |
MMU_FTR_NO_SLBIE_B | MMU_FTR_16M_PAGE | MMU_FTR_TLBIEL |
MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
-   MMU_FTR_1T_SEGMENT |
+   MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA |
 #ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_RADIX |
 #endif
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index e37916cbc18d..4c6b68ef571c 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -64,7 +64,8 @@ static inline void __tlbie(unsigned long vpn, int psize, int 
apsize, int ssize)
 * Older versions of the architecture (2.02 and earler) require the
 * masking of the top 16 bits.
 */
-   va &= ~(0xULL << 48);
+   if (mmu_has_feature(MMU_FTR_TLBIE_CROP_VA))
+   va &= ~(0xULL << 48);
 
switch (psize) {
case MMU_PAGE_4K:
@@ -113,7 +114,8 @@ static inline void __tlbiel(unsigned long vpn, int psize, 
int apsize, int ssize)
 * Older versions of the architecture (2.02 and earler) require the
 * masking of the top 16 bits.
 */
-   va &= ~(0xULL << 48);
+   if (mmu_has_feature(MMU_FTR_TLBIE_CROP_VA))
+   va &= ~(0xULL << 48);
 
switch (psize) {
case MMU_PAGE_4K:
-- 
2.7.4


[PATCH V2 09/10] powerpc/mm: Print information regarding the MMU mode

2016-06-08 Thread Aneesh Kumar K.V
This helps in easily identifying the MMU mode with which the kernel
is operating.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash_utils_64.c | 3 ++-
 arch/powerpc/mm/pgtable-radix.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index b2740c67e172..bf9b0b80bbfc 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -720,7 +720,7 @@ static void __init hash_init_partition_table(phys_addr_t 
hash_table,
 * For now UPRT is 0 for us.
 */
partition_tb->patb1 = 0;
-   DBG("Partition table %p\n", partition_tb);
+   pr_info("Partition table %p\n", partition_tb);
/*
 * update partition table control register,
 * 64 K size.
@@ -924,6 +924,7 @@ void __init hash__early_init_mmu(void)
 */
htab_initialize();
 
+   pr_info("Initializing hash mmu with SLB\n");
/* Initialize SLB management */
slb_initialize();
 }
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 73aa402047ef..d6598cd1c3e6 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -185,7 +185,8 @@ static void __init radix_init_partition_table(void)
partition_tb = early_alloc_pgtable(1UL << PATB_SIZE_SHIFT);
partition_tb->patb0 = cpu_to_be64(rts_field | __pa(init_mm.pgd) |
  RADIX_PGD_INDEX_SIZE | PATB_HR);
-   printk("Partition table %p\n", partition_tb);
+   pr_info("Initializing Radix MMU\n");
+   pr_info("Partition table %p\n", partition_tb);
 
memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
/*
-- 
2.7.4


[PATCH V2 10/10] powerpc/mm/hash: Update SDR1 size encoding as documented in ISA 3.0

2016-06-08 Thread Aneesh Kumar K.V
ISA 3.0 documents the hash table size in bytes as 2^(HTABSIZE + 18).

No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash_utils_64.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index bf9b0b80bbfc..7cce2f6169fa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -695,10 +695,9 @@ int remove_section_mapping(unsigned long start, unsigned 
long end)
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
 static void __init hash_init_partition_table(phys_addr_t hash_table,
-unsigned long pteg_count)
+unsigned long htab_size)
 {
unsigned long ps_field;
-   unsigned long htab_size;
unsigned long patb_size = 1UL << PATB_SIZE_SHIFT;
 
/*
@@ -706,7 +705,7 @@ static void __init hash_init_partition_table(phys_addr_t 
hash_table,
 * We can ignore that for lpid 0
 */
ps_field = 0;
-   htab_size =  __ilog2(pteg_count) - 11;
+   htab_size =  __ilog2(htab_size) - 18;
 
BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 24), "Partition table size too 
large.");
partition_tb = __va(memblock_alloc_base(patb_size, patb_size,
@@ -792,7 +791,7 @@ static void __init htab_initialize(void)
htab_address = __va(table);
 
/* htab absolute addr + encoded htabsize */
-   _SDR1 = table + __ilog2(pteg_count) - 11;
+   _SDR1 = table + __ilog2(htab_size_bytes) - 18;
 
/* Initialize the HPT with no entries */
memset((void *)table, 0, htab_size_bytes);
@@ -801,7 +800,7 @@ static void __init htab_initialize(void)
/* Set SDR1 */
mtspr(SPRN_SDR1, _SDR1);
else
-   hash_init_partition_table(table, pteg_count);
+   hash_init_partition_table(table, htab_size_bytes);
}
 
prot = pgprot_val(PAGE_KERNEL);
-- 
2.7.4


[PATCH V2 00/16] TLB flush improvments and Segment table support

2016-06-08 Thread Aneesh Kumar K.V
This series includes patches which were posted earlier as independent series.
Some of these patches will go upstream via the -mm tree.

Changes from V1:
* Address review feedback
* Rebase on top of the radix fixes which were posted earlier
* Fixes for segment table support

NOTE:
Even though the patch series includes changes to generic mm and other
architectures, this series is not cross-posted. That is because the generic
mm changes were posted as a separate patch series, which can be found at
http://thread.gmane.org/gmane.linux.kernel.mm/152620


Aneesh Kumar K.V (16):
  mm/hugetlb: Simplify hugetlb unmap
  mm: Change the interface for __tlb_remove_page
  mm/mmu_gather: Track page size with mmu gather and force flush if page
size change
  powerpc/mm/radix: Implement tlb mmu gather flush efficiently
  powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
  powerpc/mm/hash: Add helper for finding SLBE LLP encoding
  powerpc/mm: Use hugetlb flush functions
  powerpc/mm: Drop multiple definition of mm_is_core_local
  powerpc/mm/radix: Add tlb flush of THP ptes
  powerpc/mm/radix: Rename function and drop unused arg
  powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
  powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
  powerpc/mm: remove flush_tlb_page_nohash
  powerpc/mm: Cleanup LPCR defines
  powerpc/mm: Switch user slb fault handling to translation enabled
  powerpc/mm: Support segment table for Power9

 arch/arm/include/asm/tlb.h |  29 +-
 arch/ia64/include/asm/tlb.h|  31 +-
 arch/powerpc/include/asm/book3s/64/hash.h  |  10 +
 arch/powerpc/include/asm/book3s/64/hugetlb-radix.h |  15 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h  |  26 ++
 arch/powerpc/include/asm/book3s/64/mmu.h   |   7 +-
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h |   5 -
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  16 +-
 arch/powerpc/include/asm/book3s/64/tlbflush.h  |  27 +-
 arch/powerpc/include/asm/hugetlb.h |   2 +-
 arch/powerpc/include/asm/kvm_book3s_64.h   |   3 +-
 arch/powerpc/include/asm/mmu.h |  18 +-
 arch/powerpc/include/asm/mmu_context.h |   5 +-
 arch/powerpc/include/asm/reg.h |  54 ++--
 arch/powerpc/include/asm/tlb.h |  13 +
 arch/powerpc/include/asm/tlbflush.h|   1 -
 arch/powerpc/kernel/entry_64.S |   2 +-
 arch/powerpc/kernel/exceptions-64s.S   |  63 +++-
 arch/powerpc/kernel/prom.c |   3 +-
 arch/powerpc/mm/hash_native_64.c   |   6 +-
 arch/powerpc/mm/hash_utils_64.c|  86 -
 arch/powerpc/mm/hugetlbpage-radix.c|  39 +--
 arch/powerpc/mm/mmu_context_book3s64.c |  32 +-
 arch/powerpc/mm/pgtable-book3s64.c |   4 +-
 arch/powerpc/mm/pgtable.c  |   2 +-
 arch/powerpc/mm/slb.c  | 359 +
 arch/powerpc/mm/tlb-radix.c| 104 +-
 arch/powerpc/mm/tlb_hash32.c   |  11 -
 arch/powerpc/mm/tlb_nohash.c   |   6 -
 arch/s390/include/asm/tlb.h|  22 +-
 arch/sh/include/asm/tlb.h  |  20 +-
 arch/um/include/asm/tlb.h  |  20 +-
 include/asm-generic/tlb.h  |  59 +++-
 mm/huge_memory.c   |   2 +-
 mm/hugetlb.c   |  64 ++--
 mm/memory.c|  27 +-
 36 files changed, 981 insertions(+), 212 deletions(-)

-- 
2.7.4


[PATCH V2 01/16] mm/hugetlb: Simplify hugetlb unmap

2016-06-08 Thread Aneesh Kumar K.V
For hugetlb, like THP (and unlike regular pages), we do the TLB flush after
dropping the ptl. Because of this, we don't need to track force_flush
like we do now. Instead we can simply call tlb_remove_page(), which
will do the flush if needed.

No functionality change in this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 mm/hugetlb.c | 54 +-
 1 file changed, 21 insertions(+), 33 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d26162e81fea..741429d01668 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3138,7 +3138,6 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, 
struct vm_area_struct *vma,
unsigned long start, unsigned long end,
struct page *ref_page)
 {
-   int force_flush = 0;
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
pte_t *ptep;
@@ -3157,19 +3156,22 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, 
struct vm_area_struct *vma,
tlb_start_vma(tlb, vma);
mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
address = start;
-again:
for (; address < end; address += sz) {
ptep = huge_pte_offset(mm, address);
if (!ptep)
continue;
 
ptl = huge_pte_lock(h, mm, ptep);
-   if (huge_pmd_unshare(mm, &address, ptep))
-   goto unlock;
+   if (huge_pmd_unshare(mm, &address, ptep)) {
+   spin_unlock(ptl);
+   continue;
+   }
 
pte = huge_ptep_get(ptep);
-   if (huge_pte_none(pte))
-   goto unlock;
+   if (huge_pte_none(pte)) {
+   spin_unlock(ptl);
+   continue;
+   }
 
/*
 * Migrating hugepage or HWPoisoned hugepage is already
@@ -3177,7 +3179,8 @@ again:
 */
if (unlikely(!pte_present(pte))) {
huge_pte_clear(mm, address, ptep);
-   goto unlock;
+   spin_unlock(ptl);
+   continue;
}
 
page = pte_page(pte);
@@ -3187,9 +3190,10 @@ again:
 * are about to unmap is the actual page of interest.
 */
if (ref_page) {
-   if (page != ref_page)
-   goto unlock;
-
+   if (page != ref_page) {
+   spin_unlock(ptl);
+   continue;
+   }
/*
 * Mark the VMA as having unmapped its page so that
 * future faults in this VMA will fail rather than
@@ -3205,30 +3209,14 @@ again:
 
hugetlb_count_sub(pages_per_huge_page(h), mm);
page_remove_rmap(page, true);
-   force_flush = !__tlb_remove_page(tlb, page);
-   if (force_flush) {
-   address += sz;
-   spin_unlock(ptl);
-   break;
-   }
-   /* Bail out after unmapping reference page if supplied */
-   if (ref_page) {
-   spin_unlock(ptl);
-   break;
-   }
-unlock:
+
spin_unlock(ptl);
-   }
-   /*
-* mmu_gather ran out of room to batch pages, we break out of
-* the PTE lock to avoid doing the potential expensive TLB invalidate
-* and page-free while holding it.
-*/
-   if (force_flush) {
-   force_flush = 0;
-   tlb_flush_mmu(tlb);
-   if (address < end && !ref_page)
-   goto again;
+   tlb_remove_page(tlb, page);
+   /*
+* Bail out after unmapping reference page if supplied
+*/
+   if (ref_page)
+   break;
}
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
tlb_end_vma(tlb, vma);
-- 
2.7.4


[PATCH V2 02/16] mm: Change the interface for __tlb_remove_page

2016-06-08 Thread Aneesh Kumar K.V
This updates the generic and arch-specific implementations to return true
if we need to do a TLB flush. That means if __tlb_remove_page() indicates
a flush is needed, the page we are trying to remove must be tracked and
added again after the flush. We need to track it because we have already
updated the PTE to none and can't just loop back.

This change is done to enable us to do a tlb_flush when we try to flush
a range that consists of different page sizes. For architectures like
ppc64, we can do a range-based TLB flush and we need to track the page
size for that. When we try to remove a huge page, we will force a TLB
flush and start a new mmu gather.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/arm/include/asm/tlb.h  | 17 +
 arch/ia64/include/asm/tlb.h | 19 ++-
 arch/s390/include/asm/tlb.h |  9 +++--
 arch/sh/include/asm/tlb.h   |  8 +++-
 arch/um/include/asm/tlb.h   |  8 +++-
 include/asm-generic/tlb.h   | 44 +---
 mm/memory.c | 19 +--
 7 files changed, 94 insertions(+), 30 deletions(-)

diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 3cadb726ec88..a9d2aee3826f 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -209,17 +209,26 @@ tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct 
*vma)
tlb_flush(tlb);
 }
 
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
+   if (tlb->nr == tlb->max)
+   return true;
tlb->pages[tlb->nr++] = page;
-   VM_BUG_ON(tlb->nr > tlb->max);
-   return tlb->max - tlb->nr;
+   return false;
 }
 
 static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
-   if (!__tlb_remove_page(tlb, page))
+   if (__tlb_remove_page(tlb, page)) {
tlb_flush_mmu(tlb);
+   __tlb_remove_page(tlb, page);
+   }
+}
+
+static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
+struct page *page)
+{
+   return __tlb_remove_page(tlb, page);
 }
 
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
diff --git a/arch/ia64/include/asm/tlb.h b/arch/ia64/include/asm/tlb.h
index 39d64e0df1de..e7da41aa9110 100644
--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -205,17 +205,18 @@ tlb_finish_mmu(struct mmu_gather *tlb, unsigned long 
start, unsigned long end)
  * must be delayed until after the TLB has been flushed (see comments at the 
beginning of
  * this file).
  */
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
+   if (tlb->nr == tlb->max)
+   return true;
+
tlb->need_flush = 1;
 
if (!tlb->nr && tlb->pages == tlb->local)
__tlb_alloc_page(tlb);
 
tlb->pages[tlb->nr++] = page;
-   VM_BUG_ON(tlb->nr > tlb->max);
-
-   return tlb->max - tlb->nr;
+   return false;
 }
 
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
@@ -235,8 +236,16 @@ static inline void tlb_flush_mmu(struct mmu_gather *tlb)
 
 static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
-   if (!__tlb_remove_page(tlb, page))
+   if (__tlb_remove_page(tlb, page)) {
tlb_flush_mmu(tlb);
+   __tlb_remove_page(tlb, page);
+   }
+}
+
+static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
+struct page *page)
+{
+   return __tlb_remove_page(tlb, page);
 }
 
 /*
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 7a92e69c50bc..30759b560849 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -87,10 +87,10 @@ static inline void tlb_finish_mmu(struct mmu_gather *tlb,
  * tlb_ptep_clear_flush. In both flush modes the tlb for a page cache page
  * has already been freed, so just do free_page_and_swap_cache.
  */
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
free_page_and_swap_cache(page);
-   return 1; /* avoid calling tlb_flush_mmu */
+   return false; /* avoid calling tlb_flush_mmu */
 }
 
 static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
@@ -98,6 +98,11 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, 
struct page *page)
free_page_and_swap_cache(page);
 }
 
+static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
+struct page *page)
+{
+   return __tlb_remove_page(tlb, page);
+}
 /*
  * pte_free_tlb frees a pte table and clears the CRSTE for the
  * page table from the tlb.
diff -

[PATCH V2 03/16] mm/mmu_gather: Track page size with mmu gather and force flush if page size change

2016-06-08 Thread Aneesh Kumar K.V
This allows architectures that need to do special handling with respect to
different page sizes when flushing the TLB to implement it in the mmu gather.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/arm/include/asm/tlb.h  | 12 
 arch/ia64/include/asm/tlb.h | 12 
 arch/s390/include/asm/tlb.h | 13 +
 arch/sh/include/asm/tlb.h   | 12 
 arch/um/include/asm/tlb.h   | 12 
 include/asm-generic/tlb.h   | 27 +--
 mm/huge_memory.c|  2 +-
 mm/hugetlb.c|  2 +-
 mm/memory.c | 10 +-
 9 files changed, 93 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index a9d2aee3826f..1e25cd80589e 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -225,12 +225,24 @@ static inline void tlb_remove_page(struct mmu_gather 
*tlb, struct page *page)
}
 }
 
+static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
+ struct page *page, int page_size)
+{
+   return __tlb_remove_page(tlb, page);
+}
+
 static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
 struct page *page)
 {
return __tlb_remove_page(tlb, page);
 }
 
+static inline void tlb_remove_page_size(struct mmu_gather *tlb,
+   struct page *page, int page_size)
+{
+   return tlb_remove_page(tlb, page);
+}
+
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
unsigned long addr)
 {
diff --git a/arch/ia64/include/asm/tlb.h b/arch/ia64/include/asm/tlb.h
index e7da41aa9110..77e541cf0e5d 100644
--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -242,12 +242,24 @@ static inline void tlb_remove_page(struct mmu_gather 
*tlb, struct page *page)
}
 }
 
+static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
+ struct page *page, int page_size)
+{
+   return __tlb_remove_page(tlb, page);
+}
+
 static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
 struct page *page)
 {
return __tlb_remove_page(tlb, page);
 }
 
+static inline void tlb_remove_page_size(struct mmu_gather *tlb,
+   struct page *page, int page_size)
+{
+   return tlb_remove_page(tlb, page);
+}
+
 /*
  * Remove TLB entry for PTE mapped at virtual address ADDRESS.  This is called 
for any
  * PTE, not just those pointing to (normal) physical memory.
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 30759b560849..15711de10403 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -98,11 +98,24 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, 
struct page *page)
free_page_and_swap_cache(page);
 }
 
+static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
+ struct page *page, int page_size)
+{
+   return __tlb_remove_page(tlb, page);
+}
+
 static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
 struct page *page)
 {
return __tlb_remove_page(tlb, page);
 }
+
+static inline void tlb_remove_page_size(struct mmu_gather *tlb,
+   struct page *page, int page_size)
+{
+   return tlb_remove_page(tlb, page);
+}
+
 /*
  * pte_free_tlb frees a pte table and clears the CRSTE for the
  * page table from the tlb.
diff --git a/arch/sh/include/asm/tlb.h b/arch/sh/include/asm/tlb.h
index 21ae8f5546b2..025cdb1032f6 100644
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -109,12 +109,24 @@ static inline void tlb_remove_page(struct mmu_gather 
*tlb, struct page *page)
__tlb_remove_page(tlb, page);
 }
 
+static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
+ struct page *page, int page_size)
+{
+   return __tlb_remove_page(tlb, page);
+}
+
 static inline bool __tlb_remove_pte_page(struct mmu_gather *tlb,
 struct page *page)
 {
return __tlb_remove_page(tlb, page);
 }
 
+static inline void tlb_remove_page_size(struct mmu_gather *tlb,
+   struct page *page, int page_size)
+{
+   return tlb_remove_page(tlb, page);
+}
+
 #define pte_free_tlb(tlb, ptep, addr)  pte_free((tlb)->mm, ptep)
 #define pmd_free_tlb(tlb, pmdp, addr)  pmd_free((tlb)->mm, pmdp)
 #define pud_free_tlb(tlb, pudp, addr)  pud_free((tlb)->mm, pudp)
diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h
index 3dc4cbb3c2c0..821ff0acfe17 100644
--- a/arch/um/include/asm/tlb.h
+++ b/arch/um/include/asm/tlb.h
@@ -110,12 +110,24 @@ static inline void tlb_remove_page(struct mmu_gather 
*tlb, struct page *page)
__tlb_remove_page(tlb, page);
 }

[PATCH V2 04/16] powerpc/mm/radix: Implement tlb mmu gather flush efficiently

2016-06-08 Thread Aneesh Kumar K.V
Now that we track the page size in mmu_gather, we can use the address-based
tlbie format when doing a tlb_flush(). We don't do this if we are
invalidating the full address space.

Signed-off-by: Aneesh Kumar K.V 
---
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  2 +
 arch/powerpc/mm/tlb-radix.c| 73 +-
 2 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 3fa94fcac628..862c8fa50268 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -10,6 +10,8 @@ static inline int mmu_get_ap(int psize)
return mmu_psize_defs[psize].ap;
 }
 
+extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long 
start,
+unsigned long end, int psize);
 extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long 
start,
unsigned long end);
 extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long 
end);
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 231e3ed2e684..03e719ee6747 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -279,9 +279,80 @@ void radix__flush_tlb_range(struct vm_area_struct *vma, 
unsigned long start,
 }
 EXPORT_SYMBOL(radix__flush_tlb_range);
 
+static int radix_get_mmu_psize(int page_size)
+{
+   int psize;
+
+   if (page_size == (1UL << mmu_psize_defs[mmu_virtual_psize].shift))
+   psize = mmu_virtual_psize;
+   else if (page_size == (1UL << mmu_psize_defs[MMU_PAGE_2M].shift))
+   psize = MMU_PAGE_2M;
+   else if (page_size == (1UL << mmu_psize_defs[MMU_PAGE_1G].shift))
+   psize = MMU_PAGE_1G;
+   else
+   return -1;
+   return psize;
+}
 
 void radix__tlb_flush(struct mmu_gather *tlb)
 {
+   int psize = 0;
struct mm_struct *mm = tlb->mm;
-   radix__flush_tlb_mm(mm);
+   int page_size = tlb->page_size;
+
+   psize = radix_get_mmu_psize(page_size);
+   /*
+* if page size is not something we understand, do a full mm flush
+*/
+   if (psize != -1 && !tlb->fullmm && !tlb->need_flush_all)
+   radix__flush_tlb_range_psize(mm, tlb->start, tlb->end, psize);
+   else
+   radix__flush_tlb_mm(mm);
+}
+
+#define TLB_FLUSH_ALL -1UL
+/*
+ * Number of pages above which we will do a bcast tlbie. Just a
+ * number at this point copied from x86
+ */
+static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
+
+void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
+ unsigned long end, int psize)
+{
+   unsigned long pid;
+   unsigned long addr;
+   int local = mm_is_core_local(mm);
+   unsigned long ap = mmu_get_ap(psize);
+   int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
+   unsigned long page_size = 1UL << mmu_psize_defs[psize].shift;
+
+
+   preempt_disable();
+   pid = mm ? mm->context.id : 0;
+   if (unlikely(pid == MMU_NO_CONTEXT))
+   goto err_out;
+
+   if (end == TLB_FLUSH_ALL ||
+   (end - start) > tlb_single_page_flush_ceiling * page_size) {
+   if (local)
+   _tlbiel_pid(pid, RIC_FLUSH_TLB);
+   else
+   _tlbie_pid(pid, RIC_FLUSH_TLB);
+   goto err_out;
+   }
+   for (addr = start; addr < end; addr += page_size) {
+
+   if (local)
+   _tlbiel_va(addr, pid, ap, RIC_FLUSH_TLB);
+   else {
+   if (lock_tlbie)
+   raw_spin_lock(&native_tlbie_lock);
+   _tlbie_va(addr, pid, ap, RIC_FLUSH_TLB);
+   if (lock_tlbie)
+   raw_spin_unlock(&native_tlbie_lock);
+   }
+   }
+err_out:
+   preempt_enable();
 }
-- 
2.7.4


[PATCH V2 05/16] powerpc/mm: Make MMU_FTR_RADIX a MMU family feature

2016-06-08 Thread Aneesh Kumar K.V
MMU feature bits are defined such that we use the lower half to
represent MMU family features. Remove the strict split in half and
also move radix to an MMU family feature: radix introduces a new MMU
model and, strictly speaking, is a new MMU family. This also frees
up bits which can be used for individual features later.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |  3 +--
 arch/powerpc/include/asm/mmu.h   | 16 +++-
 arch/powerpc/kernel/entry_64.S   |  2 +-
 arch/powerpc/kernel/exceptions-64s.S |  8 
 arch/powerpc/kernel/prom.c   |  2 +-
 5 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index d4eda6420523..6d8306d9aa7a 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -24,12 +24,11 @@ struct mmu_psize_def {
 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
 #ifdef CONFIG_PPC_RADIX_MMU
-#define radix_enabled() mmu_has_feature(MMU_FTR_RADIX)
+#define radix_enabled() mmu_has_feature(MMU_FTR_TYPE_RADIX)
 #else
 #define radix_enabled() (0)
 #endif
 
-
 #endif /* __ASSEMBLY__ */
 
 /* 64-bit classic hash table MMU */
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 616575fcbcc7..21b71469e66b 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -12,7 +12,7 @@
  */
 
 /*
- * First half is MMU families
+ * MMU families
  */
 #define MMU_FTR_HPTE_TABLE ASM_CONST(0x0001)
 #define MMU_FTR_TYPE_8xx   ASM_CONST(0x0002)
@@ -20,9 +20,12 @@
 #define MMU_FTR_TYPE_44x   ASM_CONST(0x0008)
 #define MMU_FTR_TYPE_FSL_E ASM_CONST(0x0010)
 #define MMU_FTR_TYPE_47x   ASM_CONST(0x0020)
-
 /*
- * This is individual features
+ * Radix page table available
+ */
+#define MMU_FTR_TYPE_RADIX ASM_CONST(0x0040)
+/*
+ * individual features
  */
 /*
  * We need to clear top 16bits of va (from the remaining 64 bits )in
@@ -93,11 +96,6 @@
  */
 #define MMU_FTR_1T_SEGMENT ASM_CONST(0x4000)
 
-/*
- * Radix page table available
- */
-#define MMU_FTR_RADIX  ASM_CONST(0x8000)
-
 /* MMU feature bit sets for various CPUs */
 #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2  \
MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2
@@ -132,7 +130,7 @@ enum {
MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA |
 #ifdef CONFIG_PPC_RADIX_MMU
-   MMU_FTR_RADIX |
+   MMU_FTR_TYPE_RADIX |
 #endif
0,
 };
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 73e461a3dfbb..dd26d4ed7513 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -532,7 +532,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 #ifdef CONFIG_PPC_STD_MMU_64
 BEGIN_MMU_FTR_SECTION
b   2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
 BEGIN_FTR_SECTION
clrrdi  r6,r8,28/* get its ESID */
clrrdi  r9,r1,28/* get current sp ESID */
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 4c9440629128..f2bd375b9a4e 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -945,7 +945,7 @@ BEGIN_MMU_FTR_SECTION
b   do_hash_page/* Try to handle as hpte fault */
 MMU_FTR_SECTION_ELSE
b   handle_page_fault
-ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
.align  7
.globl  h_data_storage_common
@@ -976,7 +976,7 @@ BEGIN_MMU_FTR_SECTION
b   do_hash_page/* Try to handle as hpte fault */
 MMU_FTR_SECTION_ELSE
b   handle_page_fault
-ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
STD_EXCEPTION_COMMON(0xe20, h_instr_storage, unknown_exception)
 
@@ -1390,7 +1390,7 @@ slb_miss_realmode:
 #ifdef CONFIG_PPC_STD_MMU_64
 BEGIN_MMU_FTR_SECTION
bl  slb_allocate_realmode
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
 #endif
/* All done -- return from exception. */
 
@@ -1401,7 +1401,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
mtlrr10
 BEGIN_MMU_FTR_SECTION
b   2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
andi.   r10,r12,MSR_RI  /* check for unrecoverable exception */
beq-2f
 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 946e34ffeae9..44b4417804db 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -168,7 +168,7 @@ static struct ibm_pa_feature {
 */
{

[PATCH V2 06/16] powerpc/mm/hash: Add helper for finding SLBE LLP encoding

2016-06-08 Thread Aneesh Kumar K.V
Replace open-coded versions of the SLBE LLP encoding at multiple places
with the helper. No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 9 +
 arch/powerpc/include/asm/kvm_book3s_64.h  | 3 +--
 arch/powerpc/mm/hash_native_64.c  | 6 ++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 74839f24f412..b042e5f9a428 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -151,6 +151,15 @@ static inline unsigned int mmu_psize_to_shift(unsigned int 
mmu_psize)
BUG();
 }
 
+static inline unsigned long get_sllp_encoding(int psize)
+{
+   unsigned long sllp;
+
+   sllp = ((mmu_psize_defs[psize].sllp & SLB_VSID_L) >> 6) |
+   ((mmu_psize_defs[psize].sllp & SLB_VSID_LP) >> 4);
+   return sllp;
+}
+
 #endif /* __ASSEMBLY__ */
 
 /*
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 1f4497fb5b83..88d17b4ea9c8 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -181,8 +181,7 @@ static inline unsigned long compute_tlbie_rb(unsigned long 
v, unsigned long r,
 
switch (b_psize) {
case MMU_PAGE_4K:
-   sllp = ((mmu_psize_defs[a_psize].sllp & SLB_VSID_L) >> 6) |
-   ((mmu_psize_defs[a_psize].sllp & SLB_VSID_LP) >> 4);
+   sllp = get_sllp_encoding(a_psize);
rb |= sllp << 5;/*  AP field */
rb |= (va_low & 0x7ff) << 12;   /* remaining 11 bits of AVA */
break;
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 4c6d4c736ba4..fd483948981a 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -72,8 +72,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int 
apsize, int ssize)
/* clear out bits after (52) [052.63] */
va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
-   sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) |
-   ((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4);
+   sllp = get_sllp_encoding(apsize);
va |= sllp << 5;
asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2)
 : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
@@ -122,8 +121,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, 
int apsize, int ssize)
/* clear out bits after(52) [052.63] */
va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
-   sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) |
-   ((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4);
+   sllp = get_sllp_encoding(apsize);
va |= sllp << 5;
asm volatile(".long 0x7c000224 | (%0 << 11) | (0 << 21)"
 : : "r"(va) : "memory");
-- 
2.7.4


[PATCH V2 07/16] powerpc/mm: Use hugetlb flush functions

2016-06-08 Thread Aneesh Kumar K.V
Use flush_hugetlb_page instead of flush_tlb_page when we clear and flush
the PTE.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/hugetlb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index e2d9f4996e5c..c5517f463ec7 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -147,7 +147,7 @@ static inline void huge_ptep_clear_flush(struct 
vm_area_struct *vma,
 {
pte_t pte;
pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
-   flush_tlb_page(vma, addr);
+   flush_hugetlb_page(vma, addr);
 }
 
 static inline int huge_pte_none(pte_t pte)
-- 
2.7.4


[PATCH V2 08/16] powerpc/mm: Drop multiple definition of mm_is_core_local

2016-06-08 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/tlb.h | 13 +
 arch/powerpc/mm/tlb-radix.c|  6 --
 arch/powerpc/mm/tlb_nohash.c   |  6 --
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index 20733fa518ae..f6f68f73e858 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -46,5 +46,18 @@ static inline void __tlb_remove_tlb_entry(struct mmu_gather 
*tlb, pte_t *ptep,
 #endif
 }
 
+#ifdef CONFIG_SMP
+static inline int mm_is_core_local(struct mm_struct *mm)
+{
+   return cpumask_subset(mm_cpumask(mm),
+ topology_sibling_cpumask(smp_processor_id()));
+}
+#else
+static inline int mm_is_core_local(struct mm_struct *mm)
+{
+   return 1;
+}
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_TLB_H */
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 03e719ee6747..74b0c90045ab 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -163,12 +163,6 @@ void radix__local_flush_tlb_page(struct vm_area_struct 
*vma, unsigned long vmadd
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
 #ifdef CONFIG_SMP
-static int mm_is_core_local(struct mm_struct *mm)
-{
-   return cpumask_subset(mm_cpumask(mm),
- topology_sibling_cpumask(smp_processor_id()));
-}
-
 void radix__flush_tlb_mm(struct mm_struct *mm)
 {
unsigned long pid;
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index f4668488512c..050badc0ebd3 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -215,12 +215,6 @@ EXPORT_SYMBOL(local_flush_tlb_page);
 
 static DEFINE_RAW_SPINLOCK(tlbivax_lock);
 
-static int mm_is_core_local(struct mm_struct *mm)
-{
-   return cpumask_subset(mm_cpumask(mm),
- topology_sibling_cpumask(smp_processor_id()));
-}
-
 struct tlb_flush_param {
unsigned long addr;
unsigned int pid;
-- 
2.7.4


[PATCH V2 09/16] powerpc/mm/radix: Add tlb flush of THP ptes

2016-06-08 Thread Aneesh Kumar K.V
Instead of flushing the entire mm, implement flush_pmd_tlb_range.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 ++
 arch/powerpc/include/asm/book3s/64/tlbflush.h   | 9 +
 arch/powerpc/mm/pgtable-book3s64.c  | 4 ++--
 arch/powerpc/mm/tlb-radix.c | 7 +++
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 862c8fa50268..54c0aac39e3e 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -12,6 +12,8 @@ static inline int mmu_get_ap(int psize)
 
 extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long 
start,
 unsigned long end, int psize);
+extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long 
start,
unsigned long end);
 extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long 
end);
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 541cf809e38e..5f322e0ed385 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -7,6 +7,15 @@
 #include 
 #include 
 
+#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
+static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long end)
+{
+   if (radix_enabled())
+   return radix__flush_pmd_tlb_range(vma, start, end);
+   return hash__flush_tlb_range(vma, start, end);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index 670318766545..7bb8acffe876 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -33,7 +33,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, 
unsigned long address,
changed = !pmd_same(*(pmdp), entry);
if (changed) {
__ptep_set_access_flags(pmdp_ptep(pmdp), pmd_pte(entry));
-   flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+   flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
}
return changed;
 }
@@ -66,7 +66,7 @@ void pmdp_invalidate(struct vm_area_struct *vma, unsigned 
long address,
 pmd_t *pmdp)
 {
pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, 0);
-   flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+   flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
/*
 * This ensures that generic code that rely on IRQ disabling
 * to prevent a parallel THP split work as expected.
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 74b0c90045ab..4212e7638a6f 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -350,3 +350,10 @@ void radix__flush_tlb_range_psize(struct mm_struct *mm, 
unsigned long start,
 err_out:
preempt_enable();
 }
+
+void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+   radix__flush_tlb_range_psize(vma->vm_mm, start, end, MMU_PAGE_2M);
+}
+EXPORT_SYMBOL(radix__flush_pmd_tlb_range);
-- 
2.7.4


[PATCH V2 10/16] powerpc/mm/radix: Rename function and drop unused arg

2016-06-08 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 10 +-
 arch/powerpc/mm/hugetlbpage-radix.c |  4 ++--
 arch/powerpc/mm/tlb-radix.c | 16 
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 54c0aac39e3e..00c354064280 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -20,20 +20,20 @@ extern void radix__flush_tlb_kernel_range(unsigned long 
start, unsigned long end
 
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned 
long vmaddr);
-extern void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long 
vmaddr,
-   unsigned long ap, int nid);
 extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long 
addr);
+extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned 
long vmaddr,
+ unsigned long ap);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr);
-extern void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
- unsigned long ap, int nid);
 extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr);
+extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
+   unsigned long ap);
 #else
 #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr)
radix__local_flush_tlb_page(vma,addr)
-#define radix___flush_tlb_page(mm,addr,p,i)
radix___local_flush_tlb_page(mm,addr,p,i)
+#define radix__flush_tlb_page_psize(mm,addr,p) 
radix__local_flush_tlb_page_psize(mm,addr,p)
 #define radix__flush_tlb_pwc(tlb, addr)radix__local_flush_tlb_pwc(tlb, 
addr)
 #endif
 
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c 
b/arch/powerpc/mm/hugetlbpage-radix.c
index 1e11559e1aac..0dfa1816f0c6 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -20,7 +20,7 @@ void radix__flush_hugetlb_page(struct vm_area_struct *vma, 
unsigned long vmaddr)
WARN(1, "Wrong huge page shift\n");
return ;
}
-   radix___flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
 }
 
 void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr)
@@ -37,7 +37,7 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct 
*vma, unsigned long v
WARN(1, "Wrong huge page shift\n");
return ;
}
-   radix___local_flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
 }
 
 /*
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 4212e7638a6f..c33c3f24bad2 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -138,8 +138,8 @@ void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, 
unsigned long addr)
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_pwc);
 
-void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-   unsigned long ap, int nid)
+void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
+  unsigned long ap)
 {
unsigned long pid;
 
@@ -157,8 +157,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct 
*vma, unsigned long vmadd
if (vma && is_vm_hugetlb_page(vma))
return __local_flush_hugetlb_page(vma, vmaddr);
 #endif
-   radix___local_flush_tlb_page(vma ? vma->vm_mm : NULL, vmaddr,
-  mmu_get_ap(mmu_virtual_psize), 0);
+   radix__local_flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr,
+ mmu_get_ap(mmu_virtual_psize));
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
@@ -212,8 +212,8 @@ no_context:
 }
 EXPORT_SYMBOL(radix__flush_tlb_pwc);
 
-void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-  unsigned long ap, int nid)
+void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
+unsigned long ap)
 {
unsigned long pid;
 
@@ -241,8 +241,8 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, 
unsigned long vmaddr)
if (vma && is_vm_hugetlb_page(vma))
return flush_hugetlb_page(vma, vmaddr);
 #endif
-   radix___flush_tlb_page(vma ? vma->vm_mm 

[PATCH V2 11/16] powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate

2016-06-08 Thread Aneesh Kumar K.V
Use the helper instead of open coding the same logic in multiple places.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hugetlb-radix.h | 15 +++
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  4 +--
 arch/powerpc/mm/hugetlbpage-radix.c| 29 ++
 arch/powerpc/mm/tlb-radix.c| 10 +---
 4 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h 
b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
index 60f47649306f..c45189aa7476 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
@@ -11,4 +11,19 @@ extern unsigned long
 radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff,
unsigned long flags);
+
+static inline int hstate_get_psize(struct hstate *hstate)
+{
+   unsigned long shift;
+
+   shift = huge_page_shift(hstate);
+   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
+   return MMU_PAGE_2M;
+   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
+   return MMU_PAGE_1G;
+   else {
+   WARN(1, "Wrong huge page shift\n");
+   return mmu_virtual_psize;
+   }
+}
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 00c354064280..efb13bbc6df2 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -22,14 +22,14 @@ extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned 
long vmaddr);
 extern void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, unsigned long 
addr);
 extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned 
long vmaddr,
- unsigned long ap);
+ int psize);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr);
 extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr);
 extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
-   unsigned long ap);
+   int psize);
 #else
 #define radix__flush_tlb_mm(mm)radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr)
radix__local_flush_tlb_page(vma,addr)
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c 
b/arch/powerpc/mm/hugetlbpage-radix.c
index 0dfa1816f0c6..1eca0deaf89b 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -5,39 +5,24 @@
 #include 
 #include 
 #include 
+#include 
 
 void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr)
 {
-   unsigned long ap, shift;
+   int psize;
struct hstate *hstate = hstate_file(vma->vm_file);
 
-   shift = huge_page_shift(hstate);
-   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-   ap = mmu_get_ap(MMU_PAGE_2M);
-   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-   ap = mmu_get_ap(MMU_PAGE_1G);
-   else {
-   WARN(1, "Wrong huge page shift\n");
-   return ;
-   }
-   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+   psize = hstate_get_psize(hstate);
+   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr)
 {
-   unsigned long ap, shift;
+   int psize;
struct hstate *hstate = hstate_file(vma->vm_file);
 
-   shift = huge_page_shift(hstate);
-   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-   ap = mmu_get_ap(MMU_PAGE_2M);
-   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-   ap = mmu_get_ap(MMU_PAGE_1G);
-   else {
-   WARN(1, "Wrong huge page shift\n");
-   return ;
-   }
-   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+   psize = hstate_get_psize(hstate);
+   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 /*
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index c33c3f24bad2..a32d8aab2376 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -139,9 +139,10 @@ void radix__local_flush_tlb_pwc(struct mmu_gather *tlb, 
unsigned long addr)
 EXPORT_SYMBOL(radix__local_flush_tlb_pwc);
 
 void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
-

[PATCH V2 12/16] powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range

2016-06-08 Thread Aneesh Kumar K.V
Some archs, like ppc64, need to do special things when flushing the TLB for
hugepages. Add a new helper to flush the hugetlb TLB range. This helps us
avoid flushing the entire TLB mapping for the PID.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |  2 ++
 arch/powerpc/include/asm/book3s/64/tlbflush.h   | 10 ++
 arch/powerpc/mm/hugetlbpage-radix.c | 10 ++
 mm/hugetlb.c| 10 +-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index efb13bbc6df2..91178f0f5ad8 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -10,6 +10,8 @@ static inline int mmu_get_ap(int psize)
return mmu_psize_defs[psize].ap;
 }
 
+extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long 
end);
 extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long 
start,
 unsigned long end, int psize);
 extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 5f322e0ed385..02cd7def893d 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -16,6 +16,16 @@ static inline void flush_pmd_tlb_range(struct vm_area_struct 
*vma,
return hash__flush_tlb_range(vma, start, end);
 }
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+  unsigned long start,
+  unsigned long end)
+{
+   if (radix_enabled())
+   return radix__flush_hugetlb_tlb_range(vma, start, end);
+   return hash__flush_tlb_range(vma, start, end);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c 
b/arch/powerpc/mm/hugetlbpage-radix.c
index 1eca0deaf89b..35254a678456 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -25,6 +25,16 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct 
*vma, unsigned long v
radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
+void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long 
start,
+  unsigned long end)
+{
+   int psize;
+   struct hstate *hstate = hstate_file(vma->vm_file);
+
+   psize = hstate_get_psize(hstate);
+   radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
+}
+
 /*
  * A vairant of hugetlb_get_unmapped_area doing topdown search
  * FIXME!! should we do as x86 does or non hugetlb area does ?
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cab0b1861670..3495c519583d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3897,6 +3897,14 @@ same_page:
return i ? i : -EFAULT;
 }
 
+#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+/*
+ * ARCHes with special requirements for evicting HUGETLB backing TLB entries 
can
+ * implement this.
+ */
+#define flush_hugetlb_tlb_range(vma, addr, end)flush_tlb_range(vma, 
addr, end)
+#endif
+
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
unsigned long address, unsigned long end, pgprot_t newprot)
 {
@@ -3957,7 +3965,7 @@ unsigned long hugetlb_change_protection(struct 
vm_area_struct *vma,
 * once we release i_mmap_rwsem, another task can do the final put_page
 * and that page table be reused and filled with junk.
 */
-   flush_tlb_range(vma, start, end);
+   flush_hugetlb_tlb_range(vma, start, end);
mmu_notifier_invalidate_range(mm, start, end);
i_mmap_unlock_write(vma->vm_file->f_mapping);
mmu_notifier_invalidate_range_end(mm, start, end);
-- 
2.7.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH V2 13/16] powerpc/mm: remove flush_tlb_page_nohash

2016-06-08 Thread Aneesh Kumar K.V
This should be the same as flush_tlb_page() except for hash32. For hash32
the existing code appears wrong, because we don't seem to flush the TLB
for the Hash != 0 case at all. Fix this by switching to flush_tlb_page(),
which does the right thing by flushing the TLB for both the hash and
nohash cases on hash32.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h |  5 -
 arch/powerpc/include/asm/book3s/64/tlbflush.h  |  8 
 arch/powerpc/include/asm/tlbflush.h|  1 -
 arch/powerpc/mm/pgtable.c  |  2 +-
 arch/powerpc/mm/tlb_hash32.c   | 11 ---
 5 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index f12ddf5e8de5..2f6373144e2c 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -75,11 +75,6 @@ static inline void hash__flush_tlb_page(struct 
vm_area_struct *vma,
 {
 }
 
-static inline void hash__flush_tlb_page_nohash(struct vm_area_struct *vma,
-  unsigned long vmaddr)
-{
-}
-
 static inline void hash__flush_tlb_range(struct vm_area_struct *vma,
 unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 02cd7def893d..7f942c361ea9 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -57,14 +57,6 @@ static inline void local_flush_tlb_page(struct 
vm_area_struct *vma,
return hash__local_flush_tlb_page(vma, vmaddr);
 }
 
-static inline void flush_tlb_page_nohash(struct vm_area_struct *vma,
-unsigned long vmaddr)
-{
-   if (radix_enabled())
-   return radix__flush_tlb_page(vma, vmaddr);
-   return hash__flush_tlb_page_nohash(vma, vmaddr);
-}
-
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
if (radix_enabled())
diff --git a/arch/powerpc/include/asm/tlbflush.h 
b/arch/powerpc/include/asm/tlbflush.h
index 1b38eea28e5a..13dbcd41885e 100644
--- a/arch/powerpc/include/asm/tlbflush.h
+++ b/arch/powerpc/include/asm/tlbflush.h
@@ -54,7 +54,6 @@ extern void __flush_tlb_page(struct mm_struct *mm, unsigned 
long vmaddr,
 #define flush_tlb_page(vma,addr)   local_flush_tlb_page(vma,addr)
 #define __flush_tlb_page(mm,addr,p,i)  __local_flush_tlb_page(mm,addr,p,i)
 #endif
-#define flush_tlb_page_nohash(vma,addr)flush_tlb_page(vma,addr)
 
 #elif defined(CONFIG_PPC_STD_MMU_32)
 
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 88a307504b5a..0b6fb244d0a1 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -225,7 +225,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, 
unsigned long address,
if (!is_vm_hugetlb_page(vma))
assert_pte_locked(vma->vm_mm, address);
__ptep_set_access_flags(ptep, entry);
-   flush_tlb_page_nohash(vma, address);
+   flush_tlb_page(vma, address);
}
return changed;
 }
diff --git a/arch/powerpc/mm/tlb_hash32.c b/arch/powerpc/mm/tlb_hash32.c
index 558e30cce33e..702d7689d714 100644
--- a/arch/powerpc/mm/tlb_hash32.c
+++ b/arch/powerpc/mm/tlb_hash32.c
@@ -49,17 +49,6 @@ void flush_hash_entry(struct mm_struct *mm, pte_t *ptep, 
unsigned long addr)
 EXPORT_SYMBOL(flush_hash_entry);
 
 /*
- * Called by ptep_set_access_flags, must flush on CPUs for which the
- * DSI handler can't just "fixup" the TLB on a write fault
- */
-void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr)
-{
-   if (Hash != 0)
-   return;
-   _tlbie(addr);
-}
-
-/*
  * Called at the end of a mmu_gather operation to make sure the
  * TLB flush is completely done.
  */
-- 
2.7.4


[PATCH V2 14/16] powerpc/mm: Cleanup LPCR defines

2016-06-08 Thread Aneesh Kumar K.V
This makes it easy to verify we are not overloading the bits.
No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/reg.h | 54 +-
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 466816ede138..5c4b6f4f8903 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -314,41 +314,41 @@
 #define   HFSCR_FP __MASK(FSCR_FP_LG)
 #define SPRN_TAR   0x32f   /* Target Address Register */
 #define SPRN_LPCR  0x13E   /* LPAR Control Register */
-#define   LPCR_VPM0(1ul << (63-0))
-#define   LPCR_VPM1(1ul << (63-1))
-#define   LPCR_ISL (1ul << (63-2))
+#define   LPCR_VPM0ASM_CONST(0x8000)
+#define   LPCR_VPM1ASM_CONST(0x4000)
+#define   LPCR_ISL ASM_CONST(0x2000)
 #define   LPCR_VC_SH   (63-2)
 #define   LPCR_DPFD_SH (63-11)
 #define   LPCR_DPFD(7ul << LPCR_DPFD_SH)
 #define   LPCR_VRMASD  (0x1ful << (63-16))
-#define   LPCR_VRMA_L  (1ul << (63-12))
-#define   LPCR_VRMA_LP0(1ul << (63-15))
-#define   LPCR_VRMA_LP1(1ul << (63-16))
+#define   LPCR_VRMA_L  ASM_CONST(0x0008)
+#define   LPCR_VRMA_LP0ASM_CONST(0x0001)
+#define   LPCR_VRMA_LP1ASM_CONST(0x8000)
 #define   LPCR_VRMASD_SH (63-16)
-#define   LPCR_RMLS0x1C00  /* impl dependent rmo limit sel */
+#define   LPCR_RMLS 0x1C00  /* impl dependent rmo limit sel */
 #define  LPCR_RMLS_SH  (63-37)
-#define   LPCR_ILE 0x0200  /* !HV irqs set MSR:LE */
-#define   LPCR_AIL 0x0180  /* Alternate interrupt location */
-#define   LPCR_AIL_0   0x  /* MMU off exception offset 0x0 */
-#define   LPCR_AIL_3   0x0180  /* MMU on exception offset 0xc00...4xxx 
*/
-#define   LPCR_ONL 0x0004  /* online - PURR/SPURR count */
-#define   LPCR_PECE0x0001f000  /* powersave exit cause enable */
-#define LPCR_PECEDP0x0001  /* directed priv dbells cause 
exit */
-#define LPCR_PECEDH0x8000  /* directed hyp dbells cause 
exit */
-#define LPCR_PECE0 0x4000  /* ext. exceptions can cause exit */
-#define LPCR_PECE1 0x2000  /* decrementer can cause exit */
-#define LPCR_PECE2 0x1000  /* machine check etc can cause exit */
-#define   LPCR_MER 0x0800  /* Mediated External Exception */
+#define   LPCR_ILE ASM_CONST(0x0200)   /* !HV 
irqs set MSR:LE */
+#define   LPCR_AIL ASM_CONST(0x0180)   /* Alternate 
interrupt location */
+#define   LPCR_AIL_0   ASM_CONST(0x)   /* MMU off 
exception offset 0x0 */
+#define   LPCR_AIL_3   ASM_CONST(0x0180)   /* MMU on 
exception offset 0xc00...4xxx */
+#define   LPCR_ONL ASM_CONST(0x0004)   /* online - 
PURR/SPURR count */
+#define   LPCR_PECEASM_CONST(0x0001f000)   /* powersave 
exit cause enable */
+#define  LPCR_PECEDP   ASM_CONST(0x0001)   /* directed 
priv dbells cause exit */
+#define   LPCR_PECEDH  ASM_CONST(0x8000)   /* directed hyp 
dbells cause exit */
+#define   LPCR_PECE0   ASM_CONST(0x4000)   /* ext. 
exceptions can cause exit */
+#define   LPCR_PECE1   ASM_CONST(0x2000)   /* decrementer 
can cause exit */
+#define   LPCR_PECE2   ASM_CONST(0x1000)   /* machine 
check etc can cause exit */
+#define   LPCR_MER ASM_CONST(0x0800)   /* Mediated 
External Exception */
 #define   LPCR_MER_SH  11
-#define   LPCR_TC  0x0200  /* Translation control */
-#define   LPCR_LPES0x000c
-#define   LPCR_LPES0   0x0008  /* LPAR Env selector 0 */
-#define   LPCR_LPES1   0x0004  /* LPAR Env selector 1 */
+#define   LPCR_TC  ASM_CONST(0x0200)   /* Translation 
control */
+#define   LPCR_LPES0x000c
+#define   LPCR_LPES0   ASM_CONST(0x0008)  /* LPAR Env 
selector 0 */
+#define   LPCR_LPES1   ASM_CONST(0x0004)  /* LPAR Env 
selector 1 */
 #define   LPCR_LPES_SH 2
-#define   LPCR_RMI 0x0002  /* real mode is cache inhibit */
-#define   LPCR_HDICE   0x0001  /* Hyp Decr enable (HV,PR,EE) */
-#define   LPCR_UPRT0x0040  /* Use Process Table (ISA 3) */
-#define  LPCR_HR  0x0010
+#define   LPCR_RMI ASM_CONST(0x0002)  /* real mode 
is cache inhibit */
+#define   LPCR_HDICE   ASM_CONST(0x0001)  /* Hyp Decr 
enable (HV,PR,EE) */
+#define   LPCR_UPRTASM_CONST(0x0040)  /* Use 
Process Table (ISA 3) */
+#define  LPCR

[PATCH V2 15/16] powerpc/mm: Switch user slb fault handling to translation enabled

2016-06-08 Thread Aneesh Kumar K.V
We also handle the fault with a proper stack initialized. This enables us
to call out to C in the fault handling routines. We don't do this for
kernel mappings, because of the possibility of taking a recursive fault
if the kernel stack is not yet mapped by an SLB entry.

This enables us to handle Power9 SLB faults better. We will add bolted
entries for the entire kernel mapping in the segment table, while for user
SLB entries we take the fault and insert them on demand. With translation
on, we should be able to access the segment table from the fault handler.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/kernel/exceptions-64s.S | 55 
 arch/powerpc/mm/slb.c| 11 
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index f2bd375b9a4e..2f2c52559ea9 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -794,7 +794,7 @@ data_access_slb_relon_pSeries:
mfspr   r3,SPRN_DAR
mfspr   r12,SPRN_SRR1
 #ifndef CONFIG_RELOCATABLE
-   b   slb_miss_realmode
+   b   handle_slb_miss_relon
 #else
/*
 * We can't just use a direct branch to slb_miss_realmode
@@ -803,7 +803,7 @@ data_access_slb_relon_pSeries:
 */
mfctr   r11
ld  r10,PACAKBASE(r13)
-   LOAD_HANDLER(r10, slb_miss_realmode)
+   LOAD_HANDLER(r10, handle_slb_miss_relon)
mtctr   r10
bctr
 #endif
@@ -819,11 +819,11 @@ instruction_access_slb_relon_pSeries:
mfspr   r3,SPRN_SRR0/* SRR0 is faulting address */
mfspr   r12,SPRN_SRR1
 #ifndef CONFIG_RELOCATABLE
-   b   slb_miss_realmode
+   b   handle_slb_miss_relon
 #else
mfctr   r11
ld  r10,PACAKBASE(r13)
-   LOAD_HANDLER(r10, slb_miss_realmode)
+   LOAD_HANDLER(r10, handle_slb_miss_relon)
mtctr   r10
bctr
 #endif
@@ -961,7 +961,23 @@ h_data_storage_common:
bl  unknown_exception
b   ret_from_except
 
+/* r3 point to DAR */
.align  7
+   .globl slb_miss_user
+slb_miss_user:
+   std r3,PACA_EXSLB+EX_DAR(r13)
+   /* Restore r3 as expected by PROLOG_COMMON below */
+   ld  r3,PACA_EXSLB+EX_R3(r13)
+   EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
+   RECONCILE_IRQ_STATE(r10, r11)
+   ld  r4,PACA_EXSLB+EX_DAR(r13)
+   li  r5,0x380
+   std r4,_DAR(r1)
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  handle_slb_miss
+   b   ret_from_except_lite
+
+.align 7
.globl instruction_access_common
 instruction_access_common:
EXCEPTION_PROLOG_COMMON(0x400, PACA_EXGEN)
@@ -1379,11 +1395,17 @@ unrecover_mce:
  * We assume we aren't going to take any exceptions during this procedure.
  */
 slb_miss_realmode:
-   mflrr10
 #ifdef CONFIG_RELOCATABLE
mtctr   r11
 #endif
+   /*
+* Handle user slb miss with translation enabled
+*/
+   cmpdi   r3,0
+   bge 3f
 
+slb_miss_kernel:
+   mflrr10
stw r9,PACA_EXSLB+EX_CCR(r13)   /* save CR in exc. frame */
std r10,PACA_EXSLB+EX_LR(r13)   /* save LR */
 
@@ -1428,6 +1450,29 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
mtspr   SPRN_SRR1,r10
rfid
b   .
+3:
+   /*
+* Enable IR/DR and handle the fault
+*/
+   EXCEPTION_PROLOG_PSERIES_1(slb_miss_user, EXC_STD)
+   /*
+* handler with relocation on
+*/
+handle_slb_miss_relon:
+#ifdef CONFIG_RELOCATABLE
+   mtctr   r11
+#endif
+   /*
+* Handle user slb miss with stack initialized.
+*/
+   cmpdi   r3,0
+   bge 4f
+   /*
+* go back to slb_miss_realmode
+*/
+   b   slb_miss_kernel
+4:
+   EXCEPTION_RELON_PROLOG_PSERIES_1(slb_miss_user, EXC_STD)
 
 unrecov_slb:
EXCEPTION_PROLOG_COMMON(0x4100, PACA_EXSLB)
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 48fc28bab544..b18d7df5601d 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 
+#include 
+
 enum slb_index {
LINEAR_INDEX= 0, /* Kernel linear map  (0xc000) */
VMALLOC_INDEX   = 1, /* Kernel virtual map (0xd000) */
@@ -346,3 +348,12 @@ void slb_initialize(void)
 
asm volatile("isync":::"memory");
 }
+
+void handle_slb_miss(struct pt_regs *regs,
+unsigned long address, unsigned long trap)
+{
+   enum ctx_state prev_state = exception_enter();
+
+   slb_allocate(address);
+   exception_exit(prev_state);
+}
-- 
2.7.4


Re: [PATCH] powerpc/nohash: Fix build break with 4K pages

2016-06-08 Thread Aneesh Kumar K.V
Michael Ellerman  writes:

> Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are
> allocating fragments" renamed page_table_free() to pte_fragment_free().
> One occurrence was mistyped as pte_fragment_fre().
>
> This only breaks the nohash 4K page build, which is not the default or
> enabled in any defconfig.

Can you share the .config? I will add it to the build test.

>
> Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are 
> allocating fragments")
> Signed-off-by: Michael Ellerman 


Reviewed-by: Aneesh Kumar K.V 

> ---
>  arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h 
> b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> index 0c12a3bfe2ab..069369f6414b 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> @@ -172,7 +172,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct 
> *mm,
>
>  static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
>  {
> - pte_fragment_fre((unsigned long *)pte, 1);
> + pte_fragment_free((unsigned long *)pte, 1);
>  }
>
>  static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
> -- 
> 2.5.0


Re: [PATCH V2 16/16] powerpc/mm: Support segment table for Power9

2016-06-08 Thread Aneesh Kumar K.V
"Aneesh Kumar K.V"  writes:

> PowerISA 3.0 adds an in memory table for storing segment translation
> information. In this mode, which is enabled by setting both HOST RADIX
> and GUEST RADIX bits in partition table to 0 and enabling UPRT to
> 1, we have a per process segment table. The segment table details
> are stored in the process table indexed by PID value.
>
> Segment table mode also requires us to map the process table at the
> beginning of a 1TB segment.
>
> On the linux kernel side we enable this model if we find that
> the radix is explicitly disabled by setting the ibm,pa-feature radix
> bit (byte 40 bit 0) set to 0. If the size of ibm,pa-feature node is less
> than 40 bytes, we enable the legacy HPT mode using SLB. If radix bit
> is set to 1, we use the radix mode.


Missed updating the commit message.

On the linux kernel side we enable this model if we find hash mmu bit
(byte 58 bit 0) of ibm,pa-feature device tree node set to 1. If the size
of ibm,pa-feature node is less than 58 bytes or if the hash mmu bit is
set to 0, we enable the legacy HPT mode using SLB. If radix bit (byte 40
bit 0) is set to 1, we use the radix mode.

>
> With respect to SLB mapping, we bolt map the entire kernel range and
> only handle user space segment faults.
>

-aneesh


[PATCH V2 16/16] powerpc/mm: Support segment table for Power9

2016-06-08 Thread Aneesh Kumar K.V
PowerISA 3.0 adds an in memory table for storing segment translation
information. In this mode, which is enabled by setting both HOST RADIX
and GUEST RADIX bits in partition table to 0 and enabling UPRT to
1, we have a per process segment table. The segment table details
are stored in the process table indexed by PID value.

Segment table mode also requires us to map the process table at the
beginning of a 1TB segment.

On the linux kernel side we enable this model if we find that
the radix is explicitly disabled by setting the ibm,pa-feature radix
bit (byte 40 bit 0) set to 0. If the size of ibm,pa-feature node is less
than 40 bytes, we enable the legacy HPT mode using SLB. If the radix bit
is set to 1, we use the radix mode.

With respect to SLB mapping, we bolt map the entire kernel range and
only handle user space segment faults.

We also have access to 4 SLB register in software. So we continue to use
3 of that for bolted kernel SLB entries as we use them currently.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h |  10 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  17 ++
 arch/powerpc/include/asm/book3s/64/mmu.h  |   4 +
 arch/powerpc/include/asm/mmu.h|   6 +-
 arch/powerpc/include/asm/mmu_context.h|   5 +-
 arch/powerpc/kernel/prom.c|   1 +
 arch/powerpc/mm/hash_utils_64.c   |  86 ++-
 arch/powerpc/mm/mmu_context_book3s64.c|  32 ++-
 arch/powerpc/mm/slb.c | 350 +-
 9 files changed, 493 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index f61cad3de4e6..5f0deeda7884 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -58,6 +58,16 @@
 #define H_VMALLOC_END  (H_VMALLOC_START + H_VMALLOC_SIZE)
 
 /*
+ * Process table with ISA 3.0 need to be mapped at the beginning of a 1TB 
segment
+ * We put that in the top of VMALLOC region. For each region we can go upto 
64TB
+ * for now. Hence we have space to put process table there. We should not get
+ * an SLB miss for this address, because the VSID for this is placed in the
+ * partition table.
+ */
+#define H_SEG_PROC_TBL_START   ASM_CONST(0xD0002000)
+#define H_SEG_PROC_TBL_END ASM_CONST(0xD00020ff)
+
+/*
  * Region IDs
  */
 #define REGION_SHIFT   60UL
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index b042e5f9a428..5f9ee699da5f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -102,6 +102,18 @@
 #define HPTE_V_1TB_SEG ASM_CONST(0x4000)
 #define HPTE_V_VRMA_MASK   ASM_CONST(0x4001ff00)
 
+/* segment table entry masks/bits */
+/* Upper 64 bit */
+#define STE_VALID  ASM_CONST(0x800)
+/*
+ * lower 64 bit
+ * 64th bit become 0 bit
+ */
+/*
+ * Software defined bolted bit
+ */
+#define STE_BOLTED ASM_CONST(0x1)
+
 /* Values for PP (assumes Ks=0, Kp=1) */
 #define PP_RWXX0   /* Supervisor read/write, User none */
 #define PP_RWRX 1  /* Supervisor read/write, User read */
@@ -129,6 +141,11 @@ struct hash_pte {
__be64 r;
 };
 
+struct seg_entry {
+   __be64 ste_e;
+   __be64 ste_v;
+};
+
 extern struct hash_pte *htab_address;
 extern unsigned long htab_size_bytes;
 extern unsigned long htab_hash_mask;
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 6d8306d9aa7a..b7514f19863f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -65,6 +65,7 @@ extern struct patb_entry *partition_tb;
  */
 #define PATB_SIZE_SHIFT16
 
+extern unsigned long segment_table_initialize(struct prtb_entry *prtb);
 typedef unsigned long mm_context_id_t;
 struct spinlock;
 
@@ -94,6 +95,9 @@ typedef struct {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
struct list_head iommu_group_mem_list;
 #endif
+   unsigned long seg_table;
+   struct spinlock *seg_tbl_lock;
+
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 21b71469e66b..97446e8cc101 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -24,6 +24,10 @@
  * Radix page table available
  */
 #define MMU_FTR_TYPE_RADIX ASM_CONST(0x0040)
+
+/* Seg table only supported for book3s 64 */
+#define MMU_FTR_TYPE_SEG_TABLE ASM_CONST(0x0080)
+
 /*
  * individual features
  */
@@ -130,7 +134,7 @@ enum {
MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA |
 #ifdef CONFIG_PPC_RADIX_MMU
-   MMU_FTR_TYPE_RADIX |
+   MMU_FTR_TYPE_RADIX | MMU_FTR_TYPE_SEG_TABLE |
 #endif
0,
 };
diff --git a/arch/powerp

Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Aneesh Kumar K.V
Darren Stevens  writes:

> Hello Christian

> That's not where I ended up with my bisect, this commit is about 10 before the
> one I found to be bad, which is:
>
> commit d6a9996e84ac4beb7713e9485f4563e100a9b03e
> Author: Aneesh Kumar K.V 
> Date:   Fri Apr 29 23:26:21 2016 +1000
>
> powerpc/mm: vmalloc abstraction in preparation for radix
> 
> The vmalloc range differs between hash and radix config. Hence make
> VMALLOC_START and related constants a variable which will be runtime
> initialized depending on whether hash or radix mode is active.
> 
> Signed-off-by: Aneesh Kumar K.V 
> [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
> Signed-off-by: Michael Ellerman 
>

Can you check the value of ISA_IO_BASE where you are
using it? If you are using it early, you will find a wrong value in
it. With the latest kernel it is a variable and is initialized in
hash__early_init_mmu().

-aneesh


Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-08 Thread Christian Zigotzky

Hi Aneesh,

We use it only in the file "pci-common.c".

Part of the Nemo patch with ISA_IO_BASE:

diff -rupN linux-4.7/arch/powerpc/kernel/pci-common.c linux-4.7-nemo/arch/powerpc/kernel/pci-common.c
--- linux-4.7/arch/powerpc/kernel/pci-common.c	2016-05-20 10:23:06.588299920 +0200
+++ linux-4.7-nemo/arch/powerpc/kernel/pci-common.c	2016-05-20 10:21:28.652296699 +0200

@@ -723,6 +723,19 @@ void pci_process_bridge_OF_ranges(struct
 			isa_io_base =
 				(unsigned long)hose->io_base_virt;
 #endif /* CONFIG_PPC32 */
+
+
+#ifdef CONFIG_PPC_PASEMI_SB600
+			/* Workaround for lack of device tree. New for kernel 3.17:
+			 * range.cpu_addr instead of cpu_addr and range.size instead
+			 * of size Ch. Zigotzky */
+			if (primary) {
+				__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
+					range.size,
+					pgprot_val(pgprot_noncached(__pgprot(0))));
+				hose->io_base_virt = (void *)_IO_BASE;
+				/* _IO_BASE needs unsigned long long for the kernel 3.17
+				 * Ch. Zigotzky */
+				printk("Initialised io_base_virt 0x%lx _IO_BASE 0x%llx\n",
+					(unsigned long)hose->io_base_virt,
+					(unsigned long long)_IO_BASE);
+			}
+#endif
+

Cheers,

Christian

On 08 June 2016 at 5:11 PM, Aneesh Kumar K.V wrote:


Can you check the value of ISA_IO_BASE where you are
using it? If you are using it early, you will find a wrong value in
it. With the latest kernel it is a variable and is initialized in
hash__early_init_mmu();

-aneesh





[PATCH v6 00/11] powerpc/powernv/cpuidle: Add support for POWER ISA v3 idle states

2016-06-08 Thread Shreyas B. Prabhu
POWER ISA v3 defines a new idle processor core mechanism. In summary,
 a) a new instruction named stop is added. This instruction replaces
instructions like nap, sleep, rvwinkle.
 b) a new per-thread SPR named PSSCR is added which controls the behavior
of the stop instruction.

PSSCR has following key fields
Bits 0:3  - Power-Saving Level Status. This field indicates the
lowest power-saving state the thread entered since stop
instruction was last executed.

Bit 42 - Enable State Loss  
0 - No state is lost irrespective of other fields  
1 - Allows state loss

Bits 44:47 - Power-Saving Level Limit  
This limits the power-saving level that can be entered into.

Bits 60:63 - Requested Level  
Used to specify which power-saving level must be entered on
executing stop instruction

Stop idle states and their properties like name, latency, target
residency, psscr value are exposed via device tree.

This patch series adds support for this new mechanism.

Patches 1-7 are cleanups and code movement.
Patch 8 adds platform specific support for stop and psscr handling.
Patch 9 is a minor cleanup in cpuidle driver.
Patch 10 adds cpuidle driver support.
Patch 11 makes offlined cpu use deepest stop state.

Note: Documentation for the device tree bindings is posted here-
http://patchwork.ozlabs.org/patch/629125/


Changes in v6
=
 - Restore new POWER ISA v3 SPRS when waking up from deep idle

Changes in v5
=
 - Use generic cpuidle constant CPUIDLE_NAME_LEN
 - Fix return code handling for of_property_read_string_array
 - Use DT flags to determine if we are using the stop instruction, instead of
   cpu_has_feature
 - Removed unnecessary cast with names
 - &stop_loop -> stop_loop
 - Added POWERNV_THRESHOLD_LATENCY_NS to filter out idle states with high
   latency

Changes in v4
=
 - Added a patch to use PNV_THREAD_WINKLE macro while requesting for winkle
 - Moved power7_powersave_common rename to more appropriate patch
 - renaming power7_enter_nap_mode to pnv_enter_arch207_idle_mode
 - Added PSSCR layout to Patch 7's commit message
 - Improved / Fixed comments
 - Fixed whitespace error in paca.h
 - Using MAX_POSSIBLE_STOP_STATE macro instead of hardcoding 0xF as
   max possible stop state

Changes in v3
=
 - Rebased on powerpc-next
 - Dropping patch 1 since we are not adding a new file for P9 idle support
 - Improved comments in multiple places
 - Moved GET_PACA from power7_restore_hyp_resource to System Reset
 - Instead of moving few functions from idle_power7 to idle_power_common,
   renaming idle_power7.S to idle_power_common.S
 - Moved HSTATE_HWTHREAD_STATE updation to power_powersave_common
 - Dropped earlier patch 5 which moved few macros from idle_power_common to
   asm/cpuidle.h. 
 - Added a patch to rename reusable power7_* idle functions to pnv_*
 - Added new patch that creates abstraction for saving SPRs before
   entering deep idle states
 - Instead of introducing new file idle_power_stop.S, P9 idle support
   is added to idle_power_common.S using CPU_FTR sections.
 - Fixed r4 reg clobbering in power_stop0

Changes in v2
=
 - Rebased on v4.6-rc6
 - Using CPU_FTR_ARCH_300 bit instead of CPU_FTR_STOP_INST

Cc: Rafael J. Wysocki 
Cc: Daniel Lezcano 
Cc: linux...@vger.kernel.org
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Michael Neuling 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Rob Herring 
Cc: Lorenzo Pieralisi 

Shreyas B. Prabhu (11):
  powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for
winkle
  powerpc/kvm: make hypervisor state restore a function
  powerpc/powernv: Rename idle_power7.S to idle_power_common.S
  powerpc/powernv: Rename reusable idle functions to hardware agnostic
names
  powerpc/powernv: Make pnv_powersave_common more generic
  powerpc/powernv: abstraction for saving SPRs before entering deep idle
states
  powerpc/powernv: set power_save func after the idle states are
initialized
  powerpc/powernv: Add platform support for stop instruction
  cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of
MAX_POWERNV_IDLE_STATES
  cpuidle/powernv: Add support for POWER ISA v3 idle states
  powerpc/powernv: Use deepest stop state when cpu is offlined

 arch/powerpc/include/asm/cpuidle.h|   2 +
 arch/powerpc/include/asm/kvm_book3s_asm.h |   2 +-
 arch/powerpc/include/asm/machdep.h|   1 +
 arch/powerpc/include/asm/opal-api.h   |  11 +-
 arch/powerpc/include/asm/paca.h   |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |   4 +
 arch/powerpc/include/asm/processor.h  |   1 +
 arch/powerpc/include/asm/reg.h|  14 +
 arch/powerpc/kernel/Makefile  |   2 +-
 arch/powerpc/kernel/asm-offsets.c |   2 +
 arch/powerpc/kernel/exceptions-64s.

[PATCH v6 02/11] powerpc/kvm: make hypervisor state restore a function

2016-06-08 Thread Shreyas B. Prabhu
In the current code, when the thread wakes up in the reset vector, some
of the state restore code and the check for whether a thread needs to
branch to KVM are duplicated. Reorder the code such that this
duplication is avoided.

At a higher level this is what the change looks like-

Before this patch -
power7_wakeup_tb_loss:
restore hypervisor state
if (thread needed by kvm)
goto kvm_start_guest
restore nvgprs, cr, pc
rfid to process context

power7_wakeup_loss:
restore nvgprs, cr, pc
rfid to process context

reset vector:
if (waking from deep idle states)
goto power7_wakeup_tb_loss
else
if (thread needed by kvm)
goto kvm_start_guest
goto power7_wakeup_loss

After this patch -
power7_wakeup_tb_loss:
restore hypervisor state
return

power7_restore_hyp_resource():
if (waking from deep idle states)
goto power7_wakeup_tb_loss
return

power7_wakeup_loss:
restore nvgprs, cr, pc
rfid to process context

reset vector:
power7_restore_hyp_resource()
if (thread needed by kvm)
goto kvm_start_guest
goto power7_wakeup_loss

Reviewed-by: Paul Mackerras 
Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
- No changes since v3

Changes in v3:
=
- Retaining GET_PACA(r13) in System Reset vector instead of moving it
  to power7_restore_hyp_resource
- Added comments indicating entry conditions for power7_restore_hyp_resource
- Improved comments around return statements

 arch/powerpc/kernel/exceptions-64s.S | 28 ++
 arch/powerpc/kernel/idle_power7.S| 72 +---
 2 files changed, 46 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4c94406..4a74d6a 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -107,25 +107,9 @@ BEGIN_FTR_SECTION
beq 9f
 
cmpwi   cr3,r13,2
-
-   /*
-* Check if last bit of HSPGR0 is set. This indicates whether we are
-* waking up from winkle.
-*/
GET_PACA(r13)
-   clrldi  r5,r13,63
-   clrrdi  r13,r13,1
-   cmpwi   cr4,r5,1
-   mtspr   SPRN_HSPRG0,r13
+   bl  power7_restore_hyp_resource
 
-   lbz r0,PACA_THREAD_IDLE_STATE(r13)
-   cmpwi   cr2,r0,PNV_THREAD_NAP
-   bgt cr2,8f  /* Either sleep or Winkle */
-
-   /* Waking up from nap should not cause hypervisor state loss */
-   bgt cr3,.
-
-   /* Waking up from nap */
li  r0,PNV_THREAD_RUNNING
stb r0,PACA_THREAD_IDLE_STATE(r13)  /* Clear thread state */
 
@@ -143,13 +127,9 @@ BEGIN_FTR_SECTION
 
/* Return SRR1 from power7_nap() */
mfspr   r3,SPRN_SRR1
-   beq cr3,2f
-   b   power7_wakeup_noloss
-2: b   power7_wakeup_loss
-
-   /* Fast Sleep wakeup on PowerNV */
-8: GET_PACA(r13)
-   b   power7_wakeup_tb_loss
+   blt cr3,2f
+   b   power7_wakeup_loss
+2: b   power7_wakeup_noloss
 
 9:
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 705c867..d5def06 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -276,6 +276,39 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66);	\
 20:nop;
 
 
+/*
+ * Called from reset vector. Check whether we have woken up with
+ * hypervisor state loss. If yes, restore hypervisor state and return
+ * back to reset vector.
+ *
+ * r13 - Contents of HSPRG0
+ * cr3 - set to gt if waking up with partial/complete hypervisor state loss
+ */
+_GLOBAL(power7_restore_hyp_resource)
+   /*
+* Check if last bit of HSPGR0 is set. This indicates whether we are
+* waking up from winkle.
+*/
+   clrldi  r5,r13,63
+   clrrdi  r13,r13,1
+   cmpwi   cr4,r5,1
+   mtspr   SPRN_HSPRG0,r13
+
+   lbz r0,PACA_THREAD_IDLE_STATE(r13)
+   cmpwi   cr2,r0,PNV_THREAD_NAP
+   bgt cr2,power7_wakeup_tb_loss   /* Either sleep or Winkle */
+
+   /*
+* We fall through here if PACA_THREAD_IDLE_STATE shows we are waking
+* up from nap. At this stage CR3 shouldn't contain 'gt' since that
+* indicates we are waking with hypervisor state loss from nap.
+*/
+   bgt cr3,.
+
+   blr /* Return back to System Reset vector from where
+  power7_restore_hyp_resource was invoked */
+
+
 _GLOBAL(power7_wakeup_tb_loss)
ld  r2,PACATOC(r13);
ld  r1,PACAR1(r13)
@@ -284,11 +317,13 @@ _GLOBAL(power7_wakeup_tb_loss)
 * and they are restored before switching to the process context. Hence
 

[PATCH v6 04/11] powerpc/powernv: Rename reusable idle functions to hardware agnostic names

2016-06-08 Thread Shreyas B. Prabhu
Functions like power7_wakeup_loss, power7_wakeup_noloss,
power7_wakeup_tb_loss are used by POWER7 and POWER8 hardware. They can
also be used by POWER9. Hence rename these functions to hardware-agnostic
names.

Suggested-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes since v4

Changes in v4:
==
 - renaming power7_powersave_common to pnv_powersave_common
 - renaming power7_enter_nap_mode to pnv_enter_arch207_idle_mode

 arch/powerpc/kernel/exceptions-64s.S|  8 
 arch/powerpc/kernel/idle_power_common.S | 33 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  4 ++--
 3 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4a74d6a..2a123cd 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -108,7 +108,7 @@ BEGIN_FTR_SECTION
 
cmpwi   cr3,r13,2
GET_PACA(r13)
-   bl  power7_restore_hyp_resource
+   bl  pnv_restore_hyp_resource
 
li  r0,PNV_THREAD_RUNNING
stb r0,PACA_THREAD_IDLE_STATE(r13)  /* Clear thread state */
@@ -128,8 +128,8 @@ BEGIN_FTR_SECTION
/* Return SRR1 from power7_nap() */
mfspr   r3,SPRN_SRR1
blt cr3,2f
-   b   power7_wakeup_loss
-2: b   power7_wakeup_noloss
+   b   pnv_wakeup_loss
+2: b   pnv_wakeup_noloss
 
 9:
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
@@ -1269,7 +1269,7 @@ machine_check_handle_early:
GET_PACA(r13)
ld  r1,PACAR1(r13)
li  r3,PNV_THREAD_NAP
-   b   power7_enter_nap_mode
+   b   pnv_enter_arch207_idle_mode
 4:
 #endif
/*
diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S
index d5def06..34dbfc9 100644
--- a/arch/powerpc/kernel/idle_power_common.S
+++ b/arch/powerpc/kernel/idle_power_common.S
@@ -1,5 +1,6 @@
 /*
- *  This file contains the power_save function for Power7 CPUs.
+ *  This file contains idle entry/exit functions for POWER7 and
+ *  POWER8 CPUs.
  *
  *  This program is free software; you can redistribute it and/or
  *  modify it under the terms of the GNU General Public License
@@ -75,7 +76,7 @@ core_idle_lock_held:
  * 0 - don't check
  * 1 - check
  */
-_GLOBAL(power7_powersave_common)
+_GLOBAL(pnv_powersave_common)
/* Use r3 to pass state nap/sleep/winkle */
/* NAP is a state loss, we create a regs frame on the
 * stack, fill it up with the state we care about and
@@ -135,14 +136,14 @@ _GLOBAL(power7_powersave_common)
LOAD_REG_IMMEDIATE(r5, MSR_IDLE)
li  r6, MSR_RI
andcr6, r9, r6
-   LOAD_REG_ADDR(r7, power7_enter_nap_mode)
+   LOAD_REG_ADDR(r7, pnv_enter_arch207_idle_mode)
mtmsrd  r6, 1   /* clear RI before setting SRR0/1 */
mtspr   SPRN_SRR0, r7
mtspr   SPRN_SRR1, r5
rfid
 
-   .globl  power7_enter_nap_mode
-power7_enter_nap_mode:
+   .globl pnv_enter_arch207_idle_mode
+pnv_enter_arch207_idle_mode:
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/* Tell KVM we're napping */
li  r4,KVM_HWTHREAD_IN_NAP
@@ -242,19 +243,19 @@ _GLOBAL(power7_idle)
 _GLOBAL(power7_nap)
mr  r4,r3
li  r3,PNV_THREAD_NAP
-   b   power7_powersave_common
+   b   pnv_powersave_common
/* No return */
 
 _GLOBAL(power7_sleep)
li  r3,PNV_THREAD_SLEEP
li  r4,1
-   b   power7_powersave_common
+   b   pnv_powersave_common
/* No return */
 
 _GLOBAL(power7_winkle)
li  r3,PNV_THREAD_WINKLE
li  r4,1
-   b   power7_powersave_common
+   b   pnv_powersave_common
/* No return */
 
 #define CHECK_HMI_INTERRUPT\
@@ -284,7 +285,7 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66);	\
  * r13 - Contents of HSPRG0
  * cr3 - set to gt if waking up with partial/complete hypervisor state loss
  */
-_GLOBAL(power7_restore_hyp_resource)
+_GLOBAL(pnv_restore_hyp_resource)
/*
 * Check if last bit of HSPGR0 is set. This indicates whether we are
 * waking up from winkle.
@@ -296,7 +297,7 @@ _GLOBAL(power7_restore_hyp_resource)
 
lbz r0,PACA_THREAD_IDLE_STATE(r13)
cmpwi   cr2,r0,PNV_THREAD_NAP
-   bgt cr2,power7_wakeup_tb_loss   /* Either sleep or Winkle */
+   bgt cr2,pnv_wakeup_tb_loss  /* Either sleep or Winkle */
 
/*
 * We fall through here if PACA_THREAD_IDLE_STATE shows we are waking
@@ -306,10 +307,10 @@ _GLOBAL(power7_restore_hyp_resource)
bgt cr3,.
 
blr /* Return back to System Reset vector from where
-  power7_restore_hyp_resource was invoked */
+  pnv_restore_hyp_resource was invoked */
 
 
-_GLOBAL(power7_wake

[PATCH v6 03/11] powerpc/powernv: Rename idle_power7.S to idle_power_common.S

2016-06-08 Thread Shreyas B. Prabhu
idle_power7.S handles idle entry/exit for POWER7, POWER8 and, in the next
patch, POWER9. Rename the file to a non-hardware-specific name.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes since v3

Changes in v3:
==
 - Instead of moving few common functions from idle_power7.S to
   idle_power_common.S, renaming idle_power7.S to idle_power_common.S

 arch/powerpc/kernel/Makefile            |   2 +-
 arch/powerpc/kernel/idle_power7.S       | 527 --------------------------------
 arch/powerpc/kernel/idle_power_common.S | 527 ++++++++++++++++++++++++++++++++
 3 files changed, 528 insertions(+), 528 deletions(-)
 delete mode 100644 arch/powerpc/kernel/idle_power7.S
 create mode 100644 arch/powerpc/kernel/idle_power_common.S

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2da380f..99116da 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -47,7 +47,7 @@ obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o idle_book3e.o
 obj-$(CONFIG_PPC64)+= vdso64/
 obj-$(CONFIG_ALTIVEC)  += vecemu.o
 obj-$(CONFIG_PPC_970_NAP)  += idle_power4.o
-obj-$(CONFIG_PPC_P7_NAP)   += idle_power7.o
+obj-$(CONFIG_PPC_P7_NAP)   += idle_power_common.o
 procfs-y   := proc_powerpc.o
 obj-$(CONFIG_PROC_FS)  += $(procfs-y)
 rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI)  := rtas_pci.o
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
deleted file mode 100644
index d5def06..000
--- a/arch/powerpc/kernel/idle_power7.S
+++ /dev/null
@@ -1,527 +0,0 @@
-/*
- *  This file contains the power_save function for Power7 CPUs.
- *
- *  This program is free software; you can redistribute it and/or
- *  modify it under the terms of the GNU General Public License
- *  as published by the Free Software Foundation; either version
- *  2 of the License, or (at your option) any later version.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#undef DEBUG
-
-/*
- * Use unused space in the interrupt stack to save and restore
- * registers for winkle support.
- */
-#define _SDR1  GPR3
-#define _RPR   GPR4
-#define _SPURR GPR5
-#define _PURR  GPR6
-#define _TSCR  GPR7
-#define _DSCR  GPR8
-#define _AMOR  GPR9
-#define _WORT  GPR10
-#define _WORC  GPR11
-
-/* Idle state entry routines */
-
-#defineIDLE_STATE_ENTER_SEQ(IDLE_INST) \
-   /* Magic NAP/SLEEP/WINKLE mode enter sequence */\
-   std r0,0(r1);   \
-   ptesync;\
-   ld  r0,0(r1);   \
-1: cmp cr0,r0,r0;  \
-   bne 1b; \
-   IDLE_INST;  \
-   b   .
-
-   .text
-
-/*
- * Used by threads when the lock bit of core_idle_state is set.
- * Threads will spin in HMT_LOW until the lock bit is cleared.
- * r14 - pointer to core_idle_state
- * r15 - used to load contents of core_idle_state
- */
-
-core_idle_lock_held:
-   HMT_LOW
-3: lwz r15,0(r14)
-   andi.   r15,r15,PNV_CORE_IDLE_LOCK_BIT
-   bne 3b
-   HMT_MEDIUM
-   lwarx   r15,0,r14
-   blr
-
-/*
- * Pass requested state in r3:
- * r3 - PNV_THREAD_NAP/SLEEP/WINKLE
- *
- * To check IRQ_HAPPENED in r4
- * 0 - don't check
- * 1 - check
- */
-_GLOBAL(power7_powersave_common)
-   /* Use r3 to pass state nap/sleep/winkle */
-   /* NAP is a state loss, we create a regs frame on the
-* stack, fill it up with the state we care about and
-* stick a pointer to it in PACAR1. We really only
-* need to save PC, some CR bits and the NV GPRs,
-* but for now an interrupt frame will do.
-*/
-   mflrr0
-   std r0,16(r1)
-   stdur1,-INT_FRAME_SIZE(r1)
-   std r0,_LINK(r1)
-   std r0,_NIP(r1)
-
-   /* Hard disable interrupts */
-   mfmsr   r9
-   rldicl  r9,r9,48,1
-   rotldi  r9,r9,16
-   mtmsrd  r9,1/* hard-disable interrupts */
-
-   /* Check if something happened while soft-disabled */
-   lbz r0,PACAIRQHAPPENED(r13)
-   andi.   r0,r0,~PACA_IRQ_HARD_DIS@l
-   beq 1f
-   cmpwi   cr0,r4,0
-   beq 1f
-   addir1,r1,INT_FRAME_SIZE
-   ld  r0,16(r1)
-   li  r3,0/* Return 0 (no nap) */
-   mtlrr0
-   blr
-
-1: /* We mark irqs hard disabled as this is the state we'll
-* be in when returning and we need to tell arch_local_irq_restore()
-* about it
-*/
-   li  r0,PACA_IRQ_HARD_DIS
-   stb r0,PACAIRQHAPPENED(r13)
-
-   /* We haven't lost state ... yet */
-   

[PATCH v6 01/11] powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for winkle

2016-06-08 Thread Shreyas B. Prabhu
Signed-off-by: Shreyas B. Prabhu 
---
-No changes since v4

Changes in v4
=
- New in v4

 arch/powerpc/kernel/idle_power7.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 470ceeb..705c867 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -252,7 +252,7 @@ _GLOBAL(power7_sleep)
/* No return */
 
 _GLOBAL(power7_winkle)
-   li  r3,3
+   li  r3,PNV_THREAD_WINKLE
li  r4,1
b   power7_powersave_common
/* No return */
-- 
2.1.4


[PATCH v6 07/11] powerpc/powernv: set power_save func after the idle states are initialized

2016-06-08 Thread Shreyas B. Prabhu
pnv_init_idle_states() discovers the supported idle states from the
device tree and does the required initialization. Set the power_save
function pointer only after this initialization is done.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
- No changes since v1

 arch/powerpc/platforms/powernv/idle.c  | 3 +++
 arch/powerpc/platforms/powernv/setup.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index fcc8b68..fbb09fb 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -285,6 +285,9 @@ static int __init pnv_init_idle_states(void)
}
 
pnv_alloc_idle_core_states();
+
+   if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED)
+   ppc_md.power_save = power7_idle;
 out_free:
kfree(flags);
 out:
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index ee6430b..8492bbb 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -315,7 +315,7 @@ define_machine(powernv) {
.get_proc_freq  = pnv_get_proc_freq,
.progress   = pnv_progress,
.machine_shutdown   = pnv_shutdown,
-   .power_save = power7_idle,
+   .power_save = NULL,
.calibrate_decr = generic_calibrate_decr,
 #ifdef CONFIG_KEXEC
.kexec_cpu_down = pnv_kexec_cpu_down,
-- 
2.1.4


[PATCH v6 06/11] powerpc/powernv: abstraction for saving SPRs before entering deep idle states

2016-06-08 Thread Shreyas B. Prabhu
Create a function for saving SPRs before entering deep idle states.
This function can be reused for POWER9 deep idle states.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes since v3

Changes in v3:
=
 - Newly added in v3

 arch/powerpc/kernel/idle_power_common.S | 54 +++--
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S
index a8397e3..2f909a1 100644
--- a/arch/powerpc/kernel/idle_power_common.S
+++ b/arch/powerpc/kernel/idle_power_common.S
@@ -53,6 +53,36 @@
.text
 
 /*
+ * Used by threads before entering deep idle states. Saves SPRs
+ * in interrupt stack frame
+ */
+save_sprs_to_stack:
+   /*
+* Note all register i.e per-core, per-subcore or per-thread is saved
+* here since any thread in the core might wake up first
+*/
+   mfspr   r3,SPRN_SDR1
+   std r3,_SDR1(r1)
+   mfspr   r3,SPRN_RPR
+   std r3,_RPR(r1)
+   mfspr   r3,SPRN_SPURR
+   std r3,_SPURR(r1)
+   mfspr   r3,SPRN_PURR
+   std r3,_PURR(r1)
+   mfspr   r3,SPRN_TSCR
+   std r3,_TSCR(r1)
+   mfspr   r3,SPRN_DSCR
+   std r3,_DSCR(r1)
+   mfspr   r3,SPRN_AMOR
+   std r3,_AMOR(r1)
+   mfspr   r3,SPRN_WORT
+   std r3,_WORT(r1)
+   mfspr   r3,SPRN_WORC
+   std r3,_WORC(r1)
+
+   blr
+
+/*
  * Used by threads when the lock bit of core_idle_state is set.
  * Threads will spin in HMT_LOW until the lock bit is cleared.
  * r14 - pointer to core_idle_state
@@ -209,28 +239,8 @@ fastsleep_workaround_at_entry:
b   common_enter
 
 enter_winkle:
-   /*
-* Note all register i.e per-core, per-subcore or per-thread is saved
-* here since any thread in the core might wake up first
-*/
-   mfspr   r3,SPRN_SDR1
-   std r3,_SDR1(r1)
-   mfspr   r3,SPRN_RPR
-   std r3,_RPR(r1)
-   mfspr   r3,SPRN_SPURR
-   std r3,_SPURR(r1)
-   mfspr   r3,SPRN_PURR
-   std r3,_PURR(r1)
-   mfspr   r3,SPRN_TSCR
-   std r3,_TSCR(r1)
-   mfspr   r3,SPRN_DSCR
-   std r3,_DSCR(r1)
-   mfspr   r3,SPRN_AMOR
-   std r3,_AMOR(r1)
-   mfspr   r3,SPRN_WORT
-   std r3,_WORT(r1)
-   mfspr   r3,SPRN_WORC
-   std r3,_WORC(r1)
+   bl  save_sprs_to_stack
+
IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
 
 _GLOBAL(power7_idle)
-- 
2.1.4


[PATCH v6 08/11] powerpc/powernv: Add platform support for stop instruction

2016-06-08 Thread Shreyas B. Prabhu
POWER ISA v3 defines a new idle processor core mechanism. In summary,
 a) a new instruction named stop is added. This instruction replaces
instructions like nap, sleep, rvwinkle.
 b) a new per-thread SPR named Processor Stop Status and Control Register
(PSSCR) is added which controls the behavior of the stop instruction.

PSSCR layout:
----------------------------------------------------------
| PLS | /// | SD | ESL | EC | PSLL | /// | TR | MTL | RL |
----------------------------------------------------------
0      4    41   42    43   44     48    54   56    60

PSSCR key fields:
Bits 0:3  - Power-Saving Level Status. This field indicates the lowest
power-saving state the thread entered since stop instruction was last
executed.

Bit 42 - Enable State Loss
0 - No state is lost irrespective of other fields
1 - Allows state loss

Bits 44:47 - Power-Saving Level Limit
This limits the power-saving level that can be entered into.

Bits 60:63 - Requested Level
Used to specify which power-saving level must be entered on executing
stop instruction

This patch adds support for the stop instruction and PSSCR handling.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
Changes in v6
=
 - Save/restore new P9 SPRs when using deep idle states

Changes in v4:
==
 - Added PSSCR layout to commit message
 - Improved / Fixed comments
 - Fixed whitespace error in paca.h
 - Using MAX_POSSIBLE_STOP_STATE macro instead of hardcoding 0xF as 
   max possible stop state

Changes in v3:
==
 - Instead of introducing new file idle_power_stop.S, P9 idle support
   is added to idle_power_common.S using CPU_FTR sections.
 - Fixed r4 reg clobbering in power_stop0
 - Improved comments

Changes in v2:
==
 - Using CPU_FTR_ARCH_300 bit instead of CPU_FTR_STOP_INST

 arch/powerpc/include/asm/cpuidle.h|   2 +
 arch/powerpc/include/asm/kvm_book3s_asm.h |   2 +-
 arch/powerpc/include/asm/machdep.h|   1 +
 arch/powerpc/include/asm/opal-api.h   |  11 +-
 arch/powerpc/include/asm/paca.h   |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |   4 +
 arch/powerpc/include/asm/processor.h  |   1 +
 arch/powerpc/include/asm/reg.h|  14 +++
 arch/powerpc/kernel/asm-offsets.c |   2 +
 arch/powerpc/kernel/idle_power_common.S   | 175 +++---
 arch/powerpc/platforms/powernv/idle.c |  84 --
 11 files changed, 265 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index d2f99ca..3d7fc06 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -13,6 +13,8 @@
 #ifndef __ASSEMBLY__
 extern u32 pnv_fastsleep_workaround_at_entry[];
 extern u32 pnv_fastsleep_workaround_at_exit[];
+
+extern u64 pnv_first_deep_stop_state;
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 72b6225..d318d43 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -162,7 +162,7 @@ struct kvmppc_book3s_shadow_vcpu {
 
 /* Values for kvm_state */
 #define KVM_HWTHREAD_IN_KERNEL 0
-#define KVM_HWTHREAD_IN_NAP1
+#define KVM_HWTHREAD_IN_IDLE   1
 #define KVM_HWTHREAD_IN_KVM2
 
 #endif /* __ASM_KVM_BOOK3S_ASM_H__ */
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 6bdcd0d..ae3b155 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -262,6 +262,7 @@ struct machdep_calls {
 extern void e500_idle(void);
 extern void power4_idle(void);
 extern void power7_idle(void);
+extern void power_stop0(void);
 extern void ppc6xx_idle(void);
 extern void book3e_idle(void);
 
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..7f3f8c6 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -162,13 +162,20 @@
 
 /* Device tree flags */
 
-/* Flags set in power-mgmt nodes in device tree if
- * respective idle states are supported in the platform.
+/*
+ * Flags set in power-mgmt nodes in device tree describing
+ * idle states that are supported in the platform.
  */
+
+#define OPAL_PM_TIMEBASE_STOP  0x0002
+#define OPAL_PM_LOSE_HYP_CONTEXT   0x2000
+#define OPAL_PM_LOSE_FULL_CONTEXT  0x4000
 #define OPAL_PM_NAP_ENABLED0x0001
 #define OPAL_PM_SLEEP_ENABLED  0x0002
 #define OPAL_PM_WINKLE_ENABLED 0x0004
 #define OPAL_PM_SLEEP_ENABLED_ER1  0x0008 /* with workaround */
+#define OPAL_PM_STOP_INST_FAST 0x0010
+#define OPAL_PM_STOP_INST_DEEP 0x0020
 
 /*
  * OPAL_CONFIG_CPU_IDLE_STATE parameters
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 5

[PATCH v6 05/11] powerpc/powernv: Make pnv_powersave_common more generic

2016-06-08 Thread Shreyas B. Prabhu
pnv_powersave_common performs the common steps needed before entering an
idle state, eventually changes the MSR to MSR_IDLE and does an rfid to
pnv_enter_arch207_idle_mode.

Move the update of HSTATE_HWTHREAD_STATE to pnv_powersave_common
from pnv_enter_arch207_idle_mode and make it more generic by passing the rfid
address as a function parameter.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes since v4

Changes in v4:
==
 - Moved renaming of power7_powersave_common to earlier patch

Changes in v3:
==
 - Moved HSTATE_HWTHREAD_STATE updation to power_powersave_common

 arch/powerpc/kernel/idle_power_common.S | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S
index 34dbfc9..a8397e3 100644
--- a/arch/powerpc/kernel/idle_power_common.S
+++ b/arch/powerpc/kernel/idle_power_common.S
@@ -75,6 +75,8 @@ core_idle_lock_held:
  * To check IRQ_HAPPENED in r4
  * 0 - don't check
  * 1 - check
+ *
+ * Address to 'rfid' to in r5
  */
 _GLOBAL(pnv_powersave_common)
/* Use r3 to pass state nap/sleep/winkle */
@@ -127,28 +129,28 @@ _GLOBAL(pnv_powersave_common)
std r9,_MSR(r1)
std r1,PACAR1(r13)
 
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+   /* Tell KVM we're entering idle */
+   li  r4,KVM_HWTHREAD_IN_NAP
+   stb r4,HSTATE_HWTHREAD_STATE(r13)
+#endif
+
/*
 * Go to real mode to do the nap, as required by the architecture.
 * Also, we need to be in real mode before setting hwthread_state,
 * because as soon as we do that, another thread can switch
 * the MMU context to the guest.
 */
-   LOAD_REG_IMMEDIATE(r5, MSR_IDLE)
+   LOAD_REG_IMMEDIATE(r7, MSR_IDLE)
li  r6, MSR_RI
andcr6, r9, r6
-   LOAD_REG_ADDR(r7, pnv_enter_arch207_idle_mode)
mtmsrd  r6, 1   /* clear RI before setting SRR0/1 */
-   mtspr   SPRN_SRR0, r7
-   mtspr   SPRN_SRR1, r5
+   mtspr   SPRN_SRR0, r5
+   mtspr   SPRN_SRR1, r7
rfid
 
.globl pnv_enter_arch207_idle_mode
 pnv_enter_arch207_idle_mode:
-#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-   /* Tell KVM we're napping */
-   li  r4,KVM_HWTHREAD_IN_NAP
-   stb r4,HSTATE_HWTHREAD_STATE(r13)
-#endif
stb r3,PACA_THREAD_IDLE_STATE(r13)
cmpwi   cr3,r3,PNV_THREAD_SLEEP
bge cr3,2f
@@ -243,18 +245,21 @@ _GLOBAL(power7_idle)
 _GLOBAL(power7_nap)
mr  r4,r3
li  r3,PNV_THREAD_NAP
+   LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode)
b   pnv_powersave_common
/* No return */
 
 _GLOBAL(power7_sleep)
li  r3,PNV_THREAD_SLEEP
li  r4,1
+   LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode)
b   pnv_powersave_common
/* No return */
 
 _GLOBAL(power7_winkle)
li  r3,PNV_THREAD_WINKLE
li  r4,1
+   LOAD_REG_ADDR(r5, pnv_enter_arch207_idle_mode)
b   pnv_powersave_common
/* No return */
 
-- 
2.1.4


[PATCH v6 09/11] cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES

2016-06-08 Thread Shreyas B. Prabhu
Use cpuidle's CPUIDLE_STATE_MAX macro instead of the powernv-specific
MAX_POWERNV_IDLE_STATES.

Cc: Rafael J. Wysocki 
Cc: Daniel Lezcano 
Cc: linux...@vger.kernel.org
Suggested-by: Daniel Lezcano 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes after v5

Changes in v5
=
 - New in v5

 drivers/cpuidle/cpuidle-powernv.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index e12dc30..3a763a8 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -20,8 +20,6 @@
 #include 
 #include 
 
-#define MAX_POWERNV_IDLE_STATES 8
-
 struct cpuidle_driver powernv_idle_driver = {
.name = "powernv_idle",
	.owner = THIS_MODULE,
@@ -96,7 +94,7 @@ static int fastsleep_loop(struct cpuidle_device *dev,
 /*
  * States for dedicated partition case.
  */
-static struct cpuidle_state powernv_states[MAX_POWERNV_IDLE_STATES] = {
+static struct cpuidle_state powernv_states[CPUIDLE_STATE_MAX] = {
{ /* Snooze */
.name = "snooze",
.desc = "snooze",
-- 
2.1.4


[PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states

2016-06-08 Thread Shreyas B. Prabhu
POWER ISA v3 defines a new idle processor core mechanism. In summary,
 a) new instruction named stop is added.
 b) new per thread SPR named PSSCR is added which controls the behavior
of stop instruction.

Supported idle states and value to be written to PSSCR register to enter
any idle state is exposed via ibm,cpu-idle-state-names and
ibm,cpu-idle-state-psscr respectively. To enter an idle state,
platform provided power_stop() needs to be invoked with the appropriate
PSSCR value.

This patch adds support for this new mechanism in cpuidle powernv driver.

Cc: Rafael J. Wysocki 
Cc: Daniel Lezcano 
Cc: Rob Herring 
Cc: Lorenzo Pieralisi 
Cc: linux...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
Note: Documentation for the device tree bindings is posted here-
http://patchwork.ozlabs.org/patch/629125/

 - No changes in v6

Changes in v5
=============
 - Use generic cpuidle constant CPUIDLE_NAME_LEN
 - Fix return code handling for of_property_read_string_array
 - Use DT flags to determine if are using stop instruction, instead of
   cpu_has_feature
 - Removed unnecessary cast with names
 - &stop_loop -> stop_loop
 - Added POWERNV_THRESHOLD_LATENCY_NS to filter out idle states with high latency

 drivers/cpuidle/cpuidle-powernv.c | 71 ++-
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 3a763a8..c74a020 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+#define POWERNV_THRESHOLD_LATENCY_NS 20
+
 struct cpuidle_driver powernv_idle_driver = {
.name = "powernv_idle",
	.owner = THIS_MODULE,
@@ -27,6 +29,9 @@ struct cpuidle_driver powernv_idle_driver = {
 
 static int max_idle_state;
 static struct cpuidle_state *cpuidle_state_table;
+
+static u64 stop_psscr_table[CPUIDLE_STATE_MAX];
+
 static u64 snooze_timeout;
 static bool snooze_timeout_en;
 
@@ -91,6 +96,17 @@ static int fastsleep_loop(struct cpuidle_device *dev,
return index;
 }
 #endif
+
+static int stop_loop(struct cpuidle_device *dev,
+struct cpuidle_driver *drv,
+int index)
+{
+   ppc64_runlatch_off();
+   power_stop(stop_psscr_table[index]);
+   ppc64_runlatch_on();
+   return index;
+}
+
 /*
  * States for dedicated partition case.
  */
@@ -167,6 +183,8 @@ static int powernv_add_idle_states(void)
int nr_idle_states = 1; /* Snooze */
int dt_idle_states;
u32 *latency_ns, *residency_ns, *flags;
+   u64 *psscr_val = NULL;
+   const char *names[CPUIDLE_STATE_MAX];
int i, rc;
 
/* Currently we have snooze statically defined */
@@ -199,12 +217,41 @@ static int powernv_add_idle_states(void)
goto out_free_latency;
}
 
+   rc = of_property_read_string_array(power_mgt,
+  "ibm,cpu-idle-state-names", names,
+  dt_idle_states);
+   if (rc < 0) {
+   pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-names in DT\n");
+   goto out_free_latency;
+   }
+
+   /*
+* If the idle states use stop instruction, probe for psscr values
+* which are necessary to specify required stop level.
+*/
+   if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP)) {
+   psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val),
+   GFP_KERNEL);
+   rc = of_property_read_u64_array(power_mgt,
+   "ibm,cpu-idle-state-psscr",
+   psscr_val, dt_idle_states);
+   if (rc) {
+   pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
+   goto out_free_psscr;
+   }
+   }
	residency_ns = kzalloc(sizeof(*residency_ns) * dt_idle_states, GFP_KERNEL);
rc = of_property_read_u32_array(power_mgt,
			"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
 
for (i = 0; i < dt_idle_states; i++) {
-
+   /*
+* If an idle state has exit latency beyond
+* POWERNV_THRESHOLD_LATENCY_NS then don't use it
+* in cpu-idle.
+*/
+   if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
+   continue;
/*
 * Cpuidle accepts exit_latency and target_residency in us.
 * Use default target_residency values if f/w does not expose 
it.
@@ -216,6 +263,16 @@ static int powernv_add_idle_states(void)
powernv_states[nr_idle_states].flags = 

[PATCH v6 11/11] powerpc/powernv: Use deepest stop state when cpu is offlined

2016-06-08 Thread Shreyas B. Prabhu
If hardware supports stop state, use the deepest stop state when
the cpu is offlined.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Shreyas B. Prabhu 
---
 - No changes since v1

 arch/powerpc/platforms/powernv/idle.c| 15 +--
 arch/powerpc/platforms/powernv/powernv.h |  1 +
 arch/powerpc/platforms/powernv/smp.c |  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index bfbd359..b38cb33 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -242,6 +242,11 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
  */
 u64 pnv_first_deep_stop_state;
 
+/*
+ * Deepest stop idle state. Used when a cpu is offlined
+ */
+u64 pnv_deepest_stop_state;
+
 static int __init pnv_init_idle_states(void)
 {
struct device_node *power_mgt;
@@ -290,8 +295,11 @@ static int __init pnv_init_idle_states(void)
}
 
/*
-* Set pnv_first_deep_stop_state to the first stop level
-* to cause hypervisor state loss
+* Set pnv_first_deep_stop_state and pnv_deepest_stop_state.
+* pnv_first_deep_stop_state should be set to the first stop
+* level to cause hypervisor state loss.
+* pnv_deepest_stop_state should be set to the deepest
+* stop state.
 */
pnv_first_deep_stop_state = MAX_STOP_STATE;
for (i = 0; i < dt_idle_states; i++) {
@@ -300,6 +308,9 @@ static int __init pnv_init_idle_states(void)
if ((flags[i] & OPAL_PM_LOSE_FULL_CONTEXT) &&
 (pnv_first_deep_stop_state > psscr_rl))
pnv_first_deep_stop_state = psscr_rl;
+
+   if (pnv_deepest_stop_state < psscr_rl)
+   pnv_deepest_stop_state = psscr_rl;
}
}
 
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index 6dbc0a1..da7c843 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -18,6 +18,7 @@ static inline void pnv_pci_shutdown(void) { }
 #endif
 
 extern u32 pnv_get_supported_cpuidle_states(void);
+extern u64 pnv_deepest_stop_state;
 
 extern void pnv_lpc_init(void);
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index ad7b1a3..f69ceb6 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -182,7 +182,9 @@ static void pnv_smp_cpu_kill_self(void)
 
ppc64_runlatch_off();
 
-   if (idle_states & OPAL_PM_WINKLE_ENABLED)
+   if (cpu_has_feature(CPU_FTR_ARCH_300))
+   srr1 = power_stop(pnv_deepest_stop_state);
+   else if (idle_states & OPAL_PM_WINKLE_ENABLED)
srr1 = power7_winkle();
else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
(idle_states & OPAL_PM_SLEEP_ENABLED_ER1))
-- 
2.1.4


Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction

2016-06-08 Thread Shreyas B Prabhu
Hi Ben,

Sorry for the delayed response.

On 06/06/2016 03:58 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2016-06-02 at 07:38 -0500, Shreyas B. Prabhu wrote:
>> @@ -61,8 +72,13 @@ save_sprs_to_stack:
>>  * Note all register i.e per-core, per-subcore or per-thread is saved
>>  * here since any thread in the core might wake up first
>>  */
>> +BEGIN_FTR_SECTION
>> +   mfspr   r3,SPRN_PTCR
>> +   std r3,_PTCR(r1)
>> +FTR_SECTION_ELSE
>> mfspr   r3,SPRN_SDR1
>> std r3,_SDR1(r1)
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> 
> This is the only new SPR we care about in P9 ?
> 
After reviewing ISA again, I've identified LMRR, LMSER and ASDR also
need to be restored. I've fixed this in v6.

Thanks,
Shreyas



Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-08 Thread Naveen N. Rao
On 2016/06/07 03:56PM, Alexei Starovoitov wrote:
> On Tue, Jun 07, 2016 at 07:02:23PM +0530, Naveen N. Rao wrote:
> > PPC64 eBPF JIT compiler.
> > 
> > Enable with:
> > echo 1 > /proc/sys/net/core/bpf_jit_enable
> > or
> > echo 2 > /proc/sys/net/core/bpf_jit_enable
> > 
> > ... to see the generated JIT code. This can further be processed with
> > tools/net/bpf_jit_disasm.
> > 
> > With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> > test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
> > 
> > ... on both ppc64 BE and LE.
> 
> Nice. That's even better than on x64 which cannot jit one test:
> test_bpf: #262 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 168 PASS
> which was designed specifically to hit x64 jit pass limit.
> ppc jit has predictable number of passes and doesn't have this problem
> as expected. Great.

Yes, that's thanks to the clever handling of conditional branches by 
Matt -- we always emit 2 instructions for this reason (encoded in 
PPC_BCC() macro).

> 
> > The details of the approach are documented through various comments in
> > the code.
> > 
> > Cc: Matt Evans 
> > Cc: Denis Kirjanov 
> > Cc: Michael Ellerman 
> > Cc: Paul Mackerras 
> > Cc: Alexei Starovoitov 
> > Cc: Daniel Borkmann 
> > Cc: "David S. Miller" 
> > Cc: Ananth N Mavinakayanahalli 
> > Signed-off-by: Naveen N. Rao 
> > ---
> >  arch/powerpc/Kconfig  |   3 +-
> >  arch/powerpc/include/asm/asm-compat.h |   2 +
> >  arch/powerpc/include/asm/ppc-opcode.h |  20 +-
> >  arch/powerpc/net/Makefile |   4 +
> >  arch/powerpc/net/bpf_jit.h|  53 +-
> >  arch/powerpc/net/bpf_jit64.h  | 102 
> >  arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
> >  arch/powerpc/net/bpf_jit_comp64.c | 956 ++
> >  8 files changed, 1317 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/powerpc/net/bpf_jit64.h
> >  create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
> >  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c
> 
> don't see any issues with the code.
> Thank you for working on this.
> 
> Acked-by: Alexei Starovoitov 

Thanks, Alexei!


Regards,
Naveen


Re: [PATCH] drivers/net/fsl_ucc: Do not prefix header guard with CONFIG_

2016-06-08 Thread David Miller
From: Andreas Ziegler 
Date: Wed,  8 Jun 2016 11:40:28 +0200

> The CONFIG_ prefix should only be used for options which
> can be configured through Kconfig and not for guarding headers.
> 
> Signed-off-by: Andreas Ziegler 

Applied.

Re: [PATCH V2 1/7] dt-bindings: Update QorIQ TMU thermal bindings

2016-06-08 Thread Rob Herring
On Tue, Jun 07, 2016 at 11:27:34AM +0800, Jia Hongtao wrote:
> For different types of SoC the sensor id and endianness may vary.
> "#thermal-sensor-cells" is used to provide sensor id information.
> "little-endian" property is to tell the endianness of TMU.
> 
> Signed-off-by: Jia Hongtao 
> ---
> Changes for V2:
> * Remove formatting changes.
> 
>  Documentation/devicetree/bindings/thermal/qoriq-thermal.txt | 7 +++
>  1 file changed, 7 insertions(+)

Acked-by: Rob Herring 

Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer

2016-06-08 Thread Bjorn Helgaas
On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.

I'm not convinced yet that the extra resource pointer is necessary.

Microblaze does look up the resource in pci_mmap_page_range(), but it
never actually uses it.  It *looks* like it uses it, but that code is
actually dead and I think we should apply the first patch below.

That leaves powerpc as the only arch that would use this extra
resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
decide whether to make a normal uncacheable mapping or a write-
combining one.  There's nothing here that's specific to the powerpc
architecture, and I don't think we should add this parameter just to
cater to powerpc.

There are two cases where __pci_mmap_set_pgprot() on powerpc does
something based on the resource:

  1) We're using procfs to mmap I/O port space after we requested
 write-combining, e.g., we did this:

   ioctl(fd, PCIIOC_MMAP_IS_IO);   # request I/O port space
   ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining
   mmap(fd, ...)

 On powerpc, we ignore the write-combining request in this case.

 I think we can handle this case by applying the second patch
 below to ignore write-combining on I/O space for all arches, not
 just powerpc.

  2) We're using sysfs to mmap resourceN (not resourceN_wc), and
 the resource is prefetchable.  On powerpc, we turn *on*
 write-combining, even though the user didn't ask for it.

 I'm not sure this case is actually safe, because it changes the
 ordering properties.  If it *is* safe, we could enable write-
 combining in pci_mmap_resource(), where we already have the
 resource and it could be done for all arches.

 This case is not strictly necessary, except to avoid a
 performance regression, because the user could have mapped
 resourceN_wc to explicitly request write-combining.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d319a9c..5bbe20c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
>   pci_resource_to_user(pdev, i, res, &start, &end);
>   vma->vm_pgoff += start >> PAGE_SHIFT;
>   mmap_type = res->flags & IORESOURCE_MEM ? pci_mmap_mem : pci_mmap_io;
> - return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
> + return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine);
>  }
>  
>  static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj,
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..f19ee2a 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>   if (i >= PCI_ROM_RESOURCE)
>   return -ENODEV;
>  
> - ret = pci_mmap_page_range(dev, vma,
> + ret = pci_mmap_page_range(dev, &dev->resource[i], vma,
> fpriv->mmap_state,
> fpriv->write_combine);
>   if (ret < 0)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..3c1a0f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -70,6 +70,12 @@ enum pci_mmap_state {
>   pci_mmap_mem
>  };
>  
> +struct vm_area_struct;
> +/* Map a range of PCI memory or I/O space for a device into user space */
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> + struct vm_area_struct *vma,
> + enum pci_mmap_state mmap_state, int write_combine);
> +
>  /*
>   *  For PCI devices, the region numbers are assigned this way:
>   */


commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
Author: Bjorn Helgaas 
Date:   Wed Jun 8 14:00:14 2016 -0500

microblaze/PCI: Remove useless __pci_mmap_set_pgprot()

The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
where it computes either an uncacheable pgprot_t or a write-combining one.
But on microblaze, we always use the regular uncacheable pgprot_t.

Remove the useless code in __pci_mmap_set_pgprot() and inline the
pgprot_noncached() at the only caller.

Signed-off-by: Bjorn Helgaas 

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 14cba60..1974567 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *

Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction

2016-06-08 Thread Michael Neuling
On Wed, 2016-06-08 at 22:31 +0530, Shreyas B Prabhu wrote:
> Hi Ben,
> 
> Sorry for the delayed response.
> 
> On 06/06/2016 03:58 AM, Benjamin Herrenschmidt wrote:
> > 
> > On Thu, 2016-06-02 at 07:38 -0500, Shreyas B. Prabhu wrote:
> > > 
> > > @@ -61,8 +72,13 @@ save_sprs_to_stack:
> > >  * Note all register i.e per-core, per-subcore or per-thread
> > > is saved
> > >  * here since any thread in the core might wake up first
> > >  */
> > > +BEGIN_FTR_SECTION
> > > +   mfspr   r3,SPRN_PTCR
> > > +   std r3,_PTCR(r1)
> > > +FTR_SECTION_ELSE
> > > mfspr   r3,SPRN_SDR1
> > > std r3,_SDR1(r1)
> > > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> > This is the only new SPR we care about in P9 ?
> > 
> After reviewing ISA again, I've identified LMRR, LMSER and ASDR also
> need to be restored. I've fixed this in v6.

LMRR and LMSER are used the load monitored patch set.  There they will get
restored when we context switch back to userspace.  It probably doesn't
hurt that much but you don't need to restore them here. 

They are not used in the kernel.

It escapes me what ASDR is right now.

Mikey

Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer

2016-06-08 Thread Yinghai Lu
On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas  wrote:
>
> Microblaze does look up the resource in pci_mmap_page_range(), but it
> never actually uses it.  It *looks* like it uses it, but that code is
> actually dead and I think we should apply the first patch below.

Good one.

>
> That leaves powerpc as the only arch that would use this extra
> resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
> decide whether to make a normal uncacheable mapping or a write-
> combining one.  There's nothing here that's specific to the powerpc
> architecture, and I don't think we should add this parameter just to
> cater to powerpc.
>
> There are two cases where __pci_mmap_set_pgprot() on powerpc does
> something based on the resource:
>
>   1) We're using procfs to mmap I/O port space after we requested
>  write-combining, e.g., we did this:
>
>ioctl(fd, PCIIOC_MMAP_IS_IO);   # request I/O port space
>ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining
>mmap(fd, ...)
>
>  On powerpc, we ignore the write-combining request in this case.
>
>  I think we can handle this case by applying the second patch
>  below to ignore write-combining on I/O space for all arches, not
>  just powerpc.
>
>   2) We're using sysfs to mmap resourceN (not resourceN_wc), and
>  the resource is prefetchable.  On powerpc, we turn *on*
>  write-combining, even though the user didn't ask for it.
>
>  I'm not sure this case is actually safe, because it changes the
>  ordering properties.  If it *is* safe, we could enable write-
>  combining in pci_mmap_resource(), where we already have the
>  resource and it could be done for all arches.
>
>  This case is not strictly necessary, except to avoid a
>  performance regression, because the user could have mapped
>  resourceN_wc to explicitly request write-combining.
>

Agreed.

>
> commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
> Author: Bjorn Helgaas 
> Date:   Wed Jun 8 14:00:14 2016 -0500
>
> microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
>
> The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
> where it computes either an uncacheable pgprot_t or a write-combining one.
> But on microblaze, we always use the regular uncacheable pgprot_t.
>
> Remove the useless code in __pci_mmap_set_pgprot() and inline the
> pgprot_noncached() at the only caller.
>
> Signed-off-by: Bjorn Helgaas 
>
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 14cba60..1974567 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
>  }
>
>  /*
> - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
> - * device mapping.
> - */
> -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
> - pgprot_t protection,
> - enum pci_mmap_state mmap_state,
> - int write_combine)
> -{
> -   pgprot_t prot = protection;
> -
> -   /* Write combine is always 0 on non-memory space mappings. On
> -* memory space, if the user didn't pass 1, we check for a
> -* "prefetchable" resource. This is a bit hackish, but we use
> -* this to workaround the inability of /sysfs to provide a write
> -* combine bit
> -*/
> -   if (mmap_state != pci_mmap_mem)
> -   write_combine = 0;
> -   else if (write_combine == 0) {
> -   if (rp->flags & IORESOURCE_PREFETCH)
> -   write_combine = 1;
> -   }
> -
> -   return pgprot_noncached(prot);
> -}
> -
> -/*
>   * This one is used by /dev/mem and fbdev who have no clue about the
>   * PCI device, it tries to find the PCI device first and calls the
>   * above routine
> @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
> return -EINVAL;
>
> vma->vm_pgoff = offset >> PAGE_SHIFT;
> -   vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
> - vma->vm_page_prot,
> - mmap_state, write_combine);
> +   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>vma->vm_end - vma->vm_start, vma->vm_page_prot);
>

Acked-by: Yinghai Lu 

>
>
> commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
> Author: Bjorn Helgaas 
> Date:   Wed Jun 8 14:46:54 2016 -0500
>
> PCI: Ignore write-combining when mapping I/O port space
>
> PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
> support operations like this:
>
>   ioctl(fd, PCIIOC_MMAP_IS_IO)

[PATCH v3 3/7] powerpc: use the new LED disk activity trigger

2016-06-08 Thread Stephan Linz
- dts: rename 'ide-disk' to 'disk-activity'
- defconfig: rename 'ADB_PMU_LED_IDE' to 'ADB_PMU_LED_DISK'

Cc: Joseph Jezak 
Cc: Nico Macrionitis 
Cc: Jörg Sommer 
Signed-off-by: Stephan Linz 
---
 arch/powerpc/boot/dts/mpc8315erdb.dts |  2 +-
 arch/powerpc/boot/dts/mpc8377_rdb.dts |  2 +-
 arch/powerpc/boot/dts/mpc8378_rdb.dts |  2 +-
 arch/powerpc/boot/dts/mpc8379_rdb.dts |  2 +-
 arch/powerpc/configs/pmac32_defconfig |  2 +-
 arch/powerpc/configs/ppc6xx_defconfig |  2 +-
 drivers/macintosh/Kconfig | 13 ++---
 drivers/macintosh/via-pmu-led.c   |  4 ++--
 8 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/boot/dts/mpc8315erdb.dts b/arch/powerpc/boot/dts/mpc8315erdb.dts
index 4354684..ca5139e 100644
--- a/arch/powerpc/boot/dts/mpc8315erdb.dts
+++ b/arch/powerpc/boot/dts/mpc8315erdb.dts
@@ -472,7 +472,7 @@
 
hdd {
gpios = <&mcu_pio 1 0>;
-   linux,default-trigger = "ide-disk";
+   linux,default-trigger = "disk-activity";
};
};
 };
diff --git a/arch/powerpc/boot/dts/mpc8377_rdb.dts b/arch/powerpc/boot/dts/mpc8377_rdb.dts
index 2b4b653..e326139 100644
--- a/arch/powerpc/boot/dts/mpc8377_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8377_rdb.dts
@@ -496,7 +496,7 @@
 
hdd {
gpios = <&mcu_pio 1 0>;
-   linux,default-trigger = "ide-disk";
+   linux,default-trigger = "disk-activity";
};
};
 };
diff --git a/arch/powerpc/boot/dts/mpc8378_rdb.dts b/arch/powerpc/boot/dts/mpc8378_rdb.dts
index 74b6a53..71842fc 100644
--- a/arch/powerpc/boot/dts/mpc8378_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8378_rdb.dts
@@ -480,7 +480,7 @@
 
hdd {
gpios = <&mcu_pio 1 0>;
-   linux,default-trigger = "ide-disk";
+   linux,default-trigger = "disk-activity";
};
};
 };
diff --git a/arch/powerpc/boot/dts/mpc8379_rdb.dts b/arch/powerpc/boot/dts/mpc8379_rdb.dts
index 3b5cbac..e442a29 100644
--- a/arch/powerpc/boot/dts/mpc8379_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8379_rdb.dts
@@ -446,7 +446,7 @@
 
hdd {
gpios = <&mcu_pio 1 0>;
-   linux,default-trigger = "ide-disk";
+   linux,default-trigger = "disk-activity";
};
};
 };
diff --git a/arch/powerpc/configs/pmac32_defconfig b/arch/powerpc/configs/pmac32_defconfig
index ea8705f..3f6c9a6 100644
--- a/arch/powerpc/configs/pmac32_defconfig
+++ b/arch/powerpc/configs/pmac32_defconfig
@@ -158,7 +158,7 @@ CONFIG_ADB=y
 CONFIG_ADB_CUDA=y
 CONFIG_ADB_PMU=y
 CONFIG_ADB_PMU_LED=y
-CONFIG_ADB_PMU_LED_IDE=y
+CONFIG_ADB_PMU_LED_DISK=y
 CONFIG_PMAC_APM_EMU=m
 CONFIG_PMAC_MEDIABAY=y
 CONFIG_PMAC_BACKLIGHT=y
diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
index 99ccbeba..1dde0be 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -442,7 +442,7 @@ CONFIG_ADB=y
 CONFIG_ADB_CUDA=y
 CONFIG_ADB_PMU=y
 CONFIG_ADB_PMU_LED=y
-CONFIG_ADB_PMU_LED_IDE=y
+CONFIG_ADB_PMU_LED_DISK=y
 CONFIG_PMAC_APM_EMU=y
 CONFIG_PMAC_MEDIABAY=y
 CONFIG_PMAC_BACKLIGHT=y
diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
index 3e8b29e..d28690f 100644
--- a/drivers/macintosh/Kconfig
+++ b/drivers/macintosh/Kconfig
@@ -96,19 +96,18 @@ config ADB_PMU_LED
  Support the front LED on Power/iBooks as a generic LED that can
  be triggered by any of the supported triggers. To get the
  behaviour of the old CONFIG_BLK_DEV_IDE_PMAC_BLINK, select this
- and the ide-disk LED trigger and configure appropriately through
- sysfs.
+ and the disk LED trigger and configure appropriately through sysfs.
 
-config ADB_PMU_LED_IDE
-   bool "Use front LED as IDE LED by default"
+config ADB_PMU_LED_DISK
+   bool "Use front LED as DISK LED by default"
depends on ADB_PMU_LED
depends on LEDS_CLASS
depends on IDE_GD_ATA
select LEDS_TRIGGERS
-   select LEDS_TRIGGER_IDE_DISK
+   select LEDS_TRIGGER_DISK
help
- This option makes the front LED default to the IDE trigger
- so that it blinks on IDE activity.
+ This option makes the front LED default to the disk trigger
+ so that it blinks on disk activity.
 
 config PMAC_SMU
bool "Support for SMU  based PowerMacs"
diff --git a/drivers/macintosh/via-pmu-led.c b/drivers/macintosh/via-pmu-led.c
index 19c3718..ae067ab 100644
--- a/drivers/macintosh/via-pmu-led.c
+++ b/drivers/macintosh/via-pmu-led.c
@@ -73,8 +73,8 @@ static void pmu_led_set(struct led_classdev *led_cdev,
 
 static struct led_classdev pmu_led = {
.name = "pmu-led::front",
-#ifdef CONFIG_ADB_PMU_LED_IDE
-   .default_trigger = "ide-disk",
+#ifdef CONFIG_ADB_

Re: [RFC] Implementing HUGEPAGE on MPC 8xx

2016-06-08 Thread Dan Malek

Hello Christophe.

I’m surprised there is still any interest in this processor family :)

On Jun 8, 2016, at 12:03 AM, Christophe Leroy  wrote:

> MPC 8xx has several page sizes: 4k, 16k, 512k and 8M.
> Today, 4k and 16k sizes are implemented as normal page sizes and 8M is used 
> for mapping linear memory space in kernel.
> 
> I'd like to implement HUGE PAGE to reduce TLB misses from user apps.

My original plan was to implement the TLB miss handler in three lines of code.  
I haven’t investigated recently, but I know the amount of code has grown 
substantially :)

> In 4k mode, PAGE offset is 12 bits, PTE offset is 10 bits and PGD offset is 
> 10 bits
> In 16k mode, PAGE offset is 14 bits, PTE offset is 12 bits and PGD offset is 
> 6 bits

Since the 8xx systems typically have rather small real memory, I was 
considering a combination of 4k and 512k pages as an attempt to maximize real 
memory utilization.  The 4k pages in the PTE tables as today, and the 512k 
flagged in the PGD and just loaded from there.  I don’t know if 16k is a big 
enough win (unless it’s the “standard” page size to keep TLBmiss as simple as 
possible), or if 8M is terribly useful from user space.

> From your point of view, what would be the best approach to extend support of 
> HUGE PAGES to PPC_8xx ?
> Would the good starting point be to implement a hugepagetlb-8xx.c from 
> hugepagetlb-book3e.c ?

I guess that is the place to start.  When I first thought about this many years 
ago, I was hoping to map shared libraries and properly behaving programs.  The 
mechanism I considered to do this was either inspection of the section headers, 
using some section flags, or maybe the Aux Vector to set up mmap() to hugetlb at 
run-time.

Good Luck.

— Dan


Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer

2016-06-08 Thread Yinghai Lu
On Wed, Jun 8, 2016 at 3:35 PM, Yinghai Lu  wrote:

> At the same time, can you kill __pci_mmap_set_pgprot() for powerpc.

Can you please put your two patches and this attached one into to pci/next?

Then I could send updated PCI: Let pci_mmap_page_range() take resource address.

Thanks

Yinghai
From: Bjorn Helgaas 
Subject: [PATCH] powerpc/PCI: Remove __pci_mmap_set_pgprot()

  PCI: Ignore write-combining when mapping I/O port space
already handle the io port mmap path.

For the mmio mmap path, the caller should state explicitly whether
write_combine is really needed.

via proc path it should look like:
  mmap(fd, ...)   # default is I/O, non-combining
  ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining
  ioctl(fd, PCIIOC_MMAP_IS_MEM);  # request memory space
  mmap(fd, ...)

via the sysfs path, it should use resourceN_wc.

Signed-off-by: Bjorn Helgaas 

---
 arch/powerpc/kernel/pci-common.c |   37 -
 1 file changed, 4 insertions(+), 33 deletions(-)

Index: linux-2.6/arch/powerpc/kernel/pci-common.c
===
--- linux-2.6.orig/arch/powerpc/kernel/pci-common.c
+++ linux-2.6/arch/powerpc/kernel/pci-common.c
@@ -356,36 +356,6 @@ static struct resource *__pci_mmap_make_
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-  pgprot_t protection,
-  enum pci_mmap_state mmap_state,
-  int write_combine)
-{
-
-	/* Write combine is always 0 on non-memory space mappings. On
-	 * memory space, if the user didn't pass 1, we check for a
-	 * "prefetchable" resource. This is a bit hackish, but we use
-	 * this to workaround the inability of /sysfs to provide a write
-	 * combine bit
-	 */
-	if (mmap_state != pci_mmap_mem)
-		write_combine = 0;
-	else if (write_combine == 0) {
-		if (rp->flags & IORESOURCE_PREFETCH)
-			write_combine = 1;
-	}
-
-	/* XXX would be nice to have a way to ask for write-through */
-	if (write_combine)
-		return pgprot_noncached_wc(protection);
-	else
-		return pgprot_noncached(protection);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *
 		return -EINVAL;
 
 	vma->vm_pgoff = offset >> PAGE_SHIFT;
-	vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-		  vma->vm_page_prot,
-		  mmap_state, write_combine);
+	if (write_combine)
+		vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
+	else
+		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			   vma->vm_end - vma->vm_start, vma->vm_page_prot);
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/nohash: Fix build break with 4K pages

2016-06-08 Thread Michael Ellerman
On Wed, 2016-06-08 at 20:19 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman  writes:
> 
> > Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are
> > allocating fragments" renamed page_table_free() to pte_fragment_free().
> > One occurrence was mistyped as pte_fragment_fre().
> > 
> > This only breaks the nohash 4K page build, which is not the default or
> > enabled in any defconfig.
> 
> Can you share the .config. I will add it to the build test.

It was a randconfig, it still doesn't build even with this patch:

  http://kisskb.ellerman.id.au/kisskb/buildresult/12705111/

cheers


Re: [RFC] Implementing HUGEPAGE on MPC 8xx

2016-06-08 Thread Scott Wood
On Wed, 2016-06-08 at 09:03 +0200, Christophe Leroy wrote:
> I see in the current ppc kernel that for PPC32, SYS_SUPPORTS_HUGETLBFS
> is selected only if we have PHYS_64BIT.
> What is the reason for only implementing HUGETLBFS with 64 bits phys 
> addresses ?

That's not for PPC32 in general -- it's for 32-bit FSL Book E.  The reason for
the limitation is that there are separate TLB miss handlers depending on
whether PHYS_64BIT is enabled, and we didn't want to have to implement hugetlb
support in both of them unless there was actual demand for it.

>  From your point of view, what would be the best approach to extend 
> support of HUGE PAGES to PPC_8xx ?
> Would the good starting point be to implement a hugepagetlb-8xx.c from 
> hugepagetlb-book3e.c ?

Yes.

-Scott


Re: [PATCH 1/5] selftests/powerpc: Check for VSX preservation across userspace preemption

2016-06-08 Thread Daniel Axtens
Yay for tests!

I have a few minor nits, and one more major one (rc == 2 below).

> +/*
> + * Copyright 2015, Cyril Bur, IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
I realise this is well past a lost cause by now, but isn't the idea to
be version 2, not version 2 or later?

> +
> +#include "../basic_asm.h"
> +#include "../vsx_asm.h"
> +

Some of your other functions start with a comment. That would be super
helpful here - I'm still not super comfortable I understand the calling
convention. 
> +FUNC_START(check_vsx)
> + PUSH_BASIC_STACK(32)
> + std r3,STACK_FRAME_PARAM(0)(sp)
> + addi r3, r3, 16 * 12 #Second half of array
> + bl store_vsx
> + ld r3,STACK_FRAME_PARAM(0)(sp)
> + bl vsx_memcmp
> + POP_BASIC_STACK(32)
> + blr
> +FUNC_END(check_vsx)
> +



> +long vsx_memcmp(vector int *a) {
> + vector int zero = {0,0,0,0};
> + int i;
> +
> + FAIL_IF(a != varray);
> +
> + for(i = 0; i < 12; i++) {
> + if (memcmp(&a[i + 12], &zero, 16) == 0) {
> + fprintf(stderr, "Detected zero from the VSX reg %d\n", i + 12);
> + return 1;
> + }
> + }
> +
> + if (memcmp(a, &a[12], 12 * 16)) {
I'm somewhat confused as to how this comparison works. You're comparing
the new saved ones to the old saved ones, yes?
> + long *p = (long *)a;
> + fprintf(stderr, "VSX mismatch\n");
> + for (i = 0; i < 24; i=i+2)
> + fprintf(stderr, "%d: 0x%08lx%08lx | 0x%08lx%08lx\n",
> + i/2 + i%2 + 20, p[i], p[i + 1], p[i + 24], p[i + 25]);
> + return 1;
> + }
> + return 0;
> +}
> +
> +void *preempt_vsx_c(void *p)
> +{
> + int i, j;
> + long rc;
> + srand(pthread_self());
> + for (i = 0; i < 12; i++)
> + for (j = 0; j < 4; j++) {
> + varray[i][j] = rand();
> + /* Don't want zero because it hides kernel problems */
> + if (varray[i][j] == 0)
> + j--;
> + }
> + rc = preempt_vsx(varray, &threads_starting, &running);
> + if (rc == 2)
How would rc == 2? AIUI, preempt_vsx returns the value of check_vsx,
which in turn returns the value of vsx_memcmp, which returns 1 or 0.

> + fprintf(stderr, "Caught zeros in VSX compares\n");
Isn't it zeros or a mismatched value?
> + return (void *)rc;
> +}
> +
> +int test_preempt_vsx(void)
> +{
> + int i, rc, threads;
> + pthread_t *tids;
> +
> + threads = sysconf(_SC_NPROCESSORS_ONLN) * THREAD_FACTOR;
> + tids = malloc(threads * sizeof(pthread_t));
> + FAIL_IF(!tids);
> +
> + running = true;
> + threads_starting = threads;
> + for (i = 0; i < threads; i++) {
> + rc = pthread_create(&tids[i], NULL, preempt_vsx_c, NULL);
> + FAIL_IF(rc);
> + }
> +
> + setbuf(stdout, NULL);
> + /* Not really necessary but nice to wait for every thread to start */
> + printf("\tWaiting for %d workers to start...", threads_starting);
> + while(threads_starting)
> + asm volatile("": : :"memory");
I think __sync_synchronise() might be ... more idiomatic or something?
Not super fussy.

> + printf("done\n");
> +
> + printf("\tWaiting for %d seconds to let some workers get preempted...", PREEMPT_TIME);
> + sleep(PREEMPT_TIME);
> + printf("done\n");
> +
> + printf("\tStopping workers...");
> + /*
> +  * Workers are checking this value every loop. In preempt_vsx 'cmpwi
> r5,0; bne 2b'.
> +  * r5 will have loaded the value of running.
> +  */
> + running = 0;
Do you need some sort of synchronisation here? You're assuming it
eventually gets to the threads, which is of course true, but maybe it
would be a good idea to synchronise it more explicitly? Again, not super
fussy.
> + for (i = 0; i < threads; i++) {
> + void *rc_p;
> + pthread_join(tids[i], &rc_p);
> +
> + /*
> +  * Harness will say the fail was here, look at why preempt_vsx
> +  * returned
> +  */
> + if ((long) rc_p)
> + printf("oops\n");
> + FAIL_IF((long) rc_p);
> + }
> + printf("done\n");
> +
> + return 0;
> +}
> +
Regards,
Daniel



[PATCH v7 0/3] POWER9 Load Monitor Support

2016-06-08 Thread Michael Neuling
This patches series adds support for the POWER9 Load Monitor
instruction (ldmx) based on work from Jack Miller.

The first patch is a clean up of the FSCR handling. The second patch
adds the actual ldmx support to the kernel. The third patch is a
couple of ldmx selftests.

v7:
  - Suggestions from the "prestigious" mpe.
  - PATCH 1/3:
- Use current->thread.fscr rather than what the hardware gives us.
  - PATCH 2/3:
- Use current->thread.fscr rather than what the hardware gives us.
  - PATCH 3/3:
- no change.

v6:
  - PATCH 1/3:
- Suggestions from mpe.
- Init the FSCR using existing INIT_THREAD macro rather than
  init_fscr() function.
- Set fscr when taking DSCR exception in
  facility_unavailable_exception().
  - PATCH 2/3:
- Remove erroneous semicolons in restore_sprs().
  - PATCH 3/3:
- no change.

v5:
  - PATCH 1/3:
- Made the FSCR cleanup more extensive.
  - PATCH 2/3:
- Moves FSCR_LM clearing to new init_fscr().
  - PATCH 3/3:
- Added test cases to .gitignore.
- Removed test against PPC_FEATURE2_EBB since it's not needed.
- Added parenthesis on input parameter usage for LDMX() macro.

Jack Miller (2):
  powerpc: Load Monitor Register Support
  powerpc: Load Monitor Register Tests

Michael Neuling (1):
  powerpc: Improve FSCR init and context switching

[PATCH v7 1/3] powerpc: Improve FSCR init and context switching

2016-06-08 Thread Michael Neuling
This fixes a few issues with FSCR init and switching.

In this patch:
powerpc: Create context switch helpers save_sprs() and restore_sprs()
Author: Anton Blanchard 
commit 152d523e6307c7152f9986a542f873b5c5863937
We moved the setting of the FSCR register from inside an
CPU_FTR_ARCH_207S section to inside just a CPU_FTR_ARCH_DSCR section.
Hence we are setting FSCR on POWER6/7 where the FSCR doesn't
exist. This is harmless but we shouldn't do it.

Also, we can simplify the FSCR context switch. We don't need to go
through the calculation involving dscr_inherit. We can just restore
what we saved last time.

Also, we currently don't explicitly init the FSCR for userspace
applications. Currently we init FSCR on boot in __init_fscr: and then
the first task inherits based on that. This works but is
delicate. This adds the initial fscr value to INIT_THREAD to
explicitly set the FSCR for userspace applications and removes
__init_fscr: boot time init.

Based on patch by Jack Miller.

Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h  |  1 +
 arch/powerpc/kernel/cpu_setup_power.S | 10 --
 arch/powerpc/kernel/process.c | 12 
 arch/powerpc/kernel/traps.c   |  3 ++-
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 009fab1..1833fe9 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -347,6 +347,7 @@ struct thread_struct {
.fs = KERNEL_DS, \
.fpexc_mode = 0, \
.ppr = INIT_PPR, \
+   .fscr = FSCR_TAR | FSCR_EBB \
 }
 #endif
 
diff --git a/arch/powerpc/kernel/cpu_setup_power.S 
b/arch/powerpc/kernel/cpu_setup_power.S
index 584e119..75f98c8 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -49,7 +49,6 @@ _GLOBAL(__restore_cpu_power7)
 
 _GLOBAL(__setup_cpu_power8)
mflrr11
-   bl  __init_FSCR
bl  __init_PMU
bl  __init_hvmode_206
mtlrr11
@@ -67,7 +66,6 @@ _GLOBAL(__setup_cpu_power8)
 
 _GLOBAL(__restore_cpu_power8)
mflrr11
-   bl  __init_FSCR
bl  __init_PMU
mfmsr   r3
rldicl. r0,r3,4,63
@@ -86,7 +84,6 @@ _GLOBAL(__restore_cpu_power8)
 
 _GLOBAL(__setup_cpu_power9)
mflrr11
-   bl  __init_FSCR
bl  __init_hvmode_206
mtlrr11
beqlr
@@ -102,7 +99,6 @@ _GLOBAL(__setup_cpu_power9)
 
 _GLOBAL(__restore_cpu_power9)
mflrr11
-   bl  __init_FSCR
mfmsr   r3
rldicl. r0,r3,4,63
mtlrr11
@@ -155,12 +151,6 @@ __init_LPCR:
isync
blr
 
-__init_FSCR:
-   mfspr   r3,SPRN_FSCR
-   ori r3,r3,FSCR_TAR|FSCR_DSCR|FSCR_EBB
-   mtspr   SPRN_FSCR,r3
-   blr
-
 __init_HFSCR:
mfspr   r3,SPRN_HFSCR
ori r3,r3,HFSCR_TAR|HFSCR_TM|HFSCR_BHRB|HFSCR_PM|\
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index e2f12cb..74ea8db 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1023,18 +1023,11 @@ static inline void restore_sprs(struct thread_struct 
*old_thread,
 #ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_DSCR)) {
u64 dscr = get_paca()->dscr_default;
-   u64 fscr = old_thread->fscr & ~FSCR_DSCR;
-
-   if (new_thread->dscr_inherit) {
+   if (new_thread->dscr_inherit)
dscr = new_thread->dscr;
-   fscr |= FSCR_DSCR;
-   }
 
if (old_thread->dscr != dscr)
mtspr(SPRN_DSCR, dscr);
-
-   if (old_thread->fscr != fscr)
-   mtspr(SPRN_FSCR, fscr);
}
 
if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
@@ -1045,6 +1038,9 @@ static inline void restore_sprs(struct thread_struct 
*old_thread,
if (old_thread->ebbrr != new_thread->ebbrr)
mtspr(SPRN_EBBRR, new_thread->ebbrr);
 
+   if (old_thread->fscr != new_thread->fscr)
+   mtspr(SPRN_FSCR, new_thread->fscr);
+
if (old_thread->tar != new_thread->tar)
mtspr(SPRN_TAR, new_thread->tar);
}
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 9229ba6..667cf78 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1418,7 +1418,8 @@ void facility_unavailable_exception(struct pt_regs *regs)
rd = (instword >> 21) & 0x1f;
current->thread.dscr = regs->gpr[rd];
current->thread.dscr_inherit = 1;
-   mtspr(SPRN_FSCR, value | FSCR_DSCR);
+   current->thread.fscr |= FSCR_DSCR;
+   mtspr(SPRN_FSCR, current->thread.fscr);
}
 
  

[PATCH v7 2/3] powerpc: Load Monitor Register Support

2016-06-08 Thread Michael Neuling
From: Jack Miller 

This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.

This patch disables the FSCR LM control bit on task init and enables
that bit when a load monitor facility unavailable exception is taken
for using it. On context switch, this bit is then used to determine
whether the two relevant registers are saved and restored. This is
done lazily for performance reasons.

Signed-off-by: Jack Miller 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h |  2 ++
 arch/powerpc/include/asm/reg.h   |  5 +
 arch/powerpc/kernel/process.c| 18 ++
 arch/powerpc/kernel/traps.c  |  9 +
 4 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 1833fe9..ac7670d 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -314,6 +314,8 @@ struct thread_struct {
unsigned long   mmcr2;
unsignedmmcr0;
unsignedused_ebb;
+   unsigned long   lmrr;
+   unsigned long   lmser;
 #endif
 };
 
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f4..ce44fe2 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -282,6 +282,8 @@
 #define SPRN_HRMOR 0x139   /* Real mode offset register */
 #define SPRN_HSRR0 0x13A   /* Hypervisor Save/Restore 0 */
 #define SPRN_HSRR1 0x13B   /* Hypervisor Save/Restore 1 */
+#define SPRN_LMRR  0x32D   /* Load Monitor Region Register */
+#define SPRN_LMSER 0x32E   /* Load Monitor Section Enable Register */
 #define SPRN_IC0x350   /* Virtual Instruction Count */
 #define SPRN_VTB   0x351   /* Virtual Time Base */
 #define SPRN_LDBAR 0x352   /* LD Base Address Register */
@@ -291,6 +293,7 @@
 #define SPRN_PMCR  0x374   /* Power Management Control Register */
 
 /* HFSCR and FSCR bit numbers are the same */
+#define FSCR_LM_LG 11  /* Enable Load Monitor Registers */
 #define FSCR_TAR_LG8   /* Enable Target Address Register */
 #define FSCR_EBB_LG7   /* Enable Event Based Branching */
 #define FSCR_TM_LG 5   /* Enable Transactional Memory */
@@ -300,10 +303,12 @@
 #define FSCR_VECVSX_LG 1   /* Enable VMX/VSX  */
 #define FSCR_FP_LG 0   /* Enable Floating Point */
 #define SPRN_FSCR  0x099   /* Facility Status & Control Register */
+#define   FSCR_LM  __MASK(FSCR_LM_LG)
 #define   FSCR_TAR __MASK(FSCR_TAR_LG)
 #define   FSCR_EBB __MASK(FSCR_EBB_LG)
 #define   FSCR_DSCR__MASK(FSCR_DSCR_LG)
 #define SPRN_HFSCR 0xbe/* HV=1 Facility Status & Control Register */
+#define   HFSCR_LM __MASK(FSCR_LM_LG)
 #define   HFSCR_TAR__MASK(FSCR_TAR_LG)
 #define   HFSCR_EBB__MASK(FSCR_EBB_LG)
 #define   HFSCR_TM __MASK(FSCR_TM_LG)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 74ea8db..2e22f60 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1009,6 +1009,14 @@ static inline void save_sprs(struct thread_struct *t)
 */
t->tar = mfspr(SPRN_TAR);
}
+
+   if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+   /* Conditionally save Load Monitor registers, if enabled */
+   if (t->fscr & FSCR_LM) {
+   t->lmrr = mfspr(SPRN_LMRR);
+   t->lmser = mfspr(SPRN_LMSER);
+   }
+   }
 #endif
 }
 
@@ -1044,6 +1052,16 @@ static inline void restore_sprs(struct thread_struct 
*old_thread,
if (old_thread->tar != new_thread->tar)
mtspr(SPRN_TAR, new_thread->tar);
}
+
+   if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+   /* Conditionally restore Load Monitor registers, if enabled */
+   if (new_thread->fscr & FSCR_LM) {
+   if (old_thread->lmrr != new_thread->lmrr)
+   mtspr(SPRN_LMRR, new_thread->lmrr);
+   if (old_thread->lmser != new_thread->lmser)
+   mtspr(SPRN_LMSER, new_thread->lmser);
+   }
+   }
 #endif
 }
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 667cf78..b2e434b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1376,6 +1376,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
[FSCR_TM_LG] = "TM",
[FSCR_EBB_LG] = "EBB",
[FSCR_TAR_LG] = "TAR",
+   [FSCR_LM_LG] = "LM",
};
char *facility = "unknown";
u64 value;
@@ -1433,6 +1434,14 @@ void facility_unavailable_exception(struct pt_regs *regs)
emulate_single_s

[PATCH v7 3/3] powerpc: Load Monitor Register Tests

2016-06-08 Thread Michael Neuling
From: Jack Miller 

Adds two tests. One is a simple test to ensure that the new registers
LMRR and LMSER are properly maintained. The other actually uses the
existing EBB test infrastructure to test that LMRR and LMSER behave as
documented.

Signed-off-by: Jack Miller 
Signed-off-by: Michael Neuling 
---
 tools/testing/selftests/powerpc/pmu/ebb/.gitignore |   2 +
 tools/testing/selftests/powerpc/pmu/ebb/Makefile   |   2 +-
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c  | 143 +
 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h  |  39 ++
 .../selftests/powerpc/pmu/ebb/ebb_lmr_regs.c   |  37 ++
 tools/testing/selftests/powerpc/reg.h  |   5 +
 6 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.h
 create mode 100644 tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr_regs.c

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore 
b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
index 42bddbe..44b7df1 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
+++ b/tools/testing/selftests/powerpc/pmu/ebb/.gitignore
@@ -20,3 +20,5 @@ back_to_back_ebbs_test
 lost_exception_test
 no_handler_test
 cycles_with_mmcr2_test
+ebb_lmr
+ebb_lmr_regs
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile 
b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index 8d2279c4..6b0453e 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -14,7 +14,7 @@ TEST_PROGS := reg_access_test event_attributes_test 
cycles_test   \
 fork_cleanup_test ebb_on_child_test\
 ebb_on_willing_child_test back_to_back_ebbs_test   \
 lost_exception_test no_handler_test\
-cycles_with_mmcr2_test
+cycles_with_mmcr2_test ebb_lmr ebb_lmr_regs
 
 all: $(TEST_PROGS)
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c 
b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
new file mode 100644
index 000..c47ebd5
--- /dev/null
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb_lmr.c
@@ -0,0 +1,143 @@
+/*
+ * Copyright 2016, Jack Miller, IBM Corp.
+ * Licensed under GPLv2.
+ */
+
+#include 
+#include 
+
+#include "ebb.h"
+#include "ebb_lmr.h"
+
+#define SIZE   (32 * 1024 * 1024)  /* 32M */
+#define LM_SIZE0   /* Smallest encoding, 32M */
+
+#define SECTIONS   64  /* 1 per bit in LMSER */
+#define SECTION_SIZE   (SIZE / SECTIONS)
+#define SECTION_LONGS   (SECTION_SIZE / sizeof(long))
+
+static unsigned long *test_mem;
+
+static int lmr_count = 0;
+
+void ebb_lmr_handler(void)
+{
+   lmr_count++;
+}
+
+void ldmx_full_section(unsigned long *mem, int section)
+{
+   unsigned long *ptr;
+   int i;
+
+   for (i = 0; i < SECTION_LONGS; i++) {
+   ptr = &mem[(SECTION_LONGS * section) + i];
+   ldmx((unsigned long) &ptr);
+   ebb_lmr_reset();
+   }
+}
+
+unsigned long section_masks[] = {
+   0x8000,
+   0xFF00,
+   0x000F7000,
+   0x8001,
+   0xF0F0F0F0F0F0F0F0,
+   0x0F0F0F0F0F0F0F0F,
+   0x0
+};
+
+int ebb_lmr_section_test(unsigned long *mem)
+{
+   unsigned long *mask = section_masks;
+   int i;
+
+   for (; *mask; mask++) {
+   mtspr(SPRN_LMSER, *mask);
+   printf("Testing mask 0x%016lx\n", mfspr(SPRN_LMSER));
+
+   for (i = 0; i < 64; i++) {
+   lmr_count = 0;
+   ldmx_full_section(mem, i);
+   if (*mask & (1UL << (63 - i)))
+   FAIL_IF(lmr_count != SECTION_LONGS);
+   else
+   FAIL_IF(lmr_count);
+   }
+   }
+
+   return 0;
+}
+
+int ebb_lmr(void)
+{
+   int i;
+
+   SKIP_IF(!lmr_is_supported());
+
+   setup_ebb_handler(ebb_lmr_handler);
+
+   ebb_global_enable();
+
+   FAIL_IF(posix_memalign((void **)&test_mem, SIZE, SIZE) != 0);
+
+   mtspr(SPRN_LMSER, 0);
+
+   FAIL_IF(mfspr(SPRN_LMSER) != 0);
+
+   mtspr(SPRN_LMRR, ((unsigned long)test_mem | LM_SIZE));
+
+   FAIL_IF(mfspr(SPRN_LMRR) != ((unsigned long)test_mem | LM_SIZE));
+
+   /* Read every single byte to ensure we get no false positives */
+   for (i = 0; i < SECTIONS; i++)
+   ldmx_full_section(test_mem, i);
+
+   FAIL_IF(lmr_count != 0);
+
+   /* Turn on the first section */
+
+   mtspr(SPRN_LMSER, (1UL << 63));
+   FAIL_IF(mfspr(SPRN_LMSER) != (1UL << 63));
+
+   /* Enable LM (BESCR) */
+
+   mtspr(SPRN_BESCR, mfspr(SPRN_BESCR) | BESCR_LME);
+   FAIL_IF(!(mfspr(SPRN_BESCR) & BESCR_LME));
+
+   ldmx((unsigned long)&test_mem);
+
+   FAIL_IF(lmr_count != 1);

Re: [PATCH 1/5] selftests/powerpc: Check for VSX preservation across userspace preemption

2016-06-08 Thread Michael Ellerman
On Thu, 2016-06-09 at 11:35 +1000, Daniel Axtens wrote:
> > +/*
> > + * Copyright 2015, Cyril Bur, IBM Corp.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> I realise this is well past a lost cause by now, but isn't the idea to
> be version 2, not version 2 or later?

No.

I asked the powers that be and apparently for new code we're supposed to use v2
or later.

cheers


Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-08 Thread Nilay Vaish
Naveen, can you point out where in the patch you update the variable:
idx, a member of the codegen_context structure?  Somehow I am unable to
figure it out.  I can only see that we set it to 0 in the
bpf_int_jit_compile function.  Since all your test cases pass, I am
clearly overlooking something.

Thanks
Nilay

On 7 June 2016 at 08:32, Naveen N. Rao  wrote:
> PPC64 eBPF JIT compiler.
>
> Enable with:
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> or
> echo 2 > /proc/sys/net/core/bpf_jit_enable
>
> ... to see the generated JIT code. This can further be processed with
> tools/net/bpf_jit_disasm.
>
> With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
>
> ... on both ppc64 BE and LE.
>
> The details of the approach are documented through various comments in
> the code.
>
> Cc: Matt Evans 
> Cc: Denis Kirjanov 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Cc: "David S. Miller" 
> Cc: Ananth N Mavinakayanahalli 
> Signed-off-by: Naveen N. Rao 
> ---
>  arch/powerpc/Kconfig  |   3 +-
>  arch/powerpc/include/asm/asm-compat.h |   2 +
>  arch/powerpc/include/asm/ppc-opcode.h |  20 +-
>  arch/powerpc/net/Makefile |   4 +
>  arch/powerpc/net/bpf_jit.h|  53 +-
>  arch/powerpc/net/bpf_jit64.h  | 102 
>  arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
>  arch/powerpc/net/bpf_jit_comp64.c | 956 
> ++
>  8 files changed, 1317 insertions(+), 3 deletions(-)
>  create mode 100644 arch/powerpc/net/bpf_jit64.h
>  create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
>  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 01f7464..ee82f9a 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -128,7 +128,8 @@ config PPC
> select IRQ_FORCED_THREADING
> select HAVE_RCU_TABLE_FREE if SMP
> select HAVE_SYSCALL_TRACEPOINTS
> -   select HAVE_CBPF_JIT
> +   select HAVE_CBPF_JIT if !PPC64
> +   select HAVE_EBPF_JIT if PPC64
> select HAVE_ARCH_JUMP_LABEL
> select ARCH_HAVE_NMI_SAFE_CMPXCHG
> select ARCH_HAS_GCOV_PROFILE_ALL
> diff --git a/arch/powerpc/include/asm/asm-compat.h 
> b/arch/powerpc/include/asm/asm-compat.h
> index dc85dcb..cee3aa0 100644
> --- a/arch/powerpc/include/asm/asm-compat.h
> +++ b/arch/powerpc/include/asm/asm-compat.h
> @@ -36,11 +36,13 @@
>  #define PPC_MIN_STKFRM 112
>
>  #ifdef __BIG_ENDIAN__
> +#define LHZX_BEstringify_in_c(lhzx)
>  #define LWZX_BEstringify_in_c(lwzx)
>  #define LDX_BE stringify_in_c(ldx)
>  #define STWX_BEstringify_in_c(stwx)
>  #define STDX_BEstringify_in_c(stdx)
>  #else
> +#define LHZX_BEstringify_in_c(lhbrx)
>  #define LWZX_BEstringify_in_c(lwbrx)
>  #define LDX_BE stringify_in_c(ldbrx)
>  #define STWX_BEstringify_in_c(stwbrx)
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
> b/arch/powerpc/include/asm/ppc-opcode.h
> index fd8d640..6a77d130 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -142,9 +142,11 @@
>  #define PPC_INST_ISEL  0x7c1e
>  #define PPC_INST_ISEL_MASK 0xfc3e
>  #define PPC_INST_LDARX 0x7ca8
> +#define PPC_INST_STDCX 0x7c0001ad
>  #define PPC_INST_LSWI  0x7c0004aa
>  #define PPC_INST_LSWX  0x7c00042a
>  #define PPC_INST_LWARX 0x7c28
> +#define PPC_INST_STWCX 0x7c00012d
>  #define PPC_INST_LWSYNC0x7c2004ac
>  #define PPC_INST_SYNC  0x7c0004ac
>  #define PPC_INST_SYNC_MASK 0xfc0007fe
> @@ -211,8 +213,11 @@
>  #define PPC_INST_LBZ   0x8800
>  #define PPC_INST_LD0xe800
>  #define PPC_INST_LHZ   0xa000
> -#define PPC_INST_LHBRX 0x7c00062c
>  #define PPC_INST_LWZ   0x8000
> +#define PPC_INST_LHBRX 0x7c00062c
> +#define PPC_INST_LDBRX 0x7c000428
> +#define PPC_INST_STB   0x9800
> +#define PPC_INST_STH   0xb000
>  #define PPC_INST_STD   0xf800
>  #define PPC_INST_STDU  0xf801
>  #define PPC_INST_STW   0x9000
> @@ -221,22 +226,34 @@
>  #define PPC_INST_MTLR  0x7c0803a6
>  #define PPC_INST_CMPWI 0x2c00
>  #define PPC_INST_CMPDI 0x2c20
> +#define PPC_INST_CMPW  0x7c00
> +#define PPC_INST_CMPD  0x7c20
>  #define PPC_INST_CMPLW 0x7c40
> +#define PPC_INST_CMPLD 0x7c200040
>  #define PPC_INST_CMPLWI0x2800
> +#define PPC_INST_CMPLDI0x2820
>  #

Re: [PATCH] powerpc/mm: Use jump label to speed up radix_enabled check

2016-06-08 Thread Benjamin Herrenschmidt
On Wed, 2016-04-27 at 12:30 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt  writes:
> 
> > 
> > On Wed, 2016-04-27 at 11:00 +1000, Balbir Singh wrote:
> > > 
> > > Just basic testing across CPUs with various mm features 
> > > enabled/disabled. Just for sanity
> > I still don't think it's worth scattering the change. Either the jump
> > label works or it doesn't ... The only problem is make sure we identify
> > all the pre-boot ones but that's about it.
> > 
> There are two ways to do this. One is to follow the approach listed
> below done by Kevin, which is to do the jump_label_init early during boot and
> switch both cpu and mmu feature check to plain jump label.
> 
> http://mid.gmane.org/1440415228-8006-1-git-send-email-haoke...@gmail.com
> 
> I already found one use case of cpu_has_feature before that
> jump_label_init. In this approach we need to carefully audit all the
> cpu/mmu_has_feature calls to make sure they don't get called before
> jump_label_init. A missed conversion mean we miss a cpu/mmu feature
> check.
> 
> 
> Other option is to follow the patch I posted above, with the simple
> change of renaming mmu_feature_enabled to mmu_has_feature. So we can
> use it in early boot without really worrying about when we init jump
> label.
> 
> What do you suggest we follow ?

So I really don't like your patch, sorry :-(

It adds a whole new section "_in_c", duplicates a lot of infrastructure
somewhat differently etc... ugh.

I'd rather we follow Kevin's approach and convert all the CPU/MMU/...
feature things to static keys in C. There aren't that many that
need to be done really early on, we can audit them.

I would suggest doing:

 1- Add __mmu_has_feature/__cpu_has_feature/... which initially is
identical to the current one (or just make the current one use the __ variant).

 2- Convert selectively the early boot stuff to use __. There aren't *that*
many, I can help you audit them

 3- Add the static key version for all the non  __

Do you have time or should I look into this ?

Cheers,
Ben.


Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction

2016-06-08 Thread Sam Bobroff
On Thu, Jun 02, 2016 at 07:38:58AM -0500, Shreyas B. Prabhu wrote:

...

> +/* Power Management - PSSCR Fields */

It might be nice to give the full name of the register, as below with the FPSCR.

> +#define PSSCR_RL_MASK0x000F
> +#define PSSCR_MTL_MASK   0x00F0
> +#define PSSCR_TR_MASK0x0300
> +#define PSSCR_PSLL_MASK  0x000F
> +#define PSSCR_EC 0x0010
> +#define PSSCR_ESL0x0020
> +#define PSSCR_SD 0x0040
> +
> +
>  /* Floating Point Status and Control Register (FPSCR) Fields */

Cheers,
Sam.

