date:20190128

Re: [PATCH v2] cxl: Wrap iterations over afu slices inside 'afu_list_lock'

2019-01-28 Thread Frederic Barrat


Hi Vaibhav,

2 comments below (one of which I missed on the previous iteration, sorry).


On 26/01/2019 at 12:46, Vaibhav Jain wrote:

Within the cxl module, iteration over array 'adapter->slices' may be racy
at a few points, as the array might be read during an EEH at the same time
as its contents are being set to NULL while the driver is being unloaded
or unbound from the adapter. This might result in a NULL pointer to
'struct afu' being dereferenced during an EEH, thereby causing a kernel oops.

This patch fixes this by making sure that all accesses to the array
'adapter->slices' are wrapped within the context of the spin-lock
'adapter->afu_list_lock'.

Signed-off-by: Vaibhav Jain 
---
Changelog:

v2:
* Fixed a wrong comparison of non-null pointer [Fred]
* Moved a call to cxl_vphb_error_detected() within a branch that
   checks for not null AFU pointer in 'adapter->slices' [Fred]
* Removed a misleading comment in code.
---
  drivers/misc/cxl/guest.c |  2 ++
  drivers/misc/cxl/pci.c   | 43 
  2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c
index 5d28d9e454f5..08f4a512afad 100644
--- a/drivers/misc/cxl/guest.c
+++ b/drivers/misc/cxl/guest.c
@@ -267,6 +267,7 @@ static int guest_reset(struct cxl *adapter)
int i, rc;
  
  	pr_devel("Adapter reset request\n");

+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
if ((afu = adapter->afu[i])) {
pci_error_handlers(afu, CXL_ERROR_DETECTED_EVENT,
@@ -283,6 +284,7 @@ static int guest_reset(struct cxl *adapter)
pci_error_handlers(afu, CXL_RESUME_EVENT, 0);
}
}
+   spin_unlock(&adapter->afu_list_lock);
return rc;
  }
  
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c

index c79ba1c699ad..ca968a889425 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1805,7 +1805,7 @@ static pci_ers_result_t cxl_vphb_error_detected(struct cxl_afu *afu,
/* There should only be one entry, but go through the list
 * anyway
 */
-   if (afu->phb == NULL)
+   if (afu == NULL || afu->phb == NULL)
return result;
  
  	list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {

@@ -1832,7 +1832,8 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
  {
struct cxl *adapter = pci_get_drvdata(pdev);
struct cxl_afu *afu;
-   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET, afu_result;
+   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
+   pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET;
int i;
  
  	/* At this point, we could still have an interrupt pending.

@@ -1843,6 +1844,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
  
  	/* If we're permanently dead, give up. */

if (state == pci_channel_io_perm_failure) {
+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
/*
@@ -1851,6 +1853,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 */
cxl_vphb_error_detected(afu, state);
}
+   spin_unlock(&adapter->afu_list_lock);
return PCI_ERS_RESULT_DISCONNECT;
}
  
@@ -1932,14 +1935,19 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,

 * * In slot_reset, free the old resources and allocate new ones.
 * * In resume, clear the flag to allow things to start.
 */
+
+   /* Make sure no one else changes the afu list */
+   spin_lock(&adapter->afu_list_lock);
+
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
  
-		afu_result = cxl_vphb_error_detected(afu, state);

-
-   cxl_context_detach_all(afu);
-   cxl_ops->afu_deactivate_mode(afu, afu->current_mode);
-   pci_deconfigure_afu(afu);
+   if (afu != NULL) {
+   afu_result = cxl_vphb_error_detected(afu, state);
+   cxl_context_detach_all(afu);
+   cxl_ops->afu_deactivate_mode(afu, afu->current_mode);
+   pci_deconfigure_afu(afu);
+   }
  
  		/* Disconnect trumps all, NONE trumps NEED_RESET */

if (afu_result == PCI_ERS_RESULT_DISCONNECT)



Thanks for moving the call to cxl_vphb_error_detected(), but now, the 
"if (afu_result == PCI_ERS_RESULT_DISCONNECT)" test looks like it should 
also be part of the "if (afu != NULL)" statement (and then you wouldn't 
have hit the warning about uninitialized afu_result). Current code would 
work, but looks awkward since there's no need to check afu_result at 
each iteration if afu is NULL.
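
For illustration only, the loop shape suggested above could look like the sketch below. The enum, the mock struct and the helper names are stand-ins invented for this example, not the cxl driver's definitions, and locking is left out to keep it short.

```c
#include <stddef.h>

/* Stand-ins for the kernel's pci_ers_result_t values (illustrative only). */
enum ers_result { ERS_NONE, ERS_NEED_RESET, ERS_DISCONNECT };

/* Hypothetical mock of struct cxl_afu: it only carries a canned result. */
struct afu { enum ers_result reported; };

/* Hypothetical stand-in for cxl_vphb_error_detected(). */
static enum ers_result afu_error_detected(const struct afu *afu)
{
	return afu->reported;
}

/*
 * Merge per-AFU results: DISCONNECT trumps all, NONE trumps NEED_RESET.
 * The merge lives inside the NULL check, so afu_result is only read
 * after it has been assigned, and empty slots are skipped entirely.
 */
static enum ers_result merge_afu_results(struct afu *const *slots, size_t n)
{
	enum ers_result result = ERS_NEED_RESET;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct afu *afu = slots[i];

		if (afu != NULL) {
			enum ers_result afu_result = afu_error_detected(afu);

			if (afu_result == ERS_DISCONNECT)
				result = ERS_DISCONNECT;
			else if (afu_result == ERS_NONE &&
				 result == ERS_NEED_RESET)
				result = ERS_NONE;
		}
	}
	return result;
}
```

With this shape there is no need to pre-initialise afu_result, since it is scoped to the branch that assigns it.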





@@ -1948,6 +19

Re: [PATCH v4] kbuild: Add support for DT binding schema checks

2019-01-28 Thread Geert Uytterhoeven

Hi Rob,

On Sun, Jan 27, 2019 at 4:00 AM Rob Herring  wrote:
> On Wed, Jan 23, 2019 at 9:33 AM Geert Uytterhoeven  wrote:
> > On Tue, Dec 11, 2018 at 9:24 PM Rob Herring  wrote:
> > > This adds the build infrastructure for checking DT binding schema
> > > documents and validating dts files using the binding schema.
> > >
> > > Check DT binding schema documents:
> > > make dt_binding_check
> > >
> > > Build dts files and check using DT binding schema:
> > > make dtbs_check
> > >
> > > Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to
> > > use for validation. This makes it easier to find and fix errors
> > > generated by a specific schema.
> > >
> > > Currently, the validation targets are separate from a normal build to
> > > avoid a hard dependency on the external DT schema project and because
> > > there are lots of warnings generated.
> >
> > Thanks, I'm giving this a try, and get errors like:
> >
> >   DTC arch/arm/boot/dts/emev2-kzm9d.dt.yaml
> > FATAL ERROR: No markers present in property 'cpu0' value
> >
> > and
> >
> >   DTC arch/arm64/boot/dts/renesas/r8a7795-salvator-x.dt.yaml
> > FATAL ERROR: No markers present in property 'audio_clk_a' value
> >
> > Do you have a clue?
>
> That's really strange because those aren't even properties. Are other
> dts files okay? This is the in tree dtc?
>
> The only time you should be missing markers is if you did a dts -> dts
> -> dt.yaml.

Found it: make dtbs_check doesn't play well with my local change to
add symbols for DT overlays:

--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -285,6 +285,10 @@ cmd_dt_S_dtb=
 \
 $(obj)/%.dtb.S: $(obj)/%.dtb FORCE
$(call if_changed,dt_S_dtb)

+ifeq ($(CONFIG_OF_OVERLAY),y)
+DTC_FLAGS += -@
+endif
+
 quiet_cmd_dtc = DTC $@
 cmd_dtc = mkdir -p $(dir ${dtc-tmp}) ; \
$(HOSTCC) -E $(dtc_cpp_flags) -x assembler-with-cpp -o $(dtc-tmp) $< ; \

Do you see a way to handle that better?

Apart from a few expected issues, I'm seeing one other strange message:

arch/arm/boot/dts/sh73a0-kzm9g.dt.yaml: interrupts: [[2, 4], [3, 4]] is too long

This is the interrupts property in the adi,adxl345 node in
arch/arm/boot/dts/sh73a0-kzm9g.dts.
Apparently the check complains if more than one interrupt is listed here.
Is this a known issue?

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH] ucc_geth: Reset BQL queue when stopping device

2019-01-28 Thread Mathias Thore

After a timeout event caused by for example a broadcast storm, when
the MAC and PHY are reset, the BQL TX queue needs to be reset as
well. Otherwise, the device will exhibit severe performance issues
even after the storm has ended.

Co-authored-by: David Gounaris 
Signed-off-by: Mathias Thore 
---
 drivers/net/ethernet/freescale/ucc_geth.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index c3d539e209ed..eb3e65e8868f 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1879,6 +1879,8 @@ static void ucc_geth_free_tx(struct ucc_geth_private *ugeth)
u16 i, j;
u8 __iomem *bd;
 
+   netdev_reset_queue(ugeth->ndev);
+
ug_info = ugeth->ug_info;
uf_info = &ug_info->uf_info;
 
-- 
2.19.2
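
As a rough mental model of the accounting involved (a toy under stated assumptions, not the kernel's dql code): BQL tracks bytes handed to the hardware against bytes reported complete. If the TX ring is torn down without those completions ever being reported, the stale in-flight count keeps eating into the budget until it is cleared, which is the job of netdev_reset_queue(). The struct and the toy_* helpers below are invented for illustration; only netdev_reset_queue() comes from the patch.

```c
#include <stdbool.h>

/* Toy byte-queue accounting; all names here are hypothetical. */
struct toy_bql {
	unsigned int queued;     /* bytes handed to the hardware */
	unsigned int completed;  /* bytes the hardware reported done */
	unsigned int limit;      /* allowed bytes in flight */
};

/* The stack may queue more only while in-flight bytes stay under the limit. */
static bool toy_can_send(const struct toy_bql *q)
{
	return q->queued - q->completed < q->limit;
}

/* Transmit path: account for bytes given to the hardware. */
static void toy_sent(struct toy_bql *q, unsigned int bytes)
{
	q->queued += bytes;
}

/* TX completion path: account for bytes the hardware finished. */
static void toy_completed(struct toy_bql *q, unsigned int bytes)
{
	q->completed += bytes;
}

/* Analogue of a queue reset: forget all in-flight accounting. */
static void toy_reset(struct toy_bql *q)
{
	q->queued = 0;
	q->completed = 0;
}
```

In this model, freeing the ring without calling either toy_completed() or toy_reset() leaves toy_can_send() returning false even though nothing is actually in flight.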

Re: [PATCH v4] kbuild: Add support for DT binding schema checks

2019-01-28 Thread Geert Uytterhoeven

Hi Rob,

On Tue, Dec 11, 2018 at 9:24 PM Rob Herring  wrote:
> This adds the build infrastructure for checking DT binding schema
> documents and validating dts files using the binding schema.
>
> Check DT binding schema documents:
> make dt_binding_check
>
> Build dts files and check using DT binding schema:
> make dtbs_check
>
> Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to
> use for validation. This makes it easier to find and fix errors
> generated by a specific schema.
>
> Currently, the validation targets are separate from a normal build to
> avoid a hard dependency on the external DT schema project and because
> there are lots of warnings generated.
>
> Cc: Jonathan Corbet 
> Cc: Mark Rutland 
> Cc: Masahiro Yamada 
> Cc: Michal Marek 
> Cc: linux-...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> Cc: linux-kbu...@vger.kernel.org
> Signed-off-by: Rob Herring 

BTW, what are the CONFIG dependencies for this to work?
E.g. defconfig on x86_64 fails, even after enabling CONFIG_OF:

$ make dt_binding_check
  SCHEMA  Documentation/devicetree/bindings/processed-schema.yaml
  CHKDT   Documentation/devicetree/bindings/arm/primecell.yaml
  ...
  CHKDT   Documentation/devicetree/bindings/trivial-devices.yaml
make[1]: *** No rule to make target 'Documentation/devicetree/bindings/arm/primecell.example.dtb', needed by '__build'.  Stop.

Obviously it does work for arm/arm64.

Thanks!

Gr{oetje,eeting}s,

Geert


Re: [PATCH] ucc_geth: Reset BQL queue when stopping device

2019-01-28 Thread Christophe Leroy


Hi,

On 28/01/2019 at 10:07, Mathias Thore wrote:

After a timeout event caused by for example a broadcast storm, when
the MAC and PHY are reset, the BQL TX queue needs to be reset as
well. Otherwise, the device will exhibit severe performance issues
even after the storm has ended.


What are the symptoms?

Is this reset needed in every network driver in that case, or is it
specific to ucc_geth?
For instance, the Freescale fs_enet driver doesn't have that reset. Should it
have it too?


Christophe



Co-authored-by: David Gounaris 
Signed-off-by: Mathias Thore 
---
  drivers/net/ethernet/freescale/ucc_geth.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index c3d539e209ed..eb3e65e8868f 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1879,6 +1879,8 @@ static void ucc_geth_free_tx(struct ucc_geth_private *ugeth)
u16 i, j;
u8 __iomem *bd;
  
+	netdev_reset_queue(ugeth->ndev);

+
ug_info = ugeth->ug_info;
uf_info = &ug_info->uf_info;

Re: [PATCH] perf mem/c2c: Fix perf_mem_events to support powerpc

2019-01-28 Thread Ravi Bangoria




On 1/14/19 9:44 AM, Ravi Bangoria wrote:
> Powerpc hw does not have inbuilt latency filter (--ldlat) for mem-load
> event and, perf_mem_events by default includes ldlat=30 which is
> causing failure on powerpc. Refactor code to support perf mem/c2c on
> powerpc.
> 
> This patch depends on kernel side changes done by Madhavan:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html
> 
> Signed-off-by: Ravi Bangoria 


Arnaldo / Michael, any thoughts?

Thanks.

Re: [PATCH] perf mem/c2c: Fix perf_mem_events to support powerpc

2019-01-28 Thread Jiri Olsa

On Mon, Jan 14, 2019 at 09:44:02AM +0530, Ravi Bangoria wrote:

SNIP

> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> new file mode 100644
> index 0000000..5b4dcfe
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -0,0 +1,25 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "mem-events.h"
> +
> +struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> + PERF_MEM_EVENT("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "mem-loads"),
> + PERF_MEM_EVENT("ldlat-stores", "cpu/mem-stores/P", "mem-stores"),
> +};
> +
> +static char mem_loads_name[100];
> +static bool mem_loads_name__init;
> +
> +char *perf_mem_events__name(int i)
> +{
> + if (i == PERF_MEM_EVENTS__LOAD) {
> + if (!mem_loads_name__init) {
> + mem_loads_name__init = true;
> + scnprintf(mem_loads_name, sizeof(mem_loads_name),
> +   perf_mem_events[i].name,
> +   perf_mem_events__loads_ldlat);
> + }
> + return mem_loads_name;
> + }
> +
> + return (char *)perf_mem_events[i].name;
> +}
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index 93f74d8..1ffefd3 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -15,31 +15,13 @@
>  
>  unsigned int perf_mem_events__loads_ldlat = 30;
>  
> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> -
> -struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> - E("ldlat-loads","cpu/mem-loads,ldlat=%u/P", "mem-loads"),
> - E("ldlat-stores",   "cpu/mem-stores/P", "mem-stores"),
> +struct perf_mem_event __weak perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> + PERF_MEM_EVENT("ldlat-loads", "cpu/mem-loads/P", "mem-loads"),
> + PERF_MEM_EVENT("ldlat-stores", "cpu/mem-stores/P", "mem-stores"),
>  };

I don't think the perf_mem_events array needs to be overloaded as well;
the perf_mem_events__name function should be enough, no?

thanks,
jirka
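
To make the suggestion concrete, here is a minimal sketch of overriding only the name function through a weak symbol while the shared table stays as it is. The identifiers are simplified stand-ins rather than perf's actual declarations, and the `weak` attribute spelling assumes GCC or Clang.

```c
/* Simplified stand-ins for perf's event indices (hypothetical names). */
enum { MEM_EVENTS_LOAD, MEM_EVENTS_STORE, MEM_EVENTS_MAX };

/* One shared table, defined once in generic code and never overridden. */
static const char *const mem_event_names[MEM_EVENTS_MAX] = {
	"cpu/mem-loads/P",
	"cpu/mem-stores/P",
};

/*
 * Generic fallback. An architecture file can provide a regular (non-weak)
 * function with the same name and signature; the linker then prefers that
 * definition. Such an override could, for example, build the load event
 * string with an extra ldlat term in its own buffer, without needing its
 * own copy of the table.
 */
__attribute__((weak)) const char *mem_events_name(int i)
{
	return mem_event_names[i];
}
```

The strong definition has to live in a different translation unit; defining both in one file is a redefinition error.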

[PATCH 00/11] Refactor exception entry on 40x/6xx/8xx

2019-01-28 Thread Christophe Leroy

This series refactors the exception entry macros for 40x, 6xx and 8xx.

It will benefit the implementation of CONFIG_VMAP, and also
Ben's series on MSR_EE.

The first patch of this series is part of the CONFIG_THREAD_INFO_IN_TASK series.
Including it here avoids a conflict between the two series.

Christophe Leroy (11):
  powerpc/32: Rename THREAD_INFO to TASK_STACK
  powerpc/32: Refactor EXCEPTION entry macros for head_8xx.S and
head_32.S
  powerpc/32: Add a macro for setting MSR_RI in EXCEPTION_PROLOG_2
  powerpc/32: add CLR_MSR_WE() in EXCEPTION_PROLOG in head_32.h
  powerpc/32: add START_EXCEPTION() in head_32.h
  powerpc/32: move LOAD_MSR_KERNEL() into head_32.h and use it
  powerpc/40x: Don't use SPRN_SPRG_SCRATCH2 in EXCEPTION_PROLOG
  powerpc/40x: add exception frame marker
  powerpc/40x: Split and rename NORMAL_EXCEPTION_PROLOG
  powerpc/40x: Add EXC_XFER_TEMPLATE_CRITICAL()
  powerpc/40x: Refactor exception entry macros by using head_32.h

 arch/powerpc/kernel/asm-offsets.c|   2 +-
 arch/powerpc/kernel/entry_32.S   |  11 +---
 arch/powerpc/kernel/head_32.S| 101 ++--
 arch/powerpc/kernel/head_32.h| 124 +++
 arch/powerpc/kernel/head_40x.S   | 120 -
 arch/powerpc/kernel/head_8xx.S   | 103 ++---
 arch/powerpc/kernel/head_booke.h |   4 +-
 arch/powerpc/kernel/head_fsl_booke.S |   2 +-
 8 files changed, 167 insertions(+), 300 deletions(-)
 create mode 100644 arch/powerpc/kernel/head_32.h

-- 
2.13.3

[PATCH 01/11] powerpc/32: Rename THREAD_INFO to TASK_STACK

2019-01-28 Thread Christophe Leroy

This patch renames THREAD_INFO to TASK_STACK, because it is in fact
the offset of the pointer to the stack in task_struct so this pointer
will not be impacted by the move of THREAD_INFO.

Signed-off-by: Christophe Leroy 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/kernel/asm-offsets.c| 2 +-
 arch/powerpc/kernel/entry_32.S   | 2 +-
 arch/powerpc/kernel/head_32.S| 2 +-
 arch/powerpc/kernel/head_40x.S   | 4 ++--
 arch/powerpc/kernel/head_8xx.S   | 2 +-
 arch/powerpc/kernel/head_booke.h | 4 ++--
 arch/powerpc/kernel/head_fsl_booke.S | 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9ffc72ded73a..23456ba3410a 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -90,7 +90,7 @@ int main(void)
DEFINE(SIGSEGV, SIGSEGV);
DEFINE(NMI_MASK, NMI_MASK);
 #else
-   OFFSET(THREAD_INFO, task_struct, stack);
+   OFFSET(TASK_STACK, task_struct, stack);
DEFINE(THREAD_INFO_GAP, _ALIGN_UP(sizeof(struct thread_info), 16));
OFFSET(KSP_LIMIT, thread_struct, ksp_limit);
 #endif /* CONFIG_PPC64 */
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 0768dfd8a64e..3f83e71ae43f 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -1166,7 +1166,7 @@ ret_from_debug_exc:
mfspr   r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
-   lwz r9,THREAD_INFO-THREAD(r9)
+   lwz r9,TASK_STACK-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 05b08db3901d..9268e5e87949 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -261,7 +261,7 @@ __secondary_hold_acknowledge:
tophys(r11,r1); /* use tophys(r1) if kernel */ \
beq 1f; \
mfspr   r11,SPRN_SPRG_THREAD;   \
-   lwz r11,THREAD_INFO-THREAD(r11);\
+   lwz r11,TASK_STACK-THREAD(r11); \
addi r11,r11,THREAD_SIZE;\
tophys(r11,r11);\
 1: subi r11,r11,INT_FRAME_SIZE  /* alloc exc. frame */
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index b19d78410511..3088c9f29f5e 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -115,7 +115,7 @@ _ENTRY(saved_ksp_limit)
andi.   r11,r11,MSR_PR;  \
beq 1f;  \
mfspr   r1,SPRN_SPRG_THREAD;/* if from user, start at top of   */\
-   lwz r1,THREAD_INFO-THREAD(r1); /* this thread's kernel stack   */\
+   lwz r1,TASK_STACK-THREAD(r1); /* this thread's kernel stack   */\
addi r1,r1,THREAD_SIZE;   \
 1: subi r1,r1,INT_FRAME_SIZE;   /* Allocate an exception frame */\
tophys(r11,r1);  \
@@ -158,7 +158,7 @@ _ENTRY(saved_ksp_limit)
beq 1f;  \
/* COMING FROM USER MODE */  \
mfspr   r11,SPRN_SPRG_THREAD;   /* if from user, start at top of   */\
-   lwz r11,THREAD_INFO-THREAD(r11); /* this thread's kernel stack */\
+   lwz r11,TASK_STACK-THREAD(r11); /* this thread's kernel stack */\
 1: addi r11,r11,THREAD_SIZE-INT_FRAME_SIZE; /* Alloc an excpt frm  */\
tophys(r11,r11); \
stw r10,_CCR(r11);  /* save various registers  */\
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 20cc816b3508..ca9207013579 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -142,7 +142,7 @@ instruction_counter:
tophys(r11,r1); /* use tophys(r1) if kernel */ \
beq 1f; \
mfspr   r11,SPRN_SPRG_THREAD;   \
-   lwz r11,THREAD_INFO-THREAD(r11);\
+   lwz r11,TASK_STACK-THREAD(r11); \
addi r11,r11,THREAD_SIZE;\
tophys(r11,r11);\
 1: subi r11,r11,INT_FRAME_SIZE  /* alloc exc. frame */
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 306e26c073a0..69e80e6d0d16 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -55,7 +55,7 @@ END_BTB_FLUSH_SECTION
beq 1f;  \
BOOKE_CLEAR_BTB(r11)\
/* if from user, start at top of this thread's kernel stack */   \
-   lwz

[PATCH 02/11] powerpc/32: Refactor EXCEPTION entry macros for head_8xx.S and head_32.S

2019-01-28 Thread Christophe Leroy

EXCEPTION_PROLOG is similar in head_8xx.S and head_32.S

This patch creates head_32.h and moves EXCEPTION_PROLOG macro
into it.

It also moves EXCEPTION() and EXC_XFER_() macros which are also
similar. For that, the 8xx needs to define a dummy DO_KVM asm macro.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.S  |  99 +--
 arch/powerpc/kernel/head_32.h  | 104 +
 arch/powerpc/kernel/head_8xx.S | 101 ++-
 3 files changed, 111 insertions(+), 193 deletions(-)
 create mode 100644 arch/powerpc/kernel/head_32.h

diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 9268e5e87949..a5efcdc78e8e 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -37,6 +37,8 @@
 #include 
 #include 
 
+#include "head_32.h"
+
 /* 601 only have IBAT; cr0.eq is set on 601 when using this macro */
 #define LOAD_BAT(n, reg, RA, RB)   \
/* see the comment for clear_bats() -- Cort */ \
@@ -242,103 +244,6 @@ __secondary_hold_spinloop:
 __secondary_hold_acknowledge:
.long   -1
 
-/*
- * Exception entry code.  This code runs with address translation
- * turned off, i.e. using physical addresses.
- * We assume sprg3 has the physical address of the current
- * task's thread_struct.
- */
-#define EXCEPTION_PROLOG   \
-   mtspr   SPRN_SPRG_SCRATCH0,r10; \
-   mtspr   SPRN_SPRG_SCRATCH1,r11; \
-   mfcr r10;\
-   EXCEPTION_PROLOG_1; \
-   EXCEPTION_PROLOG_2
-
-#define EXCEPTION_PROLOG_1 \
-   mfspr   r11,SPRN_SRR1;  /* check whether user or kernel */ \
-   andi.   r11,r11,MSR_PR; \
-   tophys(r11,r1); /* use tophys(r1) if kernel */ \
-   beq 1f; \
-   mfspr   r11,SPRN_SPRG_THREAD;   \
-   lwz r11,TASK_STACK-THREAD(r11); \
-   addi r11,r11,THREAD_SIZE;\
-   tophys(r11,r11);\
-1: subi r11,r11,INT_FRAME_SIZE  /* alloc exc. frame */
-
-
-#define EXCEPTION_PROLOG_2 \
-   stw r10,_CCR(r11);  /* save registers */ \
-   stw r12,GPR12(r11); \
-   stw r9,GPR9(r11);   \
-   mfspr   r10,SPRN_SPRG_SCRATCH0; \
-   stw r10,GPR10(r11); \
-   mfspr   r12,SPRN_SPRG_SCRATCH1; \
-   stw r12,GPR11(r11); \
-   mflr r10;\
-   stw r10,_LINK(r11); \
-   mfspr   r12,SPRN_SRR0;  \
-   mfspr   r9,SPRN_SRR1;   \
-   stw r1,GPR1(r11);   \
-   stw r1,0(r11);  \
-   tovirt(r1,r11); /* set new kernel sp */ \
-   li  r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
-   MTMSRD(r10);/* (except for mach check in rtas) */ \
-   stw r0,GPR0(r11);   \
-   lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
-   addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
-   stw r10,8(r11); \
-   SAVE_4GPRS(3, r11); \
-   SAVE_2GPRS(7, r11)
-
-/*
- * Note: code which follows this uses cr0.eq (set if from kernel),
- * r11, r12 (SRR0), and r9 (SRR1).
- *
- * Note2: once we have set r1 we are in a position to take exceptions
- * again, and we could thus set MSR:RI at that point.
- */
-
-/*
- * Exception vectors.
- */
-#define EXCEPTION(n, label, hdlr, xfer)\
-   . = n;  \
-   DO_KVM n;   \
-label: \
-   EXCEPTION_PROLOG;   \
-   addi r3,r1,STACK_FRAME_OVERHEAD; \
-   xfer(n, hdlr)
-
-#define EXC_XFER_TEMPLATE(n, hdlr, trap, copyee, tfer, ret)\
-   li  r10,trap;   \
-   stw r10,_TRAP(r11); \
-   li  r10,MSR_KERNEL; \
-   copyee(r10, r9);\
-   bl  tfer;   \
-i##n:  \
-   .long   hdlr;   \
-   .long   ret
-
-#define COPY_EE(d, s)  rlwimi d,s,0,16,16
-#define NOCOPY(d, s)
-
-#define EXC_XFER_STD(n, hdlr)  \
-   EXC_XFER_TEMPLATE(n, hdlr, n, NOCOPY, transfer_to_handler_full, \
- ret_from_except_full)
-
-#define EXC_XFER_LITE(n, hdlr) \
-   EXC_XFER_TEMPLATE(n, hdlr, n+1, NOCOPY, transfer_to_handler, \
- ret_from_except)
-
-#define EXC_XFER_EE(n, hdlr)   \
-   EXC_XFER_TEMPLATE(n, hdlr, n, COPY_EE, transfer_to_handler_full, \
- ret_from_except_full)
-
-#define EXC_XFER_EE_LITE(n, hdlr)  \
-   EXC_XFER_TEMPLATE(n, hdlr, n+1, COPY_EE, transfer_to_handler, \
- ret_from_except)
-
 /* System reset */
 /* cor

[PATCH 03/11] powerpc/32: Add a macro for setting MSR_RI in EXCEPTION_PROLOG_2

2019-01-28 Thread Christophe Leroy

Setting MSR_RI applies to head_32 and head_8xx, but not to
head_40x. So in order to refactor EXCEPTION_PROLOG for 40x too,
this patch adds a macro for setting MSR_RI.

This is also an opportunity to simplify the 8xx case, since
writing to SPRN_EID sets MSR_RI.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.S  | 2 ++
 arch/powerpc/kernel/head_32.h  | 7 +--
 arch/powerpc/kernel/head_8xx.S | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index a5efcdc78e8e..edc5e78faf08 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -37,6 +37,8 @@
 #include 
 #include 
 
+#define SET_AND_STORE_MSR_RI(reg)  li reg,MSR_KERNEL & ~(MSR_IR|MSR_DR); MTMSRD(reg)
+
 #include "head_32.h"
 
 /* 601 only have IBAT; cr0.eq is set on 601 when using this macro */
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 7356c27d2136..9e6fb9d468f0 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -4,6 +4,10 @@
 
 #include /* for STACK_FRAME_REGS_MARKER */
 
+#ifndef SET_AND_STORE_MSR_RI
+#define SET_AND_STORE_MSR_RI(reg)
+#endif
+
 /*
  * Exception entry code.  This code runs with address translation
  * turned off, i.e. using physical addresses.
@@ -44,8 +48,7 @@
stw r1,GPR1(r11);   \
stw r1,0(r11);  \
tovirt(r1,r11); /* set new kernel sp */ \
-   li  r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
-   MTMSRD(r10);/* (except for mach check in rtas) */ \
+   SET_AND_STORE_MSR_RI(r10);  /* can take exceptions */ \
stw r0,GPR0(r11);   \
lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index d9c5bc48bef0..00dacec2b0d1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -36,6 +36,8 @@
 .macro DO_KVM intno
 .endm
 
+#define SET_AND_STORE_MSR_RI(reg)  mtspr   SPRN_EID, reg
+
 #include "head_32.h"
 
 #if CONFIG_TASK_SIZE <= 0x80000000 && CONFIG_PAGE_OFFSET >= 0x80000000
-- 
2.13.3
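
The override pattern used here (a platform file defines a hook macro before including the shared header, and the header falls back to an empty default) can be sketched in plain C as follows. The HOOK_* names and the flag value are invented for the example; in a real tree the two halves would sit in separate files.

```c
/* --- what a platform file would do before including the shared header --- */
/* Hypothetical hook: set one flag bit. The 0x2 value is just for the demo. */
#define HOOK_SET_FLAG(v)  ((v) | 0x2u)

/* --- what the shared header does --- */
#ifndef HOOK_SET_FLAG
#define HOOK_SET_FLAG(v)  (v)	/* default: no-op */
#endif

#ifndef HOOK_EXTRA
#define HOOK_EXTRA(v)     (v)	/* default: no-op, since no platform defined it */
#endif

/* Shared code calls every hook unconditionally; unused ones cost nothing. */
static unsigned int shared_prolog(unsigned int v)
{
	v = HOOK_SET_FLAG(v);
	v = HOOK_EXTRA(v);
	return v;
}
```

The benefit is that the shared body needs no per-platform #ifdef blocks. The cost is that the definition order matters: the hook must be defined before the header is included.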

[PATCH 04/11] powerpc/32: add CLR_MSR_WE() in EXCEPTION_PROLOG in head_32.h

2019-01-28 Thread Christophe Leroy

Add a CLR_MSR_WE() macro to allow 40x to clear
that bit in the register containing the MSR value.

This is the only difference between the common EXCEPTION_PROLOG
and the 40x one. This patch will allow 40x to use the common one.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 9e6fb9d468f0..643dd8d34aac 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -8,6 +8,10 @@
 #define SET_AND_STORE_MSR_RI(reg)
 #endif
 
+#ifndef CLR_MSR_WE
+#define CLR_MSR_WE(reg)
+#endif
+
 /*
  * Exception entry code.  This code runs with address translation
  * turned off, i.e. using physical addresses.
@@ -49,6 +53,7 @@
stw r1,0(r11);  \
tovirt(r1,r11); /* set new kernel sp */ \
SET_AND_STORE_MSR_RI(r10);  /* can take exceptions */ \
+   CLR_MSR_WE(r9); \
stw r0,GPR0(r11);   \
lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
-- 
2.13.3

[PATCH 05/11] powerpc/32: add START_EXCEPTION() in head_32.h

2019-01-28 Thread Christophe Leroy

Add START_EXCEPTION() in head_32.h in preparation for using
head_32.h in head_40x.S

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 643dd8d34aac..f77f13142410 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -72,10 +72,13 @@
 /*
  * Exception vectors.
  */
-#define EXCEPTION(n, label, hdlr, xfer)\
+#define START_EXCEPTION(n, label)   \
. = n;  \
DO_KVM n;   \
-label: \
+label:
+
+#define EXCEPTION(n, label, hdlr, xfer)\
+   START_EXCEPTION(n, label)   \
EXCEPTION_PROLOG;   \
addi r3,r1,STACK_FRAME_OVERHEAD; \
xfer(n, hdlr)
-- 
2.13.3

[PATCH 06/11] powerpc/32: move LOAD_MSR_KERNEL() into head_32.h and use it

2019-01-28 Thread Christophe Leroy

As preparation for using head_32.h for head_40x.S, move
LOAD_MSR_KERNEL() there and use it to load r10 with MSR_KERNEL value.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/entry_32.S |  9 +
 arch/powerpc/kernel/head_32.h  | 11 ++-
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 3f83e71ae43f..24d93231665b 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -37,14 +37,7 @@
 #include 
 #include 
 
-/*
- * MSR_KERNEL is > 0x10000 on 4xx/Book-E since it include MSR_CE.
- */
-#if MSR_KERNEL >= 0x10000
-#define LOAD_MSR_KERNEL(r, x)  lis r,(x)@h; ori r,r,(x)@l
-#else
-#define LOAD_MSR_KERNEL(r, x)  li r,(x)
-#endif
+#include "head_32.h"
 
 /*
  * Align to 4k in order to ensure that all functions modyfing srr0/srr1
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index f77f13142410..e302afd40d0a 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -13,6 +13,15 @@
 #endif
 
 /*
+ * MSR_KERNEL is > 0x10000 on 4xx/Book-E since it include MSR_CE.
+ */
+#if MSR_KERNEL >= 0x10000
+#define LOAD_MSR_KERNEL(r, x)  lis r,(x)@h; ori r,r,(x)@l
+#else
+#define LOAD_MSR_KERNEL(r, x)  li r,(x)
+#endif
+
+/*
  * Exception entry code.  This code runs with address translation
  * turned off, i.e. using physical addresses.
  * We assume sprg3 has the physical address of the current
@@ -86,7 +95,7 @@
 #define EXC_XFER_TEMPLATE(n, hdlr, trap, copyee, tfer, ret)\
li  r10,trap;   \
stw r10,_TRAP(r11); \
-   li  r10,MSR_KERNEL; \
+   LOAD_MSR_KERNEL(r10, MSR_KERNEL);   \
copyee(r10, r9);\
bl  tfer;   \
 i##n:  \
-- 
2.13.3
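
For readers unfamiliar with the two forms of LOAD_MSR_KERNEL(): li takes a single 16-bit signed immediate, so a constant that does not fit in 16 bits has to be assembled from its upper half (lis, the @h part) and lower half (ori, the @l part). The C sketch below emulates that split. The emu_* helpers are made up for this note, and the two test constants are arbitrary examples, not values taken from the kernel headers.

```c
#include <stdint.h>

/* li rD,SIMM: load a sign-extended 16-bit immediate. */
static uint32_t emu_li(int16_t simm)
{
	return (uint32_t)(int32_t)simm;
}

/* lis rD,IMM: load a 16-bit immediate into the upper half, lower half zero. */
static uint32_t emu_lis(uint16_t imm)
{
	return (uint32_t)imm << 16;
}

/* ori rD,rS,UIMM: OR in a zero-extended 16-bit immediate. */
static uint32_t emu_ori(uint32_t rs, uint16_t uimm)
{
	return rs | uimm;
}

/* Mirror of the macro's choice: two instructions for wide constants, one otherwise. */
static uint32_t emu_load_const(uint32_t x)
{
	if (x > 0xffffu)
		return emu_ori(emu_lis((uint16_t)(x >> 16)),
			       (uint16_t)(x & 0xffffu));
	/* Note: values 0x8000..0xffff would be sign-extended by li. */
	return emu_li((int16_t)x);
}
```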

[PATCH 07/11] powerpc/40x: Don't use SPRN_SPRG_SCRATCH2 in EXCEPTION_PROLOG

2019-01-28 Thread Christophe Leroy

Contrary to what the comment says, r1 is not reused by the critical
exception handler, as it uses a dedicated critirq_ctx stack.
Decrementing r1 early is therefore unneeded.

Even if the comment were valid, the code would be buggy anyway, as
r1 takes intermediate values that would jeopardise the whole
process (for instance after mfspr   r1,SPRN_SPRG_THREAD).

Using SPRN_SPRG_SCRATCH2 to save r1 is therefore not needed; r11 can be
used instead. This avoids one mtspr and one mfspr and makes the
prolog closer to what's done on 6xx and 8xx.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_40x.S | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 3088c9f29f5e..59f6f53f1ac2 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -102,23 +102,20 @@ _ENTRY(saved_ksp_limit)
  * Exception vector entry code. This code runs with address translation
  * turned off (i.e. using physical addresses). We assume SPRG_THREAD has
  * the physical address of the current task thread_struct.
- * Note that we have to have decremented r1 before we write to any fields
- * of the exception frame, since a critical interrupt could occur at any
- * time, and it will write to the area immediately below the current r1.
  */
 #define NORMAL_EXCEPTION_PROLOG \
mtspr   SPRN_SPRG_SCRATCH0,r10; /* save two registers to work with */\
mtspr   SPRN_SPRG_SCRATCH1,r11;  \
-   mtspr   SPRN_SPRG_SCRATCH2,r1;   \
mfcr r10; /* save CR in r10 for now  */\
mfspr   r11,SPRN_SRR1;  /* check whether user or kernel*/\
andi.   r11,r11,MSR_PR;  \
-   beq 1f;  \
-   mfspr   r1,SPRN_SPRG_THREAD;/* if from user, start at top of   */\
-   lwz r1,TASK_STACK-THREAD(r1); /* this thread's kernel stack   */\
-   addi r1,r1,THREAD_SIZE;   \
-1: subi r1,r1,INT_FRAME_SIZE;   /* Allocate an exception frame */\
tophys(r11,r1);  \
+   beq 1f;  \
+   mfspr   r11,SPRN_SPRG_THREAD;   /* if from user, start at top of   */\
+   lwz r11,TASK_STACK-THREAD(r11); /* this thread's kernel stack */\
+   addi r11,r11,THREAD_SIZE; \
+   tophys(r11,r11); \
+1: subi r11,r11,INT_FRAME_SIZE; /* Allocate an exception frame */\
stw r10,_CCR(r11);  /* save various registers  */\
stw r12,GPR12(r11);  \
stw r9,GPR9(r11);\
@@ -128,11 +125,11 @@ _ENTRY(saved_ksp_limit)
stw r12,GPR11(r11);  \
mflr r10; \
stw r10,_LINK(r11);  \
-   mfspr   r10,SPRN_SPRG_SCRATCH2;  \
mfspr   r12,SPRN_SRR0;   \
-   stw r10,GPR1(r11);   \
+   stw r1,GPR1(r11);\
mfspr   r9,SPRN_SRR1;\
-   stw r10,0(r11);  \
+   stw r1,0(r11);   \
+   tovirt(r1,r11); /* set new kernel sp */ \
rlwinm  r9,r9,0,14,12;  /* clear MSR_WE (necessary?)   */\
stw r0,GPR0(r11);\
SAVE_4GPRS(3, r11);  \
-- 
2.13.3

[PATCH 08/11] powerpc/40x: add exception frame marker

2019-01-28 Thread Christophe Leroy

This patch adds STACK_FRAME_REGS_MARKER in the stack at exception entry
in order to see interrupts in call traces as below:

[0.013964] Call Trace:
[0.014014] [c0745db0] [c007a9d4] tick_periodic.constprop.5+0xd8/0x104 (unreliable)
[0.014086] [c0745dc0] [c007aa20] tick_handle_periodic+0x20/0x9c
[0.014181] [c0745de0] [c0009cd0] timer_interrupt+0xa0/0x264
[0.014258] [c0745e10] [c000e484] ret_from_except+0x0/0x14
[0.014390] --- interrupt: 901 at console_unlock.part.7+0x3f4/0x528
[0.014390] LR = console_unlock.part.7+0x3f0/0x528
[0.014455] [c0745ee0] [c0050334] console_unlock.part.7+0x114/0x528 (unreliable)
[0.014542] [c0745f30] [c00524e0] register_console+0x3d8/0x44c
[0.014625] [c0745f60] [c0675aac] cpm_uart_console_init+0x18/0x2c
[0.014709] [c0745f70] [c06614f4] console_init+0x114/0x1cc
[0.014795] [c0745fb0] [c0658b68] start_kernel+0x300/0x3d8
[0.014864] [c0745ff0] [c00022cc] start_here+0x44/0x98

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_40x.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 59f6f53f1ac2..f3bfb695f952 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -132,6 +132,9 @@ _ENTRY(saved_ksp_limit)
tovirt(r1,r11); /* set new kernel sp */ \
rlwinm  r9,r9,0,14,12;  /* clear MSR_WE (necessary?)   */\
stw r0,GPR0(r11);\
+   lis r10, STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */\
+   addi    r10, r10, STACK_FRAME_REGS_MARKER@l; \
+   stw r10, 8(r11); \
SAVE_4GPRS(3, r11);  \
SAVE_2GPRS(7, r11)
 
@@ -174,6 +177,9 @@ _ENTRY(saved_ksp_limit)
tovirt(r1,r11);  \
rlwinm  r9,r9,0,14,12;  /* clear MSR_WE (necessary?)   */\
stw r0,GPR0(r11);\
+   lis r10, STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */\
+   addi    r10, r10, STACK_FRAME_REGS_MARKER@l; \
+   stw r10, 8(r11); \
SAVE_4GPRS(3, r11);  \
SAVE_2GPRS(7, r11)
 
-- 
2.13.3
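
A side note on the lis/addi pair used to load the marker: addi sign-extends
its 16-bit immediate, so the upper half has to be "high-adjusted" (@ha) rather
than simply taken from the top 16 bits. The sketch below models that arithmetic
in plain C. The helper names (ha16, lo16, lis_addi) are made up for
illustration, and any constant fed to them is just sample data, not a value
taken from the patch.

```c
#include <stdint.h>

/* High-adjusted upper half: compensates for addi sign-extending the low half. */
static uint16_t ha16(uint32_t value)
{
	return (uint16_t)((value + 0x8000u) >> 16);
}

/* Low 16 bits, which addi treats as a signed immediate. */
static int16_t lo16(uint32_t value)
{
	return (int16_t)(value & 0xffffu);
}

/* Emulate "lis rX,ha ; addi rX,rX,lo" on a 32-bit register. */
static uint32_t lis_addi(uint16_t ha, int16_t lo)
{
	uint32_t reg = (uint32_t)ha << 16;   /* lis: load immediate shifted */

	return reg + (uint32_t)(int32_t)lo;  /* addi: add sign-extended immediate */
}
```

When bit 15 of the constant is set, ha16() is one more than the plain upper
half, and the negative low half brings the sum back to the intended value.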

[PATCH 09/11] powerpc/40x: Split and rename NORMAL_EXCEPTION_PROLOG

2019-01-28 Thread Christophe Leroy

This patch splits NORMAL_EXCEPTION_PROLOG in the same way as in
head_8xx.S and head_32.S and renames it EXCEPTION_PROLOG() as well
to match head_32.h

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_40x.S | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index f3bfb695f952..1203075c0c1a 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -103,10 +103,14 @@ _ENTRY(saved_ksp_limit)
  * turned off (i.e. using physical addresses). We assume SPRG_THREAD has
  * the physical address of the current task thread_struct.
  */
-#define NORMAL_EXCEPTION_PROLOG \
+#define EXCEPTION_PROLOG\
mtspr   SPRN_SPRG_SCRATCH0,r10; /* save two registers to work with */\
mtspr   SPRN_SPRG_SCRATCH1,r11;  \
mfcr    r10;    /* save CR in r10 for now  */\
+   EXCEPTION_PROLOG_1;  \
+   EXCEPTION_PROLOG_2
+
+#define EXCEPTION_PROLOG_1  \
mfspr   r11,SPRN_SRR1;  /* check whether user or kernel*/\
andi.   r11,r11,MSR_PR;  \
tophys(r11,r1);  \
@@ -115,7 +119,9 @@ _ENTRY(saved_ksp_limit)
lwz r11,TASK_STACK-THREAD(r11); /* this thread's kernel stack */\
addi    r11,r11,THREAD_SIZE; \
tophys(r11,r11); \
-1: subi    r11,r11,INT_FRAME_SIZE; /* Allocate an exception frame */\
+1: subi    r11,r11,INT_FRAME_SIZE; /* Allocate an exception frame */
+
+#define EXCEPTION_PROLOG_2  \
stw r10,_CCR(r11);  /* save various registers  */\
stw r12,GPR12(r11);  \
stw r9,GPR9(r11);\
@@ -205,7 +211,7 @@ label:
 
 #define EXCEPTION(n, label, hdlr, xfer)\
START_EXCEPTION(n, label);  \
-   NORMAL_EXCEPTION_PROLOG;\
+   EXCEPTION_PROLOG;   \
addi    r3,r1,STACK_FRAME_OVERHEAD; \
xfer(n, hdlr)
 
@@ -396,7 +402,7 @@ label:
  * This is caused by a fetch from non-execute or guarded pages.
  */
START_EXCEPTION(0x0400, InstructionAccess)
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
mr  r4,r12  /* Pass SRR0 as arg2 */
li  r5,0/* Pass zero as arg3 */
EXC_XFER_LITE(0x400, handle_page_fault)
@@ -406,7 +412,7 @@ label:
 
 /* 0x0600 - Alignment Exception */
START_EXCEPTION(0x0600, Alignment)
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
mfspr   r4,SPRN_DEAR/* Grab the DEAR and save it */
stw r4,_DEAR(r11)
addi    r3,r1,STACK_FRAME_OVERHEAD
@@ -414,7 +420,7 @@ label:
 
 /* 0x0700 - Program Exception */
START_EXCEPTION(0x0700, ProgramCheck)
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
mfspr   r4,SPRN_ESR /* Grab the ESR and save it */
stw r4,_ESR(r11)
addi    r3,r1,STACK_FRAME_OVERHEAD
@@ -427,7 +433,7 @@ label:
 
 /* 0x0C00 - System Call Exception */
START_EXCEPTION(0x0C00, SystemCall)
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
EXC_XFER_EE_LITE(0xc00, DoSyscall)
 
EXCEPTION(0x0D00, Trap_0D, unknown_exception, EXC_XFER_EE)
@@ -733,7 +739,7 @@ label:
 
/* Programmable Interval Timer (PIT) Exception. (from 0x1000) */
 Decrementer:
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
lis r0,TSR_PIS@h
mtspr   SPRN_TSR,r0 /* Clear the PIT exception */
addi    r3,r1,STACK_FRAME_OVERHEAD
@@ -741,7 +747,7 @@ Decrementer:
 
/* Fixed Interval Timer (FIT) Exception. (from 0x1010) */
 FITException:
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
addi    r3,r1,STACK_FRAME_OVERHEAD;
EXC_XFER_EE(0x1010, unknown_exception)
 
@@ -759,7 +765,7 @@ WDTException:
  * if they can't resolve the lightweight TLB fault.
  */
 DataAccess:
-   NORMAL_EXCEPTION_PROLOG
+   EXCEPTION_PROLOG
mfspr   r5,SPRN_ESR /* Grab the ESR, save it, pass arg3 */
stw r5,_ESR(r11)
mfspr   r4,SPRN_DEAR/* Grab the DEAR, save it, pass arg2 */
-- 
2.13.3

[PATCH 10/11] powerpc/40x: Add EXC_XFER_TEMPLATE_CRITICAL()

2019-01-28 Thread Christophe Leroy

This patch adds EXC_XFER_TEMPLATE_CRITICAL() for handling
the transfer to a critical exception handler. This will allow
the normal exceptions to be moved to the standard
EXC_XFER_TEMPLATE() defined in head_32.h

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_40x.S | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 1203075c0c1a..d52e460ea85e 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -219,9 +219,7 @@ label:
START_EXCEPTION(n, label);  \
CRITICAL_EXCEPTION_PROLOG;  \
addi    r3,r1,STACK_FRAME_OVERHEAD; \
-   EXC_XFER_TEMPLATE(hdlr, n+2, (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)), \
- NOCOPY, crit_transfer_to_handler, \
- ret_from_crit_exc)
+   EXC_XFER_TEMPLATE_CRITICAL(hdlr, n+2)
 
 #define EXC_XFER_TEMPLATE(hdlr, trap, msr, copyee, tfer, ret)  \
li  r10,trap;   \
@@ -233,6 +231,14 @@ label:
.long   hdlr;   \
.long   ret
 
+#define EXC_XFER_TEMPLATE_CRITICAL(hdlr, trap) \
+   li  r10,trap;   \
+   stw r10,_TRAP(r11); \
+   li  r10, MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE);  \
+   bl  crit_transfer_to_handler;   \
+   .long   hdlr;   \
+   .long   ret_from_crit_exc
+
 #define COPY_EE(d, s)  rlwimi d,s,0,16,16
 #define NOCOPY(d, s)
 
@@ -733,9 +739,7 @@ label:
/* continue normal handling for a critical exception... */
 2: mfspr   r4,SPRN_DBSR
addi    r3,r1,STACK_FRAME_OVERHEAD
-   EXC_XFER_TEMPLATE(DebugException, 0x2002, \
-   (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)), \
-   NOCOPY, crit_transfer_to_handler, ret_from_crit_exc)
+   EXC_XFER_TEMPLATE_CRITICAL(DebugException, 0x2002)
 
/* Programmable Interval Timer (PIT) Exception. (from 0x1000) */
 Decrementer:
@@ -755,10 +759,7 @@ FITException:
 WDTException:
CRITICAL_EXCEPTION_PROLOG;
addi    r3,r1,STACK_FRAME_OVERHEAD;
-   EXC_XFER_TEMPLATE(WatchdogException, 0x1020+2,
- (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)),
- NOCOPY, crit_transfer_to_handler,
- ret_from_crit_exc)
+   EXC_XFER_TEMPLATE_CRITICAL(WatchdogException, 0x1020+2)
 
 /*
  * The other Data TLB exceptions bail out to this point
-- 
2.13.3

[PATCH 11/11] powerpc/40x: Refactor exception entry macros by using head_32.h

2019-01-28 Thread Christophe Leroy

Refactor exception entry macros by using the ones defined in head_32.h

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_40x.S | 94 --
 1 file changed, 8 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index d52e460ea85e..e123bb3aff96 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -44,6 +44,14 @@
 #include 
 #include 
 
+.macro DO_KVM intno
+.endm
+
+/* clear MSR_WE (necessary?)*/
+#define CLR_MSR_WE(reg)     rlwinm  reg, reg, 0, ~MSR_WE
+
+#include "head_32.h"
+
 /* As with the other PowerPC ports, it is expected that when code
  * execution begins here, the following registers contain valid, yet
  * optional, information:
@@ -99,52 +107,6 @@ _ENTRY(saved_ksp_limit)
.space  4
 
 /*
- * Exception vector entry code. This code runs with address translation
- * turned off (i.e. using physical addresses). We assume SPRG_THREAD has
- * the physical address of the current task thread_struct.
- */
-#define EXCEPTION_PROLOG\
-   mtspr   SPRN_SPRG_SCRATCH0,r10; /* save two registers to work with */\
-   mtspr   SPRN_SPRG_SCRATCH1,r11;  \
-   mfcr    r10;    /* save CR in r10 for now  */\
-   EXCEPTION_PROLOG_1;  \
-   EXCEPTION_PROLOG_2
-
-#define EXCEPTION_PROLOG_1  \
-   mfspr   r11,SPRN_SRR1;  /* check whether user or kernel*/\
-   andi.   r11,r11,MSR_PR;  \
-   tophys(r11,r1);  \
-   beq 1f;  \
-   mfspr   r11,SPRN_SPRG_THREAD;   /* if from user, start at top of   */\
-   lwz r11,TASK_STACK-THREAD(r11); /* this thread's kernel stack */\
-   addi    r11,r11,THREAD_SIZE; \
-   tophys(r11,r11); \
-1: subi    r11,r11,INT_FRAME_SIZE; /* Allocate an exception frame */
-
-#define EXCEPTION_PROLOG_2  \
-   stw r10,_CCR(r11);  /* save various registers  */\
-   stw r12,GPR12(r11);  \
-   stw r9,GPR9(r11);\
-   mfspr   r10,SPRN_SPRG_SCRATCH0;  \
-   stw r10,GPR10(r11);  \
-   mfspr   r12,SPRN_SPRG_SCRATCH1;  \
-   stw r12,GPR11(r11);  \
-   mflr    r10; \
-   stw r10,_LINK(r11);  \
-   mfspr   r12,SPRN_SRR0;   \
-   stw r1,GPR1(r11);\
-   mfspr   r9,SPRN_SRR1;\
-   stw r1,0(r11);   \
-   tovirt(r1,r11); /* set new kernel sp */ \
-   rlwinm  r9,r9,0,14,12;  /* clear MSR_WE (necessary?)   */\
-   stw r0,GPR0(r11);\
-   lis r10, STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */\
-   addi    r10, r10, STACK_FRAME_REGS_MARKER@l; \
-   stw r10, 8(r11); \
-   SAVE_4GPRS(3, r11);  \
-   SAVE_2GPRS(7, r11)
-
-/*
  * Exception prolog for critical exceptions.  This is a little different
  * from the normal exception prolog above since a critical exception
  * can potentially occur at any point during normal exception processing.
@@ -205,32 +167,12 @@ _ENTRY(saved_ksp_limit)
 /*
  * Exception vectors.
  */
-#define START_EXCEPTION(n, label) \
-   . = n;   \
-label:
-
-#define EXCEPTION(n, label, hdlr, xfer)\
-   START_EXCEPTION(n, label);  \
-   EXCEPTION_PROLOG;   \
-   addi    r3,r1,STACK_FRAME_OVERHEAD; \
-   xfer(n, hdlr)
-
 #define CRITICAL_EXCEPTION(n, label, hdlr) \
START_EXCEPTION(n, label);  \
CRITICAL_EXCEPTION_PROLOG;  \
addi    r3,r1,STACK_FRAME_OVERHEAD; \
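
For readers unfamiliar with rlwinm: the removed prolog cleared MSR_WE with
"rlwinm r9,r9,0,14,12", while the new CLR_MSR_WE() macro passes the mask
~MSR_WE directly. With a rotate count of 0 the instruction is just an AND with
a mask built from the MB..ME bit range, and when MB > ME the range wraps
around. The C sketch below models that mask. ppc_mask() and rlwinm_sh0() are
illustrative names, and the value 0x00040000 for MSR_WE used in the note after
the code is an assumption, not something stated in this thread.

```c
#include <stdint.h>

/* Build the rlwinm mask for MB..ME using IBM bit numbering (bit 0 is the MSB). */
static uint32_t ppc_mask(unsigned int mb, unsigned int me)
{
	uint32_t from_mb = 0xffffffffu >> mb;         /* bits mb..31 */
	uint32_t upto_me = 0xffffffffu << (31 - me);  /* bits 0..me */

	/* mb <= me: one contiguous run; mb > me: the run wraps past bit 31. */
	return (mb <= me) ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* rlwinm with a rotate count of 0 is a plain AND with the mask. */
static uint32_t rlwinm_sh0(uint32_t value, unsigned int mb, unsigned int me)
{
	return value & ppc_mask(mb, me);
}
```

Under that assumption, ppc_mask(14, 12) keeps every bit except IBM bit 13,
which is the same mask as ~0x00040000, so both spellings clear the same bit.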

[PATCH v1] PCI/AER: use match_string() helper to simplify the code

2019-01-28 Thread Andy Shevchenko

match_string() returns the index of the array entry that matches a given
string, which can be used instead of the open-coded implementation.

Signed-off-by: Andy Shevchenko 
---
 drivers/pci/pcie/aer.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index fed29de783e0..f8fc2114ad39 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -117,7 +117,7 @@ bool pci_aer_available(void)
 
 static int ecrc_policy = ECRC_POLICY_DEFAULT;
 
-static const char *ecrc_policy_str[] = {
+static const char * const ecrc_policy_str[] = {
[ECRC_POLICY_DEFAULT] = "bios",
[ECRC_POLICY_OFF] = "off",
[ECRC_POLICY_ON] = "on"
@@ -203,11 +203,8 @@ void pcie_ecrc_get_policy(char *str)
 {
int i;
 
-   for (i = 0; i < ARRAY_SIZE(ecrc_policy_str); i++)
-   if (!strncmp(str, ecrc_policy_str[i],
-strlen(ecrc_policy_str[i])))
-   break;
-   if (i >= ARRAY_SIZE(ecrc_policy_str))
+   i = match_string(ecrc_policy_str, ARRAY_SIZE(ecrc_policy_str), str);
+   if (i < 0)
return;
 
ecrc_policy = i;
-- 
2.20.1
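
As an illustration of the lookup this patch switches to, here is a minimal
userspace sketch. match_string_sketch() and policy_names[] are stand-in names,
not the kernel helper itself. It assumes the helper compares whole strings
with strcmp() and returns a negative errno when nothing matches, which is my
understanding of match_string() but is not shown in the diff.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace sketch: index of the first exact match, or -EINVAL if none. */
static int match_string_sketch(const char * const *array, size_t n,
			       const char *string)
{
	size_t index;

	for (index = 0; index < n; index++) {
		if (!array[index])      /* a NULL entry ends the search early */
			break;
		if (!strcmp(array[index], string))
			return (int)index;
	}
	return -EINVAL;
}

/* Same order as the ecrc_policy_str[] table in the patch. */
static const char * const policy_names[] = { "bios", "off", "on" };
```

One thing worth noting if that assumption holds: the removed loop used
strncmp() with the length of each table entry, so it accepted any input that
merely started with a policy name, whereas an exact comparison would reject
such input.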

Re: [RFC PATCH 0/2] cxl: Add support for disabling CAPP when unloading CXL

2019-01-28 Thread Frederic Barrat


Hi Vaibhav,

I think there's value in there, as I'm hearing Mellanox would prefer to 
start from a "clean" state, as in pci mode, when they load their driver 
after a card FW update.


However, I'm pretty reluctant to make it the default behaviour. It's not 
like we haven't had our share of problems on reset before :-) 
Furthermore, for non-mlx5 case, we don't really have a reason for this.
I'm thinking we could activate it through a per-adapter property, on 
/sys. See comment in one of the patches.


  Fred


Le 25/01/2019 à 06:11, Vaibhav Jain a écrit :

Recent updates to OPAL have implemented necessary infrastructure [1] to disable
CAPP and switch PHB back to PCIE mode during fast reset. This small patch-set
uses the same OPAL infrastructure to force disable of CAPP when CXL module is
unloaded via rmmod.

References:
[1]: https://lists.ozlabs.org/pipermail/skiboot/2019-January/013063.html

Vaibhav Jain (2):
   powerpc/powernv: Add support for CXL mode switch that need PHB reset
   cxl: Force a CAPP reset when unloading CXL module

  arch/powerpc/platforms/powernv/pci-cxl.c | 71 +---
  drivers/misc/cxl/cxl.h   |  1 +
  drivers/misc/cxl/main.c  |  3 +
  drivers/misc/cxl/pci.c   | 25 -
  4 files changed, 91 insertions(+), 9 deletions(-)

Re: [RFC PATCH 1/2] powerpc/powernv: Add support for CXL mode switch that need PHB reset

2019-01-28 Thread Frederic Barrat





Le 25/01/2019 à 06:11, Vaibhav Jain a écrit :

Recent updates to OPAL [1] have provided support for new CXL modes on
PHB that need to force a cold reset on the bridge (CRESET). However
PHB CRESET is a multi step process and cannot be completed
synchronously as expected by current kernel implementation that issues
opal call opal_pci_set_phb_cxl_mode().

Hence this patch updates pnv_phb_to_cxl_mode() to implement a polling
loop that handles specific error codes (OPAL_BUSY) returned from
opal_pci_set_phb_cxl_mode() and drive the OPAL pci-state machine, if the
requested CXL mode needs a CRESET.

The patch also updates pnv_phb_to_cxl_mode() to convert and return
OPAL error codes into kernel error codes. This removes a previous
issue where callers to this function would have to include
'opal-api.h' to check for specific OPAL error codes.

References:
[1]: https://lists.ozlabs.org/pipermail/skiboot/2019-January/013063.html

Signed-off-by: Vaibhav Jain 
---
  arch/powerpc/platforms/powernv/pci-cxl.c | 71 +---
  1 file changed, 63 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 1b18111453d7..d33d662c6212 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "pci.h"
  
@@ -18,21 +19,75 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)

struct pci_controller *hose = pci_bus_to_host(dev->bus);
struct pnv_phb *phb = hose->private_data;
struct pnv_ioda_pe *pe;
+   unsigned long starttime, endtime;
int rc;
  
  	pe = pnv_ioda_get_pe(dev);

if (!pe)
-   return -ENODEV;
+   return -ENOENT;
+
+   pe_info(pe, "Switching PHB to CXL mode=%d\n", mode);
+
+   /*
+* Use a 15 second timeout for mode switch. Value arrived after
+* limited testing and may need more tweaking.
+*/
+   starttime = jiffies;
+   endtime = starttime + HZ * 15;
+
+   do {
+   rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode,
+  pe->pe_number);
+
+   /* Wait until mode transistion done */
+   if (rc != OPAL_BUSY && rc != OPAL_BUSY_EVENT)
+   break;
+
+   /* Check if we timedout */
+   if (time_after(jiffies, endtime)) {
+   rc = OPAL_TIMEOUT;
+   break;
+   }
  
-	pe_info(pe, "Switching PHB to CXL\n");

+   /* Opal Busy with mode switch. Run pci state-machine */
+   rc = opal_pci_poll(phb->opal_id);
+   if (rc >= 0) {
+   /* wait for some time */
+   if (rc > 0)
+   msleep(rc);
+   opal_poll_events(NULL);


Why is a call to opal_poll_events() needed?

  Fred



+   rc = OPAL_BUSY;
+   /* Continue with the mode switch */
+   }
+   } while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT);
+
+   pe_level_printk(pe, KERN_DEBUG, "CXL mode switch finished in %u-msecs.",
+   jiffies_to_msecs(jiffies - starttime));
  
-	rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);

-   if (rc == OPAL_UNSUPPORTED)
-   dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
-   else if (rc)
-   dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
+   /* Check OPAL errors and convert them to kernel error codes */
+   switch (rc) {
+   case OPAL_SUCCESS:
+   return 0;
  
-	return rc;

+   case OPAL_PARAMETER:
+   dev_err(&dev->dev, "CXL not supported on this PHB\n");
+   return -ENOENT;
+
+   case OPAL_UNSUPPORTED:
+   dev_err(&dev->dev,
+   "Required cxl mode not supported by firmware"
+   " - update skiboot\n");
+   return -ENODEV;
+
+   case OPAL_TIMEOUT:
+   dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode Timedout\n");
+   return -ETIME;
+
+   default:
+   dev_err(&dev->dev,
+   "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
+   return -EIO;
+   };
  }
  EXPORT_SYMBOL(pnv_phb_to_cxl_mode);
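
The control flow of the patch (retry while the firmware reports busy, give up
after a deadline, then translate the firmware status into a kernel errno) can
be tried out in userspace with the sketch below. Everything in it is
hypothetical: the fw_status values, the fake_fw "firmware", and a poll counter
standing in for the jiffies-based 15 second timeout. It only mirrors the shape
of the loop and is not the OPAL API.

```c
#include <errno.h>

/* Hypothetical status codes standing in for the OPAL return values. */
enum fw_status {
	FW_SUCCESS = 0,
	FW_BUSY = -2,
	FW_UNSUPPORTED = -7,
	FW_TIMEOUT = -10
};

/* Fake firmware: reports busy until 'busy_left' polls have been made. */
struct fake_fw {
	int busy_left;
	enum fw_status final;
};

static enum fw_status fake_set_mode(struct fake_fw *fw)
{
	return fw->busy_left > 0 ? FW_BUSY : fw->final;
}

static void fake_poll(struct fake_fw *fw)
{
	if (fw->busy_left > 0)
		fw->busy_left--;
}

/* Retry while busy, give up after 'max_polls', then map to an errno. */
static int switch_mode(struct fake_fw *fw, int max_polls)
{
	enum fw_status rc;
	int polls = 0;

	for (;;) {
		rc = fake_set_mode(fw);
		if (rc != FW_BUSY)
			break;
		if (polls++ >= max_polls) {
			rc = FW_TIMEOUT;
			break;
		}
		fake_poll(fw);  /* drive the state machine forward */
	}

	switch (rc) {
	case FW_SUCCESS:
		return 0;
	case FW_UNSUPPORTED:
		return -ENODEV;
	case FW_TIMEOUT:
		return -ETIME;
	default:
		return -EIO;
	}
}
```

Keeping the status translation in one place, as the patch does, means callers
only ever see errno values and do not need the firmware headers.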

Re: [RFC PATCH 2/2] cxl: Force a CAPP reset when unloading CXL module

2019-01-28 Thread Frederic Barrat





Le 25/01/2019 à 06:11, Vaibhav Jain a écrit :

This patch forces shutdown of CAPP when CXL module is unloaded. This
is accomplished via a call to pnv_phb_to_cxl_mode() with mode ==
OPAL_PHB_CAPI_MODE_PCIE.

Signed-off-by: Vaibhav Jain 
---
  drivers/misc/cxl/cxl.h  |  1 +
  drivers/misc/cxl/main.c |  3 +++
  drivers/misc/cxl/pci.c  | 25 -
  3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index d1d927ccb589..e545c2b81faf 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -1136,4 +1136,5 @@ void cxl_context_mm_count_get(struct cxl_context *ctx);
  /* Decrements the reference count to "struct mm_struct" */
  void cxl_context_mm_count_put(struct cxl_context *ctx);
  
+void cxl_pci_shutdown_capp(void);

  #endif
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index f35406be465a..f14ff0dcf231 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -372,6 +372,9 @@ static void exit_cxl(void)
if (cxl_is_power8())
unregister_cxl_calls(&cxl_calls);
idr_destroy(&cxl_adapter_idr);
+
+   if (cpu_has_feature(CPU_FTR_HVMODE))
+   cxl_pci_shutdown_capp();
  }
  
  module_init(init_cxl);

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index c79ba1c699ad..01be2e2d1069 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -25,7 +25,7 @@
  
  #include "cxl.h"

  #include 
-
+#include 
  
  #define CXL_PCI_VSEC_ID	0x1280

  #define CXL_VSEC_MIN_SIZE 0x80
@@ -2065,6 +2065,29 @@ static void cxl_pci_resume(struct pci_dev *pdev)
}
  }
  
+void cxl_pci_shutdown_capp(void)

+{
+   struct pci_dev *pdev;
+   struct pci_bus *root_bus;
+   int rc;
+
+   /* Iterate over all CAPP supported PHB's and force them to PCI mode */
+   list_for_each_entry(root_bus, &pci_root_buses, node) {
+   for_each_pci_bridge(pdev, root_bus) {
+
+   if (!cxllib_slot_is_supported(pdev, 0))
+   continue;
+
+   rc = pnv_phb_to_cxl_mode(pdev,
+OPAL_PHB_CAPI_MODE_PCIE);
+   if (rc)
+   dev_err(&pdev->dev,
+   "cxl: Error resetting CAPP. Err=%d\n",
+   rc);
+   }



That's the part I don't like. We're iterating over quite a few PCI 
devices, we basically don't know the ones we need to reset.
If we have a per-adapter property on /sys to activate the 
reset-on-unload, then we could move the call to 
pnv_phb_to_cxl_mode(OPAL_PHB_CAPI_MODE_PCIE) on the cxl_remove() 
callback, and only do it for the adapters we've been asked.


  Fred




+   }
+}
+
  static const struct pci_error_handlers cxl_err_handler = {
.error_detected = cxl_pci_error_detected,
.slot_reset = cxl_pci_slot_reset,

Re: [PATCH 00/11] Refactor exception entry on 40x/6xx/8xx

2019-01-28 Thread Christoph Hellwig

On Mon, Jan 28, 2019 at 11:11:10AM +, Christophe Leroy wrote:
> This series refactors exception entry macros for 40x, 6xx and 8xx
> 
> This series will benefit the implementation of CONFIG_VMAP, and also
> Ben's series on MSR_EE.

We don't have a CONFIG_VMAP.  Do you mean CONFIG_VMAP_STACK ?

Re: [PATCH] ucc_geth: Reset BQL queue when stopping device

2019-01-28 Thread Mathias Thore

Hi,


This is what we observed: there was a storm on the medium, so our controller
could not transmit, which resulted in a TX timeout. When the timeout occurs, the
driver clears all descriptors from the TX queue. The function called in this
patch reflects that clearing in the BQL layer as well. Without it, the
controller would get stuck, unable to perform TX, even several minutes after
the storm had ended. Bringing the device down and then up again would solve the
problem, but this patch solves it automatically.


Some other drivers do the same. For example, the e1000e driver calls
netdev_reset_queue in its e1000_clean_tx_ring function. It is possible that
other drivers should do the same; I have no way of verifying this.
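
To make the failure mode concrete, here is a toy model of byte-based queue
accounting. It is deliberately much simpler than the kernel's BQL code, and
every name in it (toy_bql, toy_sent, toy_completed, toy_reset, toy_can_send)
is invented for illustration. The point is only that bytes counted as queued
but never completed keep the queue closed until the accounting is reset.

```c
#include <stdbool.h>

/* Toy byte-queue accounting; NOT the kernel's dynamic queue limit code. */
struct toy_bql {
	unsigned int queued;     /* bytes handed to the hardware */
	unsigned int completed;  /* bytes the hardware reported as sent */
	unsigned int limit;      /* max bytes allowed in flight */
};

static bool toy_can_send(const struct toy_bql *q)
{
	return q->queued - q->completed < q->limit;
}

static void toy_sent(struct toy_bql *q, unsigned int bytes)
{
	q->queued += bytes;
}

static void toy_completed(struct toy_bql *q, unsigned int bytes)
{
	q->completed += bytes;
}

/* What a reset does in this model: forget all in-flight accounting. */
static void toy_reset(struct toy_bql *q)
{
	q->queued = 0;
	q->completed = 0;
}
```

If the driver throws away its descriptors after a timeout, toy_completed() is
never called for them, so toy_can_send() stays false until toy_reset() runs.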


Regards,

Mathias

--


From: Christophe Leroy 
Sent: Monday, January 28, 2019 10:48 AM
To: Mathias Thore; leoyang...@nxp.com; net...@vger.kernel.org; 
linuxppc-dev@lists.ozlabs.org; David Gounaris; Joakim Tjernlund
Subject: Re: [PATCH] ucc_geth: Reset BQL queue when stopping device
  


Hi,

Le 28/01/2019 à 10:07, Mathias Thore a écrit :
> After a timeout event caused by for example a broadcast storm, when
> the MAC and PHY are reset, the BQL TX queue needs to be reset as
> well. Otherwise, the device will exhibit severe performance issues
> even after the storm has ended.

What are the symptoms?

Is this reset needed in any network driver in that case, or is it
something specific to ucc_geth?
For instance, the freescale fs_enet doesn't have that reset. Should it
have it too?

Christophe

>
> Co-authored-by: David Gounaris 
> Signed-off-by: Mathias Thore 
> ---
>   drivers/net/ethernet/freescale/ucc_geth.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
> index c3d539e209ed..eb3e65e8868f 100644
> --- a/drivers/net/ethernet/freescale/ucc_geth.c
> +++ b/drivers/net/ethernet/freescale/ucc_geth.c
> @@ -1879,6 +1879,8 @@ static void ucc_geth_free_tx(struct ucc_geth_private *ugeth)
>   u16 i, j;
>   u8 __iomem *bd;
>
> + netdev_reset_queue(ugeth->ndev);
> +
>   ug_info = ugeth->ug_info;
>   uf_info = &ug_info->uf_info;
>
>

Re: [PATCH 00/11] Refactor exception entry on 40x/6xx/8xx

2019-01-28 Thread Christophe Leroy





Le 28/01/2019 à 15:15, Christoph Hellwig a écrit :

On Mon, Jan 28, 2019 at 11:11:10AM +, Christophe Leroy wrote:

This series refactors exception entry macros for 40x, 6xx and 8xx

This series will benefit the implementation of CONFIG_VMAP, and also
Ben's series on MSR_EE.


We don't have a CONFIG_VMAP.  Do you mean CONFIG_VMAP_STACK ?



Yes that's what I mean, sorry.

Re: [PATCH v4] kbuild: Add support for DT binding schema checks

2019-01-28 Thread Rob Herring

On Mon, Jan 28, 2019 at 3:43 AM Geert Uytterhoeven  wrote:
>
> Hi Rob,
>
> On Tue, Dec 11, 2018 at 9:24 PM Rob Herring  wrote:
> > This adds the build infrastructure for checking DT binding schema
> > documents and validating dts files using the binding schema.
> >
> > Check DT binding schema documents:
> > make dt_binding_check
> >
> > Build dts files and check using DT binding schema:
> > make dtbs_check
> >
> > Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to
> > use for validation. This makes it easier to find and fix errors
> > generated by a specific schema.
> >
> > Currently, the validation targets are separate from a normal build to
> > avoid a hard dependency on the external DT schema project and because
> > there are lots of warnings generated.
> >
> > Cc: Jonathan Corbet 
> > Cc: Mark Rutland 
> > Cc: Masahiro Yamada 
> > Cc: Michal Marek 
> > Cc: linux-...@vger.kernel.org
> > Cc: devicet...@vger.kernel.org
> > Cc: linux-kbu...@vger.kernel.org
> > Signed-off-by: Rob Herring 
>
> BTW, what are the CONFIG dependencies for this to work?
> E.g. defconfig on x86_64 fails, even after enabling CONFIG_OF:

I generally use allmodconfig which enables building all DTs.

Yes, there's a dependency on CONFIG_DTC which isn't always enabled
with CONFIG_OF. Maybe it should be. The only other solutions I've
thought of are either always build dtc or make the targets conditional
on CONFIG_DTC. The latter would only change the error message.

Rob

Re: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration

2019-01-28 Thread Michael Bringmann

On 10/29/18 1:43 PM, Nathan Fontenot wrote:
> On pseries systems, performing a partition migration can result in
> altering the nodes a CPU is assigned to on the destination system. For
> example, pre-migration on the source system CPUs are in nodes 1 and 3,
> post-migration on the destination system CPUs are in nodes 2 and 3.
> 
> Handling the node change for a CPU can cause corruption in the slab
> cache if we hit a timing where a CPU's node is changed while cache_reap()
> is invoked. The corruption occurs because the slab cache code appears
> to rely on the CPU and slab cache pages being on the same node.
> 
> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
> does not prevent us from hitting this scenario.
> 
> Changing the device tree property update notification handler that
> recognizes an affinity change for a CPU to do a full DLPAR remove and
> add of the CPU instead of dynamically changing its node resolves this
> issue.
> 
> Signed-off-by: Nathan Fontenot 

> ---
>  arch/powerpc/include/asm/topology.h  |2 ++
>  arch/powerpc/mm/numa.c   |9 +
>  arch/powerpc/platforms/pseries/hotplug-cpu.c |   19 +++
>  3 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index a4a718dbfec6..f85e2b01c3df 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -132,6 +132,8 @@ static inline void shared_proc_topology_init(void) {}
>  #define topology_sibling_cpumask(cpu)(per_cpu(cpu_sibling_map, cpu))
>  #define topology_core_cpumask(cpu)   (per_cpu(cpu_core_map, cpu))
>  #define topology_core_id(cpu)(cpu_to_core_id(cpu))
> +
> +int dlpar_cpu_readd(int cpu);
>  #endif
>  #endif
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 693ae1c1acba..bb6a7b56bef7 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1461,13 +1461,6 @@ static void reset_topology_timer(void)
> 
>  #ifdef CONFIG_SMP
> 
> -static void stage_topology_update(int core_id)
> -{
> - cpumask_or(&cpu_associativity_changes_mask,
> - &cpu_associativity_changes_mask, cpu_sibling_mask(core_id));
> - reset_topology_timer();
> -}
> -
>  static int dt_update_callback(struct notifier_block *nb,
>   unsigned long action, void *data)
>  {
> @@ -1480,7 +1473,7 @@ static int dt_update_callback(struct notifier_block *nb,
>   !of_prop_cmp(update->prop->name, "ibm,associativity")) {
>   u32 core_id;
>   of_property_read_u32(update->dn, "reg", &core_id);
> - stage_topology_update(core_id);
> + rc = dlpar_cpu_readd(core_id);
>   rc = NOTIFY_OK;
>   }
>   break;
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 2f8e62163602..97feb6e79f1a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -802,6 +802,25 @@ static int dlpar_cpu_add_by_count(u32 cpus_to_add)
>   return rc;
>  }
> 
> +int dlpar_cpu_readd(int cpu)
> +{
> + struct device_node *dn;
> + struct device *dev;
> + u32 drc_index;
> + int rc;
> +
> + dev = get_cpu_device(cpu);
> + dn = dev->of_node;
> +
> + rc = of_property_read_u32(dn, "ibm,my-drc-index", &drc_index);
> +
> + rc = dlpar_cpu_remove_by_index(drc_index);
> + if (!rc)
> + rc = dlpar_cpu_add(drc_index);
> +
> + return rc;
> +}
> +
>  int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
>  {
>   u32 count, drc_index;
> 
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com
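
For illustration only, the remove-then-add sequence from the quoted patch can
be modelled in userspace as below, with the lookups checked before the CPU is
touched. The quoted dlpar_cpu_readd() appears to use drc_index without testing
the result of of_property_read_u32(), so this sketch shows one defensive
shape. All names here (toy_cpu, toy_remove, toy_add, toy_cpu_readd) are
hypothetical stand-ins, not the pseries DLPAR functions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU device and its DRC index property. */
struct toy_cpu {
	int has_drc_index;       /* 0 models a missing "ibm,my-drc-index" */
	unsigned int drc_index;
	int present;
};

static int toy_remove(struct toy_cpu *cpu)
{
	cpu->present = 0;
	return 0;
}

static int toy_add(struct toy_cpu *cpu)
{
	cpu->present = 1;
	return 0;
}

/* Remove then re-add, bailing out before touching the CPU on lookup errors. */
static int toy_cpu_readd(struct toy_cpu *cpu)
{
	int rc;

	if (!cpu)
		return -ENODEV;
	if (!cpu->has_drc_index)
		return -EINVAL;

	rc = toy_remove(cpu);
	if (!rc)
		rc = toy_add(cpu);
	return rc;
}
```

The add step only runs when the remove succeeded, matching the order used in
the quoted patch.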

Re: [PATCH v4] kbuild: Add support for DT binding schema checks

2019-01-28 Thread Geert Uytterhoeven

Hi Rob,

On Mon, Jan 28, 2019 at 4:35 PM Rob Herring  wrote:
> On Mon, Jan 28, 2019 at 3:43 AM Geert Uytterhoeven  wrote:
> > On Tue, Dec 11, 2018 at 9:24 PM Rob Herring  wrote:
> > > This adds the build infrastructure for checking DT binding schema
> > > documents and validating dts files using the binding schema.
> > >
> > > Check DT binding schema documents:
> > > make dt_binding_check
> > >
> > > Build dts files and check using DT binding schema:
> > > make dtbs_check
> > >
> > > Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to
> > > use for validation. This makes it easier to find and fix errors
> > > generated by a specific schema.
> > >
> > > Currently, the validation targets are separate from a normal build to
> > > avoid a hard dependency on the external DT schema project and because
> > > there are lots of warnings generated.

> > BTW, what are the CONFIG dependencies for this to work?
> > E.g. defconfig on x86_64 fails, even after enabling CONFIG_OF:
>
> I generally use allmodconfig which enables building all DTs.
>
> Yes, there's a dependency on CONFIG_DTC which isn't always enabled
> with CONFIG_OF. Maybe it should be. The only other solutions I've

Oh, didn't think of CONFIG_DTC.

> thought of are either always build dtc or make the targets conditional
> on CONFIG_DTC. The latter would only change the error message.

Making the target conditional may make it more obvious to the user
what's going on. Cfr. "make modules_install" giving a nice explanation
when CONFIG_MODULES=n.

Thanks.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH AUTOSEL 4.20 027/304] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 7625546caefd..17958043e7f7 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -270,6 +270,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1
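
The refcount imbalance described in this commit message can be illustrated with a small userspace sketch. Everything below is hypothetical: `toy_node` stands in for `struct device_node`, and the `toy_*` helpers only model the get/put pairing, not the real OF code or its locking.

```c
#include <assert.h>

/* Hypothetical stand-in for struct device_node: only the refcount matters. */
struct toy_node {
    int refcount;
};

static void toy_node_get(struct toy_node *n)
{
    n->refcount++;
}

static void toy_node_put(struct toy_node *n)
{
    assert(n->refcount > 0); /* a put without a matching get would underflow */
    n->refcount--;
}

/* Models the attach path: the sysfs attach step takes its own reference. */
static void toy_attach(struct toy_node *n)
{
    toy_node_get(n);
}

/* Models the detach path; 'balanced' selects whether the extra put is done. */
static void toy_detach(struct toy_node *n, int balanced)
{
    if (balanced)
        toy_node_put(n);
}

/* Returns the refcount left after one attach/detach cycle. */
static int toy_cycle(int balanced)
{
    struct toy_node n = { .refcount = 1 }; /* reference held by the creator */

    toy_attach(&n);
    toy_detach(&n, balanced);
    return n.refcount;
}
```

With the put in place the count returns to its starting value after a cycle; without it, each attach/detach cycle would leave one reference behind.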

Re: [REPOST PATCH v08 0/5] powerpc/hotplug: Update affinity for migrated CPUs

2019-01-28 Thread Michael Bringmann

On 8/21/18 10:33 AM, m...@linux.vnet.ibm.com wrote:
> The migration of LPARs across Power systems affects many attributes
> including that of the associativity of CPUs.  The patches in this
> set execute when a system is coming up fresh upon a migration target.
> They are intended to,
> 
> * Recognize changes to the associativity of CPUs recorded in internal
>   data structures when compared to the latest copies in the device tree.
> * Generate calls to other code layers to reset the data structures
>   related to associativity of the CPUs.
> * Re-register the 'changed' entities into the target system.
>   Re-registration of CPUs mostly entails acting as if they have been
>   newly hot-added into the target system.
> 
> Signed-off-by: Michael Bringmann 

Retract this series in preference to
 [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration

Michael

> 
> Michael Bringmann (5):
>   hotplug/cpu: Conditionally acquire/release DRC index
>   hotplug/cpu: Add operation queuing function
>   hotplug/cpu: Provide CPU readd operation
>   mobility/numa: Ensure numa update does not overlap
>   hotplug/pmt: Update topology after PMT
> ---
> Changes in patch:
>   -- Restructure and rearrange content of patches to co-locate
>  similar or related modifications
>   -- Rename pseries_update_drconf_cpu to pseries_update_processor
>   -- Simplify code to update CPU nodes during mobility checks.
>  Remove functions to generate extra HP_ELOG messages in favor
>  of direct function calls to dlpar_cpu_readd_by_index.
>   -- Revise code order in dlpar_cpu_readd_by_index() to present
>  more appropriate error codes from underlying layers of the
>  implementation.
>   -- Add hotplug device lock around all property updates
>   -- Add call to rebuild_sched_domains in case of changes
>   -- Various code cleanups and compaction
>   -- Rebase to 4.18 kernel
>   -- Change operation to run CPU readd after end of migration store.
>   -- Improve descriptive text
>   -- Cleanup patch reference to outdated function
>   -- Code cleanup a 'acquire_drc' check in dlpar_cpu_add.
>   -- Code cleanup a 'release_drc' check in dlpar_cpu_remove.
>   -- Add more information to patch descriptions.
>   -- More code cleanup
>   -- Rearrange call to rebuild_sched_domains to allow removal
>  of some locking code.
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com

[PATCH AUTOSEL 4.20 078/304] powerpc/32: Add .data..Lubsan_data/.data..Lubsan_type sections explicitly

2019-01-28 Thread Sasha Levin

From: Mathieu Malaterre 

[ Upstream commit beba24ac59133cb36ecd03f9af9ccb11971ee20e ]

When both `CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y` and `CONFIG_UBSAN=y`
are set, the link step typically produces numerous warnings about orphan
sections:

  + powerpc-linux-gnu-ld -EB -m elf32ppc -Bstatic --orphan-handling=warn
    --build-id --gc-sections -X -o .tmp_vmlinux1 -T
    ./arch/powerpc/kernel/vmlinux.lds --whole-archive built-in.a
    --no-whole-archive --start-group lib/lib.a --end-group
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data393' from
    `init/main.o' being placed in section `.data..Lubsan_data393'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data394' from
    `init/main.o' being placed in section `.data..Lubsan_data394'.
  ...
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type11' from
    `init/main.o' being placed in section `.data..Lubsan_type11'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type12' from
    `init/main.o' being placed in section `.data..Lubsan_type12'.
  ...

This commit removes those warnings produced at W=1.

Link: https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg135407.html
Suggested-by: Nicholas Piggin 
Signed-off-by: Mathieu Malaterre 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 434581bcd5b4..1148c3c60c3b 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -308,6 +308,10 @@ SECTIONS
 #ifdef CONFIG_PPC32
.data : AT(ADDR(.data) - LOAD_OFFSET) {
DATA_DATA
+#ifdef CONFIG_UBSAN
+   *(.data..Lubsan_data*)
+   *(.data..Lubsan_type*)
+#endif
*(.data.rel*)
*(SDATA_MAIN)
*(.sdata2)
-- 
2.19.1

Re: [PATCH v4] kbuild: Add support for DT binding schema checks

2019-01-28 Thread Rob Herring

On Mon, Jan 28, 2019 at 2:59 AM Geert Uytterhoeven  wrote:
>
> Hi Rob,
>
> On Sun, Jan 27, 2019 at 4:00 AM Rob Herring  wrote:
> > On Wed, Jan 23, 2019 at 9:33 AM Geert Uytterhoeven  
> > wrote:
> > > On Tue, Dec 11, 2018 at 9:24 PM Rob Herring  wrote:
> > > > This adds the build infrastructure for checking DT binding schema
> > > > documents and validating dts files using the binding schema.
> > > >
> > > > Check DT binding schema documents:
> > > > make dt_binding_check
> > > >
> > > > Build dts files and check using DT binding schema:
> > > > make dtbs_check
> > > >
> > > > Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to
> > > > use for validation. This makes it easier to find and fix errors
> > > > generated by a specific schema.
> > > >
> > > > Currently, the validation targets are separate from a normal build to
> > > > avoid a hard dependency on the external DT schema project and because
> > > > there are lots of warnings generated.
> > >
> > > Thanks, I'm giving this a try, and get errors like:
> > >
> > >   DTC arch/arm/boot/dts/emev2-kzm9d.dt.yaml
> > > FATAL ERROR: No markers present in property 'cpu0' value
> > >
> > > and
> > >
> > >   DTC arch/arm64/boot/dts/renesas/r8a7795-salvator-x.dt.yaml
> > > FATAL ERROR: No markers present in property 'audio_clk_a' value
> > >
> > > Do you have a clue?
> >
> > That's really strange because those aren't even properties. Are other
> > dts files okay? This is the in tree dtc?
> >
> > The only time you should be missing markers is if you did a dts -> dts
> > -> dt.yaml.
>
> Found it: make dtbs_check doesn't play well with my local change to
> add symbols for DT overlays:

Now that makes sense.

> --- a/scripts/Makefile.lib
> +++ b/scripts/Makefile.lib
> @@ -285,6 +285,10 @@ cmd_dt_S_dtb=
>  \
>  $(obj)/%.dtb.S: $(obj)/%.dtb FORCE
> $(call if_changed,dt_S_dtb)
>
> +ifeq ($(CONFIG_OF_OVERLAY),y)
> +DTC_FLAGS += -@
> +endif
> +
>  quiet_cmd_dtc = DTC $@
>  cmd_dtc = mkdir -p $(dir ${dtc-tmp}) ; \
> $(HOSTCC) -E $(dtc_cpp_flags) -x assembler-with-cpp -o
> $(dtc-tmp) $< ; \
>
> Do you see a way to handle that better?

We need to have the code that generates these properties to also add
markers. Or we could drop the __symbols__ nodes on YAML output. Or
ignore the option when doing YAML output.

> Apart from a few expected issues, I'm seeing one other strange message:
>
> arch/arm/boot/dts/sh73a0-kzm9g.dt.yaml: interrupts: [[2, 4], [3,
> 4]] is too long
>
> This is the interrupts property in the adi,adxl345 node in
> arch/arm/boot/dts/sh73a0-kzm9g.dts.
> Apparently the check complains if more than one interrupt is listed here.
> Is this a known issue?

There are lots of warnings... I've gone thru and checked some to make
sure they are valid, but certainly not all. There's probably some
cases that are too strict too.

This one is because this device is listed in trivial-devices.yaml and
you can't have 2 interrupts for a trivial device (because you need to
define the interrupt order). Looks like we have a binding doc for it
too, so we should just remove it from trivial-devices.yaml.

There are lots of '... is too (long|short)' messages because of how
the dts file property values are bracketed. This used to not matter,
but is significant in the YAML output. I have a dtc patch to give
warnings on all these (and dtc will give source location).

Rob

[PATCH AUTOSEL 4.20 181/304] KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines

2019-01-28 Thread Sasha Levin

From: Suraj Jitindar Singh 

[ Upstream commit 693ac10a88a2219bde553b2e8460dbec97e594e6 ]

The kvm capability KVM_CAP_SPAPR_TCE_VFIO is used to indicate the
availability of in kernel tce acceleration for vfio. However it is
currently the case that this is only available on a powernv machine,
not for a pseries machine.

Thus make this capability dependent on having the cpu feature
CPU_FTR_HVMODE.

[pau...@ozlabs.org - fixed compilation for Book E.]

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/powerpc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2869a299c4ed..75e2e471442f 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -543,8 +543,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #ifdef CONFIG_PPC_BOOK3S_64
case KVM_CAP_SPAPR_TCE:
case KVM_CAP_SPAPR_TCE_64:
-   /* fallthrough */
+   r = 1;
+   break;
case KVM_CAP_SPAPR_TCE_VFIO:
+   r = !!cpu_has_feature(CPU_FTR_HVMODE);
+   break;
case KVM_CAP_PPC_RTAS:
case KVM_CAP_PPC_FIXUP_HCALL:
case KVM_CAP_PPC_ENABLE_HCALL:
-- 
2.19.1
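
The effect of splitting the case labels in this patch can be sketched in plain C. The enum, the function name and the `hv_mode` flag below are hypothetical stand-ins for the KVM capability constants and `cpu_has_feature(CPU_FTR_HVMODE)`; the value returned for the RTAS-style case is an assumption made only to keep the sketch complete.

```c
#include <stdbool.h>

/* Hypothetical capability identifiers, not the real KVM_CAP_* values. */
enum toy_cap {
    TOY_CAP_TCE,
    TOY_CAP_TCE_64,
    TOY_CAP_TCE_VFIO,
    TOY_CAP_RTAS,
    TOY_CAP_UNKNOWN
};

/* Returns 1 when the capability is reported as available, 0 otherwise. */
static int toy_check_extension(enum toy_cap ext, bool hv_mode)
{
    int r;

    switch (ext) {
    case TOY_CAP_TCE:
    case TOY_CAP_TCE_64:
        r = 1;                 /* no longer falls through to the VFIO case */
        break;
    case TOY_CAP_TCE_VFIO:
        r = hv_mode ? 1 : 0;   /* only reported when running in HV mode */
        break;
    case TOY_CAP_RTAS:
        r = 1;                 /* assumed result for the following group */
        break;
    default:
        r = 0;
        break;
    }
    return r;
}
```

The point of the sketch is that the VFIO capability gets its own result instead of sharing the unconditional one with the other TCE capabilities.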

[PATCH AUTOSEL 4.20 222/304] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

    if (_IOC_DIR(ioc) != _IOC_NONE) {
        int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

        if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
nothing is done with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 15bea9a0f260..ebc0b916dcf9 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -63,7 +63,7 @@ static inline int __access_ok(unsigned long addr, unsigned long size,
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1
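
The `(void)(type)` idiom used by this patch can be tried outside the kernel. The sketch below is a userspace approximation under stated assumptions: `toy_range_ok()`, `TOY_LIMIT` and `toy_access_ok()` are hypothetical names that only mimic the shape of the macro, not the real address-limit check.

```c
/* Hypothetical address limit standing in for the real user/kernel boundary. */
#define TOY_LIMIT 0x1000UL

/* Hypothetical range check standing in for __access_ok(). */
static int toy_range_ok(unsigned long addr, unsigned long size)
{
    return size <= TOY_LIMIT && addr <= TOY_LIMIT - size;
}

/*
 * (void)(type) evaluates 'type' and discards the result, so a variable
 * passed in that position counts as used and no unused-variable warning
 * is emitted, even though the check itself ignores it.
 */
#define toy_access_ok(type, addr, size) \
    ((void)(type), toy_range_ok((unsigned long)(addr), (size)))

/* Mirrors the caller quoted in the commit message. */
static int toy_ioctl_check(int is_read, unsigned long arg, unsigned long len)
{
    int verify = is_read ? 1 : 0; /* only ever passed as 'type' */

    return toy_access_ok(verify, arg, len);
}
```

Because the comma operator yields its right-hand operand, the value of the macro is still the result of the range check.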

[PATCH AUTOSEL 4.20 233/304] powerpc/perf: Fix thresholding counter data for unknown type

2019-01-28 Thread Sasha Levin

From: Madhavan Srinivasan 

[ Upstream commit 17cfccc91545682513541924245abb876d296063 ]

MMCRA[34:36] and MMCRA[38:44] expose the thresholding counter value.
The thresholding counter can be used to count latency cycles such as
load miss to reload. But the threshold counter value is not relevant
when the sampled instruction type is unknown or reserved. This patch
sets the thresholding counter value to zero when the sampled instruction
type is unknown or reserved.

Fixes: 170a315f41c6('powerpc/perf: Support to export MMCRA[TEC*] field to userspace')
Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/isa207-common.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 177de814286f..6a2f65d3d088 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -226,8 +226,13 @@ void isa207_get_mem_weight(u64 *weight)
u64 mmcra = mfspr(SPRN_MMCRA);
u64 exp = MMCRA_THR_CTR_EXP(mmcra);
u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
+   u64 sier = mfspr(SPRN_SIER);
+   u64 val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
 
-   *weight = mantissa << (2 * exp);
+   if (val == 0 || val == 7)
+   *weight = 0;
+   else
+   *weight = mantissa << (2 * exp);
 }
 
 int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
-- 
2.19.1
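
The weight computation in this patch is easy to check by hand: `mantissa << (2 * exp)` is the mantissa multiplied by 4 raised to the exponent. A minimal sketch, assuming the mantissa, exponent and instruction type have already been extracted from the registers (the function name and the plain integer parameters are hypothetical):

```c
#include <stdint.h>

/*
 * Returns the latency weight for a sample. 'type' is the sampled
 * instruction type; per the patch, values 0 and 7 are treated as
 * unknown or reserved and yield a weight of zero.
 */
static uint64_t toy_mem_weight(uint64_t mantissa, uint64_t exp, unsigned int type)
{
    if (type == 0 || type == 7)
        return 0;

    return mantissa << (2 * exp); /* mantissa * 4^exp */
}
```

For example, a mantissa of 5 with an exponent of 2 gives 5 * 16 = 80 for a known instruction type, and 0 for type 0 or 7.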

[PATCH AUTOSEL 4.20 235/304] powerpc/powernv/ioda: Allocate indirect TCE levels of cached userspace addresses on demand

2019-01-28 Thread Sasha Levin

From: Alexey Kardashevskiy 

[ Upstream commit bdbf649efe21173cae63b4b71db84176420f9039 ]

The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
table and a table with userspace addresses; the latter is used for
marking pages dirty when corresponding TCEs are unmapped from
the hardware table.

a68bd1267b72 ("powerpc/powernv/ioda: Allocate indirect TCE levels
on demand") enabled on-demand allocation of the hardware table,
however it missed the other table, so that one is still fully allocated
at boot time. This fixes the issue by allocating a single level,
just like we do for the hardware table.

Fixes: a68bd1267b72 ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand")
Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: David Gibson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/pci-ioda-tce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
index fe9691040f54..7639b2168755 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
@@ -299,7 +299,7 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
if (alloc_userspace_copy) {
offset = 0;
uas = pnv_pci_ioda2_table_do_alloc_pages(nid, level_shift,
-   levels, tce_table_size, &offset,
+   tmplevels, tce_table_size, &offset,
&total_allocated_uas);
if (!uas)
goto free_tces_exit;
-- 
2.19.1

[PATCH AUTOSEL 4.20 253/304] powerpc/mm: Fix reporting of kernel execute faults on the 8xx

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit ffca395b11c4a5a6df6d6345f794b0e3d578e2d0 ]

On the 8xx, no-execute is set via PPP bits in the PTE. Therefore
a no-exec fault generates DSISR_PROTFAULT error bits,
not DSISR_NOEXEC_OR_G.

This patch adds DSISR_PROTFAULT in the test mask.

Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 1697e903bbf2..50e5c790d11e 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -226,7 +226,9 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr,
 static bool bad_kernel_fault(bool is_exec, unsigned long error_code,
 unsigned long address)
 {
-   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT))) {
+   /* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
+   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
+ DSISR_PROTFAULT))) {
printk_ratelimited(KERN_CRIT "kernel tried to execute"
   " exec-protected page (%lx) -"
   "exploit attempt? (uid: %d)\n",
-- 
2.19.1

[PATCH AUTOSEL 4.20 258/304] powerpc/fadump: Do not allow hot-remove memory from fadump reserved area.

2019-01-28 Thread Sasha Levin

From: Mahesh Salgaonkar 

[ Upstream commit 0db6896ff6332ba694f1e61b93ae3b2640317633 ]

For fadump to work successfully there must not be any holes in the reserved
memory ranges where the kernel has asked firmware to move the content of old
kernel memory in the event of a crash. Now that fadump uses CMA for the
reserved area, this memory is no longer protected from hot-remove operations
unless it is CMA allocated. Hence, the fadump service can fail to re-register
after a hot-remove operation if the hot-removed memory belongs to the fadump
reserved region. To avoid this, make sure that memory from the fadump reserved
area is not hot-removable if fadump is registered.

However, if a user still wants to remove that memory, they can do so by
manually stopping the fadump service before the hot-remove operation.

Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/fadump.h   |  2 +-
 arch/powerpc/kernel/fadump.c| 10 --
 arch/powerpc/platforms/pseries/hotplug-memory.c |  7 +--
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index 1e7a33592e29..15bc07a31c46 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -200,7 +200,7 @@ struct fad_crash_memory_ranges {
unsigned long long  size;
 };
 
-extern int is_fadump_boot_memory_area(u64 addr, ulong size);
+extern int is_fadump_memory_area(u64 addr, ulong size);
 extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
 extern int fadump_reserve_mem(void);
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 761b28b1427d..7fd9b3e1fa39 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -118,13 +118,19 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 
 /*
  * If fadump is registered, check if the memory provided
- * falls within boot memory area.
+ * falls within boot memory area and reserved memory area.
  */
-int is_fadump_boot_memory_area(u64 addr, ulong size)
+int is_fadump_memory_area(u64 addr, ulong size)
 {
+   u64 d_start = fw_dump.reserve_dump_area_start;
+   u64 d_end = d_start + fw_dump.reserve_dump_area_size;
+
if (!fw_dump.dump_registered)
return 0;
 
+   if (((addr + size) > d_start) && (addr <= d_end))
+   return 1;
+
return (addr + size) > RMA_START && addr <= fw_dump.boot_memory_size;
 }
 
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2a983b5a52e1..2318ab29d5dd 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -355,8 +355,11 @@ static bool lmb_is_removable(struct drmem_lmb *lmb)
phys_addr = lmb->base_addr;
 
 #ifdef CONFIG_FA_DUMP
-   /* Don't hot-remove memory that falls in fadump boot memory area */
-   if (is_fadump_boot_memory_area(phys_addr, block_sz))
+   /*
+* Don't hot-remove memory that falls in fadump boot memory area
+* and memory that is reserved for capturing old kernel memory.
+*/
+   if (is_fadump_memory_area(phys_addr, block_sz))
return false;
 #endif
 
-- 
2.19.1
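
The two range checks in is_fadump_memory_area() can be exercised in isolation. The sketch below copies the comparisons from the patch but replaces the kernel's fw_dump state with a hypothetical `toy_fw` structure whose addresses are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields of fw_dump used by the check. */
struct toy_fw_dump {
    int dump_registered;
    uint64_t boot_memory_size;
    uint64_t reserve_start;
    uint64_t reserve_size;
};

/* Made-up layout: boot memory up to 0x1000, reserved area 0x8000..0x9000. */
static const struct toy_fw_dump toy_fw = {
    .dump_registered = 1,
    .boot_memory_size = 0x1000,
    .reserve_start = 0x8000,
    .reserve_size = 0x1000,
};

#define TOY_RMA_START 0x0ULL

/* Returns nonzero when [addr, addr + size) must not be hot-removed. */
static int toy_is_fadump_memory_area(uint64_t addr, uint64_t size)
{
    uint64_t d_start = toy_fw.reserve_start;
    uint64_t d_end = d_start + toy_fw.reserve_size;

    if (!toy_fw.dump_registered)
        return 0;

    /* Overlaps the area reserved for capturing old kernel memory. */
    if ((addr + size) > d_start && addr <= d_end)
        return 1;

    /* Overlaps the boot memory area. */
    return (addr + size) > TOY_RMA_START && addr <= toy_fw.boot_memory_size;
}
```

With this layout a block at 0x4000 is removable, while blocks touching either the boot memory area or the reserved dump area are not.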

[PATCH AUTOSEL 4.20 286/304] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index c1c676a33e4a..3f6df3f1f5d9 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -995,7 +995,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1
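
The ref_count handling before and after this patch can be compared with a small sketch. The two functions below are hypothetical reductions of floppy_release() that keep only the counter logic and return the new count; the motor-off side effects are left out.

```c
/* Counter logic before the patch: an exclusive open (-1) is never reset. */
static int toy_release_old(int ref_count)
{
    if (ref_count > 0 && --ref_count == 0) {
        /* the drive would be powered down here */
    }
    return ref_count;
}

/* Counter logic after the patch: -1 is reset to 0 on release. */
static int toy_release_new(int ref_count)
{
    if (ref_count > 0)
        --ref_count;
    else if (ref_count == -1)
        ref_count = 0;
    /* when ref_count is now 0 the drive would be powered down */
    return ref_count;
}
```

Starting from -1, the old logic leaves the counter stuck at -1, which matches the -EBUSY symptom in the subject line, while the new logic returns it to 0.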

[PATCH AUTOSEL 4.20 291/304] block/swim3: Fix regression on PowerBook G3

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 427c5ce4417cba0801fbf79c8525d1330704759c ]

As of v4.20, the swim3 driver crashes when loaded on a PowerBook G3
(Wallstreet).

MacIO PCI driver attached to Gatwick chipset
MacIO PCI driver attached to Heathrow chipset
swim3 0.00015000:floppy: [fd0] SWIM3 floppy controller in media bay
0.00013020:ch-a: ttyS0 at MMIO 0xf3013020 (irq = 16, base_baud = 230400) is a Z85c30 ESCC - Serial port
0.00013000:ch-b: ttyS1 at MMIO 0xf3013000 (irq = 17, base_baud = 230400) is a Z85c30 ESCC - Infrared port
macio: fixed media-bay irq on gatwick
macio: fixed left floppy irqs
swim3 1.00015000:floppy: [fd1] Couldn't request interrupt
Unable to handle kernel paging request for data at address 0x0024
Faulting instruction address: 0xc02652f8
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0 #2
NIP:  c02652f8 LR: c026915c CTR: c0276d1c
REGS: df43ba10 TRAP: 0300   Not tainted  (4.20.0)
MSR:  9032   CR: 28228288  XER: 0100
DAR: 0024 DSISR: 4000
GPR00: c026915c df43bac0 df439060 c0731524 df494700  c06e1c08 0001
GPR08: 0001  df5ff220 1032 28228282  c0004ca4 
GPR16:    c073144c dfffe064 c0731524 0120 c0586108
GPR24: c073132c c073143c c073143c  c0731524 df67cd70 df494700 0001
NIP [c02652f8] blk_mq_free_rqs+0x28/0xf8
LR [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
Call Trace:
[df43bac0] [c0045f50] flush_workqueue_prep_pwqs+0x178/0x1c4 (unreliable)
[df43bae0] [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
[df43bb00] [c02697f0] blk_mq_exit_sched+0x9c/0xb8
[df43bb20] [c0252794] elevator_exit+0x84/0xa4
[df43bb40] [c0256538] blk_exit_queue+0x30/0x50
[df43bb50] [c0256640] blk_cleanup_queue+0xe8/0x184
[df43bb70] [c034732c] swim3_attach+0x330/0x5f0
[df43bbb0] [c034fb24] macio_device_probe+0x58/0xec
[df43bbd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43bc00] [c032bd28] driver_probe_device+0x64/0x204
[df43bc20] [c0329ac4] bus_for_each_drv+0x60/0xac
[df43bc50] [c032b824] __device_attach+0xe8/0x160
[df43bc80] [c032ab38] bus_probe_device+0xa0/0xbc
[df43bca0] [c0327338] device_add+0x3d8/0x630
[df43bcf0] [c0350848] macio_add_one_device+0x444/0x48c
[df43bd50] [c03509f8] macio_pci_add_devices+0x168/0x1bc
[df43bd90] [c03500ec] macio_pci_probe+0xc0/0x10c
[df43bda0] [c02ad884] pci_device_probe+0xd4/0x184
[df43bdd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43be00] [c032bd28] driver_probe_device+0x64/0x204
[df43be20] [c032bfcc] __driver_attach+0x104/0x108
[df43be40] [c0329a00] bus_for_each_dev+0x64/0xb4
[df43be70] [c032add8] bus_add_driver+0x154/0x238
[df43be90] [c032ca24] driver_register+0x84/0x148
[df43bea0] [c0004aa0] do_one_initcall+0x40/0x188
[df43bf00] [c0690100] kernel_init_freeable+0x138/0x1d4
[df43bf30] [c0004cbc] kernel_init+0x18/0x10c
[df43bf40] [c00121e4] ret_from_kernel_thread+0x14/0x1c
Instruction dump:
5484d97e 4bfff4f4 9421ffe0 7c0802a6 bf410008 7c9e2378 90010024 8124005c
2f89 419e0078 81230004 7c7c1b78 <81290024> 2f89 419e0064 8144
---[ end trace 12025ab921a9784c ]---

Reverting commit 8ccb8cb1892b ("swim3: convert to blk-mq") resolves the
problem.

That commit added a struct blk_mq_tag_set to struct floppy_state and
initialized it with a blk_mq_init_sq_queue() call. Unfortunately, there
is a memset() in swim3_add_device() that subsequently clears the
floppy_state struct. That means fs->tag_set->ops is a NULL pointer, and
it gets dereferenced by blk_mq_free_rqs() which gets called in the
request_irq() error path. Move the memset() to fix this bug.

BTW, the request_irq() failure for the left mediabay floppy (fd1) is not
a regression. I don't know why it happens. The right media bay floppy
(fd0) works fine however.

Reported-and-tested-by: Stan Johnson 
Fixes: 8ccb8cb1892b ("swim3: convert to blk-mq")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 3f6df3f1f5d9..1046459f172b 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1091,8 +1091,6 @@ static int swim3_add_device(struct macio_dev *mdev, int index)
struct floppy_state *fs = &floppy_states[index];
int rc = -EBUSY;
 
-   /* Do this first for message macros */
-   memset(fs, 0, sizeof(*fs));
fs->mdev = mdev;
fs->index = index;
 
@@ -1192,14 +1190,15 @@ static int swim3_attach(struct macio_dev *mdev,
return rc;
}
 
-   fs = &floppy_states[floppy_count];
-
disk = alloc_disk(1);
if (disk == NULL) {
rc = -ENOMEM;
goto out_unregister;
}
 
+   fs = &floppy_states[floppy_count];
+   memset(fs, 0, sizeof(*fs));
+
disk->queue = blk_mq_init_sq_queue(&fs->tag_set, &swim3_mq_ops,

[PATCH AUTOSEL 4.19 023/258] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index a0b20c03f078..e3010b14aea5 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -272,6 +272,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1

[PATCH AUTOSEL 4.19 065/258] powerpc/32: Add .data..Lubsan_data/.data..Lubsan_type sections explicitly

2019-01-28 Thread Sasha Levin

From: Mathieu Malaterre 

[ Upstream commit beba24ac59133cb36ecd03f9af9ccb11971ee20e ]

When both `CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y` and `CONFIG_UBSAN=y`
are set, the link step typically produces numerous warnings about orphan
sections:

  + powerpc-linux-gnu-ld -EB -m elf32ppc -Bstatic --orphan-handling=warn
    --build-id --gc-sections -X -o .tmp_vmlinux1 -T
    ./arch/powerpc/kernel/vmlinux.lds --whole-archive built-in.a
    --no-whole-archive --start-group lib/lib.a --end-group
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data393' from
    `init/main.o' being placed in section `.data..Lubsan_data393'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data394' from
    `init/main.o' being placed in section `.data..Lubsan_data394'.
  ...
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type11' from
    `init/main.o' being placed in section `.data..Lubsan_type11'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type12' from
    `init/main.o' being placed in section `.data..Lubsan_type12'.
  ...

This commit removes those warnings produced at W=1.

Link: https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg135407.html
Suggested-by: Nicholas Piggin 
Signed-off-by: Mathieu Malaterre 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 07ae018e550e..53016c753f3c 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -296,6 +296,10 @@ SECTIONS
 #ifdef CONFIG_PPC32
.data : AT(ADDR(.data) - LOAD_OFFSET) {
DATA_DATA
+#ifdef CONFIG_UBSAN
+   *(.data..Lubsan_data*)
+   *(.data..Lubsan_type*)
+#endif
*(.data.rel*)
*(SDATA_MAIN)
*(.sdata2)
-- 
2.19.1

[PATCH AUTOSEL 4.19 150/258] KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines

2019-01-28 Thread Sasha Levin

From: Suraj Jitindar Singh 

[ Upstream commit 693ac10a88a2219bde553b2e8460dbec97e594e6 ]

The kvm capability KVM_CAP_SPAPR_TCE_VFIO is used to indicate the
availability of in kernel tce acceleration for vfio. However it is
currently the case that this is only available on a powernv machine,
not for a pseries machine.

Thus make this capability dependent on having the cpu feature
CPU_FTR_HVMODE.

[pau...@ozlabs.org - fixed compilation for Book E.]

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/powerpc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index eba5756d5b41..79b79408d92e 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -543,8 +543,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #ifdef CONFIG_PPC_BOOK3S_64
case KVM_CAP_SPAPR_TCE:
case KVM_CAP_SPAPR_TCE_64:
-   /* fallthrough */
+   r = 1;
+   break;
case KVM_CAP_SPAPR_TCE_VFIO:
+   r = !!cpu_has_feature(CPU_FTR_HVMODE);
+   break;
case KVM_CAP_PPC_RTAS:
case KVM_CAP_PPC_FIXUP_HCALL:
case KVM_CAP_PPC_ENABLE_HCALL:
-- 
2.19.1

Re: [RFC PATCH 1/2] powerpc/powernv: Add support for CXL mode switch that need PHB reset

2019-01-28 Thread christophe lombard


On 25/01/2019 06:11, Vaibhav Jain wrote:

Recent updates to OPAL [1] have provided support for new CXL modes on
PHB that need to force a cold reset on the bridge (CRESET). However
PHB CRESET is a multi step process and cannot be completed
synchronously as expected by current kernel implementation that issues
opal call opal_pci_set_phb_cxl_mode().

Hence this patch updates pnv_phb_to_cxl_mode() to implement a polling
loop that handles specific error codes (OPAL_BUSY) returned from
opal_pci_set_phb_cxl_mode() and drive the OPAL pci-state machine, if the
requested CXL mode needs a CRESET.

The patch also updates pnv_phb_to_cxl_mode() to convert and return
OPAL error codes into kernel error codes. This removes a previous
issue where callers to this function would have to include
'opal-api.h' to check for specific OPAL error codes.

References:
[1]: https://lists.ozlabs.org/pipermail/skiboot/2019-January/013063.html

Signed-off-by: Vaibhav Jain 
---
  arch/powerpc/platforms/powernv/pci-cxl.c | 71 +---
  1 file changed, 63 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-cxl.c b/arch/powerpc/platforms/powernv/pci-cxl.c
index 1b18111453d7..d33d662c6212 100644
--- a/arch/powerpc/platforms/powernv/pci-cxl.c
+++ b/arch/powerpc/platforms/powernv/pci-cxl.c
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 

  #include "pci.h"

@@ -18,21 +19,75 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
struct pci_controller *hose = pci_bus_to_host(dev->bus);
struct pnv_phb *phb = hose->private_data;
struct pnv_ioda_pe *pe;
+   unsigned long starttime, endtime;
int rc;

pe = pnv_ioda_get_pe(dev);
if (!pe)
-   return -ENODEV;
+   return -ENOENT;


The return code of pnv_phb_to_cxl_mode() is also returned by an API in
the cxllib library. So let's hope that nobody tests the value!



+
+   pe_info(pe, "Switching PHB to CXL mode=%d\n", mode);
+
+   /*
+* Use a 15 second timeout for mode switch. Value arrived after
+* limited testing and may need more tweaking.
+*/
+   starttime = jiffies;
+   endtime = starttime + HZ * 15;
+
+   do {
+   rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode,
+  pe->pe_number);
+
+   /* Wait until mode transistion done */
+   if (rc != OPAL_BUSY && rc != OPAL_BUSY_EVENT)
+   break;
+
+   /* Check if we timedout */
+   if (time_after(jiffies, endtime)) {
+   rc = OPAL_TIMEOUT;
+   break;
+   }

-   pe_info(pe, "Switching PHB to CXL\n");
+   /* Opal Busy with mode switch. Run pci state-machine */
+   rc = opal_pci_poll(phb->opal_id);
+   if (rc >= 0) {
+   /* wait for some time */
+   if (rc > 0)
+   msleep(rc);
+   opal_poll_events(NULL);
+   rc = OPAL_BUSY;
+   /* Continue with the mode switch */
+   }
+   } while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT);
+
+   pe_level_printk(pe, KERN_DEBUG, "CXL mode switch finished in %u-msecs.",
+   jiffies_to_msecs(jiffies - starttime));

-   rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
-   if (rc == OPAL_UNSUPPORTED)
-   dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
-   else if (rc)
-   dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
+   /* Check OPAL errors and convert them to kernel error codes */
+   switch (rc) {
+   case OPAL_SUCCESS:
+   return 0;

-   return rc;
+   case OPAL_PARAMETER:
+   dev_err(&dev->dev, "CXL not supported on this PHB\n");
+   return -ENOENT;
+
+   case OPAL_UNSUPPORTED:
+   dev_err(&dev->dev,
+   "Required cxl mode not supported by firmware"
+   " - update skiboot\n");
+   return -ENODEV;
+
+   case OPAL_TIMEOUT:
+   dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode Timedout\n");
+   return -ETIME;
+
+   default:
+   dev_err(&dev->dev,
+   "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
+   return -EIO;
+   };
  }
  EXPORT_SYMBOL(pnv_phb_to_cxl_mode);
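
The retry-then-translate shape of the patch can be tried outside the kernel. The following is a minimal userspace sketch, assuming a bounded poll count in place of the jiffies deadline; the names fw_rc, fw_call_with_retry(), fw_rc_to_errno() and busy_n_times() are hypothetical, and the enum values are placeholders rather than real OPAL codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical firmware return codes; placeholder values, not the OPAL ABI. */
enum fw_rc {
    FW_SUCCESS = 0,
    FW_PARAMETER = -1,
    FW_BUSY = -2,
    FW_UNSUPPORTED = -3,
    FW_TIMEOUT = -4,
};

/* The firmware call is modelled as a callback so the loop can be exercised. */
typedef int (*fw_call_t)(void *ctx);

/* Retry 'call' while it reports FW_BUSY; give up after max_polls busy replies. */
static int fw_call_with_retry(fw_call_t call, void *ctx, unsigned int max_polls)
{
    unsigned int polls = 0;
    int rc;

    assert(call != NULL);
    for (;;) {
        rc = call(ctx);
        if (rc != FW_BUSY)
            break;              /* finished, successfully or not */
        if (++polls >= max_polls) {
            rc = FW_TIMEOUT;    /* the patch uses a jiffies deadline instead */
            break;
        }
        /* Kernel code would sleep and drive the PCI state machine here. */
    }
    return rc;
}

/* Convert firmware codes to errno values, following the switch in the patch. */
static int fw_rc_to_errno(int rc)
{
    switch (rc) {
    case FW_SUCCESS:
        return 0;
    case FW_PARAMETER:
        return -ENOENT;
    case FW_UNSUPPORTED:
        return -ENODEV;
    case FW_TIMEOUT:
        return -ETIME;
    default:
        return -EIO;
    }
}

/* Example callback: reports busy until the counter behind ctx reaches zero. */
static int busy_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return FW_BUSY;
    }
    return FW_SUCCESS;
}
```

Keeping the translation in one helper means callers only ever see errno values, which is the point made in the commit message about not needing 'opal-api.h'.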

[PATCH AUTOSEL 4.19 186/258] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

    if (_IOC_DIR(ioc) != _IOC_NONE) {
        int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

        if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
the macro does nothing with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index bac225bb7f64..23bea99bf8d5 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -63,7 +63,7 @@ static inline int __access_ok(unsigned long addr, unsigned long size,
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1
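
The "(void)(type)" idiom from this patch can be seen in isolation in the sketch below. It assumes a made-up range check; range_ok(), USER_LIMIT, my_access_ok(), struct request and check_request() are hypothetical names and do not correspond to the kernel's helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative upper bound for "user" addresses in this sketch. */
#define USER_LIMIT ((size_t)0x1000)

/* Hypothetical range check standing in for the real address test. */
static int range_ok(size_t addr, size_t size, size_t limit)
{
    return size <= limit && addr <= limit - size;
}

/*
 * 'type' is evaluated and discarded by the comma expression, so a variable
 * that a caller passes only as 'type' still counts as used.
 */
#define my_access_ok(type, addr, size) \
    ((void)(type), range_ok((size_t)(addr), (size_t)(size), USER_LIMIT))

struct request {
    int is_read;
    size_t addr;
    size_t size;
};

static int check_request(const struct request *req)
{
    int verify;

    assert(req != NULL);
    verify = req->is_read ? 1 : 0;  /* only ever handed to the macro */
    return my_access_ok(verify, req->addr, req->size);
}
```

Without the "(void)(type)" term the macro would drop 'verify' entirely, which is what produced the unused-variable warning quoted in the commit message.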

[PATCH AUTOSEL 4.19 195/258] powerpc/perf: Fix thresholding counter data for unknown type

2019-01-28 Thread Sasha Levin

From: Madhavan Srinivasan 

[ Upstream commit 17cfccc91545682513541924245abb876d296063 ]

MMCRA[34:36] and MMCRA[38:44] expose the thresholding counter value.
The thresholding counter can be used to count latency cycles such as
load miss to reload. But the threshold counter value is not relevant
when the sampled instruction type is unknown or reserved. This patch
sets the thresholding counter value to zero when the sampled
instruction type is unknown or reserved.

Fixes: 170a315f41c6('powerpc/perf: Support to export MMCRA[TEC*] field to userspace')
Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/isa207-common.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 177de814286f..6a2f65d3d088 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -226,8 +226,13 @@ void isa207_get_mem_weight(u64 *weight)
u64 mmcra = mfspr(SPRN_MMCRA);
u64 exp = MMCRA_THR_CTR_EXP(mmcra);
u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
+   u64 sier = mfspr(SPRN_SIER);
+   u64 val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
 
-   *weight = mantissa << (2 * exp);
+   if (val == 0 || val == 7)
+   *weight = 0;
+   else
+   *weight = mantissa << (2 * exp);
 }
 
 int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
-- 
2.19.1
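
For readers who want to check the arithmetic, here is a small userspace sketch of the weight computation with the new guard. It assumes the mantissa, exponent and sampled type have already been extracted from the registers; get_mem_weight() and sample_type_is_valid() are hypothetical names, and only the values 0 and 7 come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The patch treats sampled type values 0 and 7 as unknown or reserved. */
static int sample_type_is_valid(unsigned int type)
{
    return type != 0 && type != 7;
}

/*
 * weight = mantissa << (2 * exp), or 0 when the sampled type carries no
 * meaningful threshold count.
 */
static void get_mem_weight(uint64_t mantissa, uint64_t exp, unsigned int type,
                           uint64_t *weight)
{
    assert(weight != NULL);
    if (!sample_type_is_valid(type))
        *weight = 0;
    else
        *weight = mantissa << (2 * exp);
}
```

With mantissa 3 and exponent 2 the shift is by 4 bits, so a valid sample type yields 48 while types 0 and 7 yield 0.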

[PATCH AUTOSEL 4.19 197/258] powerpc/powernv/ioda: Allocate indirect TCE levels of cached userspace addresses on demand

2019-01-28 Thread Sasha Levin

From: Alexey Kardashevskiy 

[ Upstream commit bdbf649efe21173cae63b4b71db84176420f9039 ]

The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
table and a table with userspace addresses; the latter is used for
marking pages dirty when corresponding TCEs are unmapped from
the hardware table.

a68bd1267b72 ("powerpc/powernv/ioda: Allocate indirect TCE levels
on demand") enabled on-demand allocation of the hardware table,
however it missed the other table so it has still been fully allocated
at the boot time. This fixes the issue by allocating a single level,
just like we do for the hardware table.

Fixes: a68bd1267b72 ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand")
Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: David Gibson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/pci-ioda-tce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
index fe9691040f54..7639b2168755 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
@@ -299,7 +299,7 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
if (alloc_userspace_copy) {
offset = 0;
uas = pnv_pci_ioda2_table_do_alloc_pages(nid, level_shift,
-   levels, tce_table_size, &offset,
+   tmplevels, tce_table_size, &offset,
&total_allocated_uas);
if (!uas)
goto free_tces_exit;
-- 
2.19.1

[PATCH AUTOSEL 4.19 212/258] powerpc/mm: Fix reporting of kernel execute faults on the 8xx

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit ffca395b11c4a5a6df6d6345f794b0e3d578e2d0 ]

On the 8xx, no-execute is set via PPP bits in the PTE. Therefore
a no-exec fault generates DSISR_PROTFAULT error bits,
not DSISR_NOEXEC_OR_G.

This patch adds DSISR_PROTFAULT in the test mask.

Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index d51cf5f4e45e..365526ee29b8 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -221,7 +221,9 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr,
 static bool bad_kernel_fault(bool is_exec, unsigned long error_code,
 unsigned long address)
 {
-   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT))) {
+   /* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
+   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
+ DSISR_PROTFAULT))) {
printk_ratelimited(KERN_CRIT "kernel tried to execute"
   " exec-protected page (%lx) -"
   "exploit attempt? (uid: %d)\n",
-- 
2.19.1
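
The effect of widening the test mask can be illustrated with a short sketch. The bit values below are placeholders chosen for the example and are not the real DSISR encodings; ERR_*, OLD_MASK, NEW_MASK and exec_violation() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder error bits for this sketch; not the real DSISR layout. */
#define ERR_NOEXEC    (1ul << 0)
#define ERR_KEYFAULT  (1ul << 1)
#define ERR_PROTFAULT (1ul << 2)

/* Mask before and after the patch. */
#define OLD_MASK (ERR_NOEXEC | ERR_KEYFAULT)
#define NEW_MASK (ERR_NOEXEC | ERR_KEYFAULT | ERR_PROTFAULT)

/* True when an instruction fetch faulted with any bit covered by 'mask'. */
static bool exec_violation(bool is_exec, unsigned long error_code,
                           unsigned long mask)
{
    assert(mask != 0);
    return is_exec && (error_code & mask) != 0;
}
```

A fetch fault that reports only the protection bit, as described for the 8xx, is missed with OLD_MASK and caught with NEW_MASK.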

[PATCH AUTOSEL 4.19 216/258] powerpc/fadump: Do not allow hot-remove memory from fadump reserved area.

2019-01-28 Thread Sasha Levin

From: Mahesh Salgaonkar 

[ Upstream commit 0db6896ff6332ba694f1e61b93ae3b2640317633 ]

For fadump to work successfully there should not be any holes in reserved
memory ranges where kernel has asked firmware to move the content of old
kernel memory in event of crash. Now that fadump uses CMA for reserved
area, this memory area is now not protected from hot-remove operations
unless it is cma allocated. Hence, fadump service can fail to re-register
after the hot-remove operation, if hot-removed memory belongs to fadump
reserved region. To avoid this make sure that memory from fadump reserved
area is not hot-removable if fadump is registered.

However, if user still wants to remove that memory, he can do so by
manually stopping fadump service before hot-remove operation.

Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/fadump.h   |  2 +-
 arch/powerpc/kernel/fadump.c| 10 --
 arch/powerpc/platforms/pseries/hotplug-memory.c |  7 +--
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index 1e7a33592e29..15bc07a31c46 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -200,7 +200,7 @@ struct fad_crash_memory_ranges {
unsigned long long  size;
 };
 
-extern int is_fadump_boot_memory_area(u64 addr, ulong size);
+extern int is_fadump_memory_area(u64 addr, ulong size);
 extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
 extern int fadump_reserve_mem(void);
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index a711d22339ea..c02c95287a5f 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -118,13 +118,19 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 
 /*
  * If fadump is registered, check if the memory provided
- * falls within boot memory area.
+ * falls within boot memory area and reserved memory area.
  */
-int is_fadump_boot_memory_area(u64 addr, ulong size)
+int is_fadump_memory_area(u64 addr, ulong size)
 {
+   u64 d_start = fw_dump.reserve_dump_area_start;
+   u64 d_end = d_start + fw_dump.reserve_dump_area_size;
+
if (!fw_dump.dump_registered)
return 0;
 
+   if (((addr + size) > d_start) && (addr <= d_end))
+   return 1;
+
return (addr + size) > RMA_START && addr <= fw_dump.boot_memory_size;
 }
 
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index c1578f54c626..e4c658cda3a7 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -389,8 +389,11 @@ static bool lmb_is_removable(struct drmem_lmb *lmb)
phys_addr = lmb->base_addr;
 
 #ifdef CONFIG_FA_DUMP
-   /* Don't hot-remove memory that falls in fadump boot memory area */
-   if (is_fadump_boot_memory_area(phys_addr, block_sz))
+   /*
+* Don't hot-remove memory that falls in fadump boot memory area
+* and memory that is reserved for capturing old kernel memory.
+*/
+   if (is_fadump_memory_area(phys_addr, block_sz))
return false;
 #endif
 
-- 
2.19.1
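
The two overlap tests in is_fadump_memory_area() can be exercised in userspace with the sketch below. It assumes the boot memory area starts at address 0 (taken here as the value of RMA_START, which is an assumption of this sketch); struct dump_state and is_dump_memory_area() are hypothetical stand-ins for fw_dump and the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed start of the boot memory area for this sketch. */
#define BOOT_AREA_START 0ull

/* Hypothetical subset of the dump bookkeeping used by the check. */
struct dump_state {
    int registered;
    uint64_t boot_memory_size;
    uint64_t reserve_start;
    uint64_t reserve_size;
};

/* Non-zero when [addr, addr + size) touches the reserved or boot area. */
static int is_dump_memory_area(const struct dump_state *s, uint64_t addr,
                               uint64_t size)
{
    uint64_t d_start, d_end;

    assert(s != NULL);
    if (!s->registered)
        return 0;

    d_start = s->reserve_start;
    d_end = d_start + s->reserve_size;

    /* Same comparison as the patch, including the inclusive upper bound. */
    if ((addr + size) > d_start && addr <= d_end)
        return 1;

    return (addr + size) > BOOT_AREA_START && addr <= s->boot_memory_size;
}
```

Note that, as in the patch, a block starting exactly at the end address of the reserved area is still reported as belonging to it.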

[PATCH AUTOSEL 4.19 243/258] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 469541c1e51e..20907a0a043b 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1026,7 +1026,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1
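
The release logic can be checked with a small model of the reference count. This is a sketch only: struct drive and drive_release() are hypothetical, and the motor_on flag stands in for the hardware shutdown sequence in floppy_release().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: ref_count is -1 while the device is opened exclusively. */
struct drive {
    int ref_count;
    int motor_on;
};

/* Mirrors the patched branch structure of the release method. */
static void drive_release(struct drive *d)
{
    assert(d != NULL);
    if (d->ref_count > 0)
        --d->ref_count;
    else if (d->ref_count == -1)
        d->ref_count = 0;       /* exclusive open is being closed */

    if (d->ref_count == 0)
        d->motor_on = 0;        /* last user gone: power the drive down */
}
```

With the old condition "ref_count > 0 && --ref_count == 0", a count of -1 was never reset, so a later open would still see the device as busy.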

[PATCH AUTOSEL 4.14 016/170] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index e9149d05d30b..f4e6565dd7a9 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -284,6 +284,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1
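
The imbalance described above is the usual get/put pairing problem. The sketch below is a generic illustration of that pairing, not a model of the device-tree code's actual reference flow; struct node, node_get(), node_put(), node_attach() and node_detach() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted node. */
struct node {
    int refcount;
    int attached;
};

static void node_get(struct node *n)
{
    assert(n != NULL);
    n->refcount++;
}

static void node_put(struct node *n)
{
    assert(n != NULL && n->refcount > 0);
    n->refcount--;
}

/* Attaching takes a reference on the node. */
static void node_attach(struct node *n)
{
    node_get(n);
    n->attached = 1;
}

/* Detaching drops the reference taken at attach time. */
static int node_detach(struct node *n)
{
    assert(n != NULL);
    if (!n->attached)
        return -1;
    n->attached = 0;
    node_put(n);
    return 0;
}
```

If the attach side gains a get without a matching put on the detach side, the count never returns to its starting value and the node is never freed.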

[PATCH AUTOSEL 4.14 098/170] KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines

2019-01-28 Thread Sasha Levin

From: Suraj Jitindar Singh 

[ Upstream commit 693ac10a88a2219bde553b2e8460dbec97e594e6 ]

The kvm capability KVM_CAP_SPAPR_TCE_VFIO is used to indicate the
availability of in kernel tce acceleration for vfio. However it is
currently the case that this is only available on a powernv machine,
not for a pseries machine.

Thus make this capability dependent on having the cpu feature
CPU_FTR_HVMODE.

[pau...@ozlabs.org - fixed compilation for Book E.]

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/powerpc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index ecb45361095b..a35995a6b34a 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -540,8 +540,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #ifdef CONFIG_PPC_BOOK3S_64
case KVM_CAP_SPAPR_TCE:
case KVM_CAP_SPAPR_TCE_64:
-   /* fallthrough */
+   r = 1;
+   break;
case KVM_CAP_SPAPR_TCE_VFIO:
+   r = !!cpu_has_feature(CPU_FTR_HVMODE);
+   break;
case KVM_CAP_PPC_RTAS:
case KVM_CAP_PPC_FIXUP_HCALL:
case KVM_CAP_PPC_ENABLE_HCALL:
-- 
2.19.1
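
The change splits what used to be a run of fall-through cases so that one capability depends on a host feature. A minimal sketch of that switch shape follows; enum cap, struct host_features and check_extension() are hypothetical, and the default of 0 is an assumption of this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability identifiers. */
enum cap { CAP_TCE, CAP_TCE_64, CAP_TCE_VFIO, CAP_UNKNOWN };

/* Hypothetical host description; hv_mode plays the role of CPU_FTR_HVMODE. */
struct host_features {
    bool hv_mode;
};

static int check_extension(const struct host_features *host, enum cap ext)
{
    int r;

    assert(host != NULL);
    switch (ext) {
    case CAP_TCE:
    case CAP_TCE_64:
        r = 1;                  /* always available */
        break;
    case CAP_TCE_VFIO:
        r = !!host->hv_mode;    /* only when running in hypervisor mode */
        break;
    default:
        r = 0;
        break;
    }
    return r;
}
```

The explicit "r = 1; break;" for the first two cases is what keeps them from inheriting the new conditional result.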

[PATCH AUTOSEL 4.14 113/170] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

    if (_IOC_DIR(ioc) != _IOC_NONE) {
        int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

        if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
the macro does nothing with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 565cead12be2..cf26e62b268d 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -54,7 +54,7 @@
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1

[PATCH AUTOSEL 4.14 122/170] powerpc/perf: Fix thresholding counter data for unknown type

2019-01-28 Thread Sasha Levin

From: Madhavan Srinivasan 

[ Upstream commit 17cfccc91545682513541924245abb876d296063 ]

MMCRA[34:36] and MMCRA[38:44] expose the thresholding counter value.
The thresholding counter can be used to count latency cycles such as
load miss to reload. But the threshold counter value is not relevant
when the sampled instruction type is unknown or reserved. This patch
sets the thresholding counter value to zero when the sampled
instruction type is unknown or reserved.

Fixes: 170a315f41c6('powerpc/perf: Support to export MMCRA[TEC*] field to userspace')
Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/isa207-common.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 2efee3f196f5..cf9c35aa0cf4 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -228,8 +228,13 @@ void isa207_get_mem_weight(u64 *weight)
u64 mmcra = mfspr(SPRN_MMCRA);
u64 exp = MMCRA_THR_CTR_EXP(mmcra);
u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
+   u64 sier = mfspr(SPRN_SIER);
+   u64 val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
 
-   *weight = mantissa << (2 * exp);
+   if (val == 0 || val == 7)
+   *weight = 0;
+   else
+   *weight = mantissa << (2 * exp);
 }
 
 int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
-- 
2.19.1

[PATCH AUTOSEL 4.14 134/170] powerpc/mm: Fix reporting of kernel execute faults on the 8xx

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit ffca395b11c4a5a6df6d6345f794b0e3d578e2d0 ]

On the 8xx, no-execute is set via PPP bits in the PTE. Therefore
a no-exec fault generates DSISR_PROTFAULT error bits,
not DSISR_NOEXEC_OR_G.

This patch adds DSISR_PROTFAULT in the test mask.

Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 6e1e39035380..52863deed65d 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -215,7 +215,9 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
 static bool bad_kernel_fault(bool is_exec, unsigned long error_code,
 unsigned long address)
 {
-   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT))) {
+   /* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
+   if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
+ DSISR_PROTFAULT))) {
printk_ratelimited(KERN_CRIT "kernel tried to execute"
   " exec-protected page (%lx) -"
   "exploit attempt? (uid: %d)\n",
-- 
2.19.1

[PATCH AUTOSEL 4.14 138/170] powerpc/fadump: Do not allow hot-remove memory from fadump reserved area.

2019-01-28 Thread Sasha Levin

From: Mahesh Salgaonkar 

[ Upstream commit 0db6896ff6332ba694f1e61b93ae3b2640317633 ]

For fadump to work successfully there should not be any holes in reserved
memory ranges where kernel has asked firmware to move the content of old
kernel memory in event of crash. Now that fadump uses CMA for reserved
area, this memory area is now not protected from hot-remove operations
unless it is cma allocated. Hence, fadump service can fail to re-register
after the hot-remove operation, if hot-removed memory belongs to fadump
reserved region. To avoid this make sure that memory from fadump reserved
area is not hot-removable if fadump is registered.

However, if user still wants to remove that memory, he can do so by
manually stopping fadump service before hot-remove operation.

Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/fadump.h   |  2 +-
 arch/powerpc/kernel/fadump.c| 10 --
 arch/powerpc/platforms/pseries/hotplug-memory.c |  7 +--
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index 1e7a33592e29..15bc07a31c46 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -200,7 +200,7 @@ struct fad_crash_memory_ranges {
unsigned long long  size;
 };
 
-extern int is_fadump_boot_memory_area(u64 addr, ulong size);
+extern int is_fadump_memory_area(u64 addr, ulong size);
 extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
 extern int fadump_reserve_mem(void);
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 5a6470383ca3..62d7ef6508de 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -117,13 +117,19 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 
 /*
  * If fadump is registered, check if the memory provided
- * falls within boot memory area.
+ * falls within boot memory area and reserved memory area.
  */
-int is_fadump_boot_memory_area(u64 addr, ulong size)
+int is_fadump_memory_area(u64 addr, ulong size)
 {
+   u64 d_start = fw_dump.reserve_dump_area_start;
+   u64 d_end = d_start + fw_dump.reserve_dump_area_size;
+
if (!fw_dump.dump_registered)
return 0;
 
+   if (((addr + size) > d_start) && (addr <= d_end))
+   return 1;
+
return (addr + size) > RMA_START && addr <= fw_dump.boot_memory_size;
 }
 
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 1d48ab424bd9..93e09f108ca1 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -441,8 +441,11 @@ static bool lmb_is_removable(struct of_drconf_cell *lmb)
phys_addr = lmb->base_addr;
 
 #ifdef CONFIG_FA_DUMP
-   /* Don't hot-remove memory that falls in fadump boot memory area */
-   if (is_fadump_boot_memory_area(phys_addr, block_sz))
+   /*
+* Don't hot-remove memory that falls in fadump boot memory area
+* and memory that is reserved for capturing old kernel memory.
+*/
+   if (is_fadump_memory_area(phys_addr, block_sz))
return false;
 #endif
 
-- 
2.19.1

[PATCH AUTOSEL 4.14 159/170] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 0d7527c6825a..2f7acdb830c3 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1027,7 +1027,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1

[PATCH AUTOSEL 4.9 011/107] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 72ae2cdbcd6a..999b04819d69 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -288,6 +288,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1

[PATCH AUTOSEL 4.9 064/107] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

    if (_IOC_DIR(ioc) != _IOC_NONE) {
        int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

        if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
the macro does nothing with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index c266227fdd5b..31913b3ac7ab 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -59,7 +59,7 @@
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1

Re: use generic DMA mapping code in powerpc V4

2019-01-28 Thread Christoph Hellwig

On Mon, Jan 28, 2019 at 08:04:22AM +0100, Christoph Hellwig wrote:
> On Sun, Jan 27, 2019 at 02:13:09PM +0100, Christian Zigotzky wrote:
> > Christoph,
> >
> > What shall I do next?
> 
> I'll need to figure out what went wrong with the new zone selection
> on powerpc and give you another branch to test.

Can you try the new powerpc-dma.6-debug.2 branch:

git://git.infradead.org/users/hch/misc.git powerpc-dma.6-debug.2

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.6-debug.2

[PATCH AUTOSEL 4.9 100/107] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index c264f2d284a7..2e0a9e2531cb 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1027,7 +1027,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1

[PATCH AUTOSEL 4.4 10/80] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 96536c969c9c..a8efed3b4691 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -280,6 +280,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1

[PATCH AUTOSEL 4.4 48/80] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

    if (_IOC_DIR(ioc) != _IOC_NONE) {
        int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

        if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
the macro does nothing with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index a5ffe0207c16..05f1389228d2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -59,7 +59,7 @@
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1

[PATCH AUTOSEL 4.4 76/80] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index c264f2d284a7..2e0a9e2531cb 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1027,7 +1027,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1
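
The ref_count handling changed by the hunk above can be modelled in plain C. This is only a sketch: struct floppy_model, release_old() and release_fixed() are hypothetical names used for illustration and are not driver code. Only the branch structure follows the patch.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the driver state; not the real struct. */
struct floppy_model {
    int ref_count;   /* -1 models an FMODE_EXCL open, per the commit message */
    bool motor_on;
};

/* Pre-patch logic: an exclusive open (-1) is never reset on release. */
static void release_old(struct floppy_model *fs)
{
    if (fs->ref_count > 0 && --fs->ref_count == 0)
        fs->motor_on = false;
}

/* Post-patch logic: -1 is reset to 0, so a later open can succeed. */
static void release_fixed(struct floppy_model *fs)
{
    if (fs->ref_count > 0)
        --fs->ref_count;
    else if (fs->ref_count == -1)
        fs->ref_count = 0;
    if (fs->ref_count == 0)
        fs->motor_on = false;
}
```

With the old logic a model that starts at -1 stays at -1 after release, which corresponds to the -EBUSY symptom described in the commit message.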

[PATCH AUTOSEL 3.18 07/61] powerpc/pseries: add of_node_put() in dlpar_detach_node()

2019-01-28 Thread Sasha Levin

From: Frank Rowand 

[ Upstream commit 5b3f5c408d8cc59b87e47f1ab9803dbd006e4a91 ]

The previous commit, "of: overlay: add missing of_node_get() in
__of_attach_node_sysfs" added a missing of_node_get() to
__of_attach_node_sysfs().  This results in a refcount imbalance
for nodes attached with dlpar_attach_node().  The calling sequence
from dlpar_attach_node() to __of_attach_node_sysfs() is:

   dlpar_attach_node()
  of_attach_node()
 __of_attach_node_sysfs()

For more detailed description of the node refcount, see
commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow
during DLPAR remove").

Tested-by: Alan Tull 
Acked-by: Michael Ellerman 
Signed-off-by: Frank Rowand 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 80d175dca68e..85bd29475987 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -299,6 +299,8 @@ int dlpar_detach_node(struct device_node *dn)
if (rc)
return rc;
 
+   of_node_put(dn);
+
return 0;
 }
 
-- 
2.19.1
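
The refcount balance this patch restores can be sketched with a toy counter. struct node_model and the model_* helpers are hypothetical stand-ins, not the kernel's device_node API; the sketch only assumes, as the commit message states, that the attach path now takes one extra reference that the detach path must drop.

```c
/* Hypothetical stand-in for struct device_node; only the refcount matters. */
struct node_model {
    int refcount;
};

static void node_get(struct node_model *n) { n->refcount++; }
static void node_put(struct node_model *n) { n->refcount--; }

/* Attach path: models the of_node_get() added in __of_attach_node_sysfs(). */
static void model_attach(struct node_model *n)
{
    node_get(n);
}

/* Detach path with the put added by this patch. */
static int model_detach(struct node_model *n)
{
    /* ... detach work that may fail and return early would go here ... */
    node_put(n);
    return 0;
}
```

An attach followed by a detach leaves the count where it started; without the put in the detach path the count would grow by one per attach/detach cycle.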

[PATCH AUTOSEL 3.18 36/61] powerpc/uaccess: fix warning/error with access_ok()

2019-01-28 Thread Sasha Levin

From: Christophe Leroy 

[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ]

With the following piece of code, the following compilation warning
is encountered:

if (_IOC_DIR(ioc) != _IOC_NONE) {
int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) {

drivers/platform/test/dev.c: In function 'my_ioctl':
drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable]
   int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ;

This patch fixes it by referencing 'type' in the macro, although
doing nothing with it.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 46c486599645..1bb8cc3fbbf1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -59,7 +59,7 @@
 #endif
 
 #define access_ok(type, addr, size)\
-   (__chk_user_ptr(addr),  \
+   (__chk_user_ptr(addr), (void)(type),\
 __access_ok((__force unsigned long)(addr), (size), get_fs()))
 
 /*
-- 
2.19.1
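
The (void)(type) idiom used in the macro above can be tried outside the kernel. The following is a minimal sketch: range_ok(), CHECK_OK() and demo_ioctl_check() are hypothetical names, and range_ok() is only a placeholder for the arch-specific __access_ok() check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __access_ok(); the real check is arch-specific. */
static bool range_ok(const void *addr, size_t size)
{
    return addr != NULL && size > 0;
}

/*
 * 'type' is evaluated and discarded through the comma operator, so a
 * variable passed in this position counts as used by the compiler.
 */
#define CHECK_OK(type, addr, size) \
    ((void)(type), range_ok((addr), (size)))

static bool demo_ioctl_check(bool is_read, const void *arg, size_t len)
{
    int verify = is_read ? 1 : 0;   /* referenced only through the macro */

    return CHECK_OK(verify, arg, len);
}
```

If the macro dropped the (void)(type) term, 'verify' would be set but never referenced, which is the situation that triggers -Wunused-variable in the report above.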

[PATCH AUTOSEL 3.18 58/61] block/swim3: Fix -EBUSY error when re-opening device after unmount

2019-01-28 Thread Sasha Levin

From: Finn Thain 

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/swim3.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 523ee8fd4c15..eaf1336623aa 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1027,7 +1027,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
struct swim3 __iomem *sw = fs->swim3;
 
mutex_lock(&swim3_mutex);
-   if (fs->ref_count > 0 && --fs->ref_count == 0) {
+   if (fs->ref_count > 0)
+   --fs->ref_count;
+   else if (fs->ref_count == -1)
+   fs->ref_count = 0;
+   if (fs->ref_count == 0) {
swim3_action(fs, MOTOR_OFF);
out_8(&sw->control_bic, 0xff);
swim3_select(fs, RELAX);
-- 
2.19.1

Re: use generic DMA mapping code in powerpc V4

2019-01-28 Thread Christian Zigotzky

Thanks a lot! I will test it tomorrow.

— Christian

Sent from my iPhone

> On 28. Jan 2019, at 17:22, Christoph Hellwig  wrote:
> 
>> On Mon, Jan 28, 2019 at 08:04:22AM +0100, Christoph Hellwig wrote:
>>> On Sun, Jan 27, 2019 at 02:13:09PM +0100, Christian Zigotzky wrote:
>>> Christoph,
>>> 
>>> What shall I do next?
>> 
>> I'll need to figure out what went wrong with the new zone selection
>> on powerpc and give you another branch to test.
> 
> Can you try the new powerpc-dma.6-debug.2 branch:
> 
>git://git.infradead.org/users/hch/misc.git powerpc-dma.6-debug.2
> 
> Gitweb:
> 
>
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.6-debug.2

[PATCH] powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask

2019-01-28 Thread Reza Arbab

In htab_convert_pte_flags(), _PAGE_CACHE_CTL is used to check for the
_PAGE_SAO flag:

  else if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_SAO)
  rflags |= (HPTE_R_W | HPTE_R_I | HPTE_R_M);

But, it isn't defined to include that flag:

  #define _PAGE_CACHE_CTL (_PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)

This happens to work, but only because of the flag values:

  #define _PAGE_SAO             0x00010 /* Strong access order */
  #define _PAGE_NON_IDEMPOTENT  0x00020 /* non idempotent memory */
  #define _PAGE_TOLERANT        0x00030 /* tolerant memory, cache inhibited */

To prevent any issues if these particulars ever change, add _PAGE_SAO to
the mask.

Suggested-by: Charles Johns 
Signed-off-by: Reza Arbab 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 2e6ada2..1d97a28 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -811,7 +811,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
return hash__set_pte_at(mm, addr, ptep, pte, percpu);
 }
 
-#define _PAGE_CACHE_CTL  (_PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)
+#define _PAGE_CACHE_CTL  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)
 
 #define pgprot_noncached pgprot_noncached
 static inline pgprot_t pgprot_noncached(pgprot_t prot)
-- 
1.8.3.1
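
The "happens to work" remark in the description can be checked with a few lines of arithmetic. In this sketch the three flag values are the ones quoted in the patch description; matches_sao() is a hypothetical helper, and the value 0x40 used in the note below the code is an invented "what if SAO moved" case, not a real flag layout.

```c
#include <stdbool.h>

/* Flag values as quoted in the patch description. */
#define SAO            0x00010UL
#define NON_IDEMPOTENT 0x00020UL
#define TOLERANT       0x00030UL

/* Same comparison shape as the htab_convert_pte_flags() test quoted above. */
static bool matches_sao(unsigned long pteflags, unsigned long mask,
                        unsigned long sao)
{
    return (pteflags & mask) == sao;
}
```

Because TOLERANT is 0x30, the old mask NON_IDEMPOTENT | TOLERANT already evaluates to 0x30 and so covers the 0x10 bit. If SAO were ever moved to a bit outside 0x30 (say a hypothetical 0x40), the old mask would strip it and the comparison would never match, while a mask that includes SAO keeps working.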

Re: [PATCH 05/19] KVM: PPC: Book3S HV: add a new KVM device for the XIVE native exploitation mode

2019-01-28 Thread Cédric Le Goater

On 1/22/19 6:05 AM, Paul Mackerras wrote:
> On Mon, Jan 07, 2019 at 07:43:17PM +0100, Cédric Le Goater wrote:
>> This is the basic framework for the new KVM device supporting the XIVE
>> native exploitation mode. The user interface exposes a new capability
>> and a new KVM device to be used by QEMU.
> 
> [snip]
>> @@ -1039,7 +1039,10 @@ static int kvmppc_book3s_init(void)
>>  #ifdef CONFIG_KVM_XIVE
>>  if (xive_enabled()) {
>>  kvmppc_xive_init_module();
>> +kvmppc_xive_native_init_module();
>>  kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS);
>> +kvm_register_device_ops(&kvm_xive_native_ops,
>> +KVM_DEV_TYPE_XIVE);
> 
> I think we want tighter conditions on initializing the xive_native
> stuff and creating the xive device class.  We could have
> xive_enabled() returning true in a guest, and this code will get
> called both by PR KVM and HV KVM (and HV KVM no longer implies that we
> are running bare metal).

So yes, I gave nested a try with kernel_irqchip=on and the nested hypervisor 
(L1) obviously crashes trying to call OPAL. I have tightened the test with:

if (xive_enabled() && !kvmhv_on_pseries()) {

for now.

As this is a problem today in 5.0.x, I will send a patch for it if you think
it is correct. I don't think we should bother taking care of the PR case
on P9. Should we ? 

Thanks,

C.
 
>> @@ -1050,8 +1053,10 @@ static int kvmppc_book3s_init(void)
>>  static void kvmppc_book3s_exit(void)
>>  {
>>  #ifdef CONFIG_KVM_XICS
>> -if (xive_enabled())
>> +if (xive_enabled()) {
>>  kvmppc_xive_exit_module();
>> +kvmppc_xive_native_exit_module();
> 
> Same comment here.
> 
> Paul.
>

What is CONFIG_RTAS ? Which CPUs are concerned

2019-01-28 Thread Christophe Leroy


Hello All,

I'm wondering what CONFIG_RTAS is. It makes use of one of the SPRN_SPRG, 
ie SPRN_SPRG2.


What are the CPUs concerned by RTAS ? Is there any of the old CPUs which 
have only 4 SPRGs (eg the 601), or could we use one in SPRG4-7 for it 
and reuse SPRG2 for something else ?


The idea behind this question is to store physical address of PGDIR in 
SPRG2 and then put virtual address of thread_struct instead of its 
physical address in SPRG3, especially for when CONFIG_VMAP_STACK is set.


Thanks
Christophe

Re: [RFC 1/6] powerpc:/drc Define interface to acquire arch-specific drc info

2019-01-28 Thread Michael Bringmann

On 1/25/19 10:09 AM, Michael Bringmann wrote:
> Adding Nathan Lynch
> 
> On 1/24/19 6:04 PM, Tyrel Datwyler wrote:
>> On 12/14/2018 12:50 PM, Michael Bringmann wrote:
>>> Define interface to acquire arch-specific drc info to match against
>>> hotpluggable devices.  The current implementation exposes several
>>> pseries-specific dynamic memory properties in generic kernel code.
>>> This patch set provides an interface to pull that code out of the
>>> generic kernel.
>>>
>>> Signed-off-by: Michael Bringmann 
>>> ---
>>>  include/linux/topology.h |9 +
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>>> index cb0775e..df97f5f 100644
>>> --- a/include/linux/topology.h
>>> +++ b/include/linux/topology.h
>>> @@ -44,6 +44,15 @@
>>
>> As far as I know pseries is the only platform that uses DR connectors, and I
>> highly doubt that any other powerpc platform or arch ever will. So, I'm not 
>> sure
>> that this is really generic enough to belong in topology.h. If anything I 
>> would
>> suggest putting this in an include in arch/powerpc/include/ named something 
>> like
>> drcinfo.h or pseries-drc.h. That will make it visible to modules like rpaphp
>> that want/need to use this functionality.

It looks like the 'rpaphp' and 'rpadlpar_io' modules are also dependent upon the
powerpc platform.  Shouldn't the relevant source files be moved completely to 
the
powerpc-specific directories out of drivers/pci/hotplug as well?

drivers/pci/hotplug/Kconfig has:

config HOTPLUG_PCI_RPA
tristate "RPA PCI Hotplug driver"
depends on PPC_PSERIES && EEH
help
  Say Y here if you have a RPA system that supports PCI Hotplug.

  To compile this driver as a module, choose M here: the
  module will be called rpaphp.

  When in doubt, say N.

config HOTPLUG_PCI_RPA_DLPAR
tristate "RPA Dynamic Logical Partitioning for I/O slots"
depends on HOTPLUG_PCI_RPA
help
  Say Y here if your system supports Dynamic Logical Partitioning
  for I/O slots.

  To compile this driver as a module, choose M here: the
  module will be called rpadlpar_io.

  When in doubt, say N.

Michael

>>
>> -Tyrel
>>
>>>  
>>>  int arch_update_cpu_topology(void);
>>>  
>>> +int arch_find_drc_match(struct device_node *dn,
>>> +   bool (*usercb)(struct device_node *dn,
>>> +   u32 drc_index, char *drc_name,
>>> +   char *drc_type, u32 drc_power_domain,
>>> +   void *data),
>>> +   char *opt_drc_type, char *opt_drc_name,
>>> +   bool match_drc_index, bool ck_php_type,
>>> +   void *data);
>>> +
>>>  /* Conform to ACPI 2.0 SLIT distance definitions */
>>>  #define LOCAL_DISTANCE 10
>>>  #define REMOTE_DISTANCE 20
>>>
>>
>>
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com

Re: What is CONFIG_RTAS ? Which CPUs are concerned

2019-01-28 Thread Segher Boessenkool

On Mon, Jan 28, 2019 at 07:20:43PM +0100, Christophe Leroy wrote:
> I'm wondering what CONFIG_RTAS is. It makes use of one of the SPRN_SPRG, 
> ie SPRN_SPRG2.
> 
> What are the CPUs concerned by RTAS ? Is there any of the old CPUs which 
> have only 4 SPRGs (eg the 601), or could we use one in SPRG4-7 for it 
> and reuse SPRG2 for something else ?

RTAS (run-time abstraction services) is as old as PowerPC itself.  Yes there
is RTAS on various 6xx, and those do not have any SPRGs not defined in the
architecture.

RTAS is a feature of the firmware, or of the platform you could say.  Not a
feature of CPUs.

Segher

Re: [PATCH] powerpc/kernel/time: Remove duplicate header

2019-01-28 Thread Souptick Joarder

On Mon, Jan 28, 2019 at 9:41 PM Brajeswar Ghosh
 wrote:
>
> Remove linux/rtc.h which is included more than once
>
> Signed-off-by: Brajeswar Ghosh 

Acked-by: Souptick Joarder 

> ---
>  arch/powerpc/kernel/time.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> index 3646affae963..bc0503ef9c9c 100644
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -57,7 +57,6 @@
>  #include 
>  #include 
>  #include 
> -#include <linux/rtc.h>
>  #include 
>  #include 
>  #include 
> --
> 2.17.1
>

Re: [PATCH v2 1/5] drivers/accel: Introduce subsystem

2019-01-28 Thread Frederic Barrat





On 27/01/2019 at 05:31, Andrew Donnellan wrote:
[+ linuxppc-dev, because cxl/ocxl are handled through powerpc - please 
cc on future versions of this series]


On 26/1/19 8:13 am, Olof Johansson wrote:

We're starting to see more of these kind of devices, the current
upcoming wave will likely be around machine learning and inference
engines. A few drivers have been added to drivers/misc for this, but
it's timely to make it into a separate group of drivers/subsystem, to
make it easier to find them, and to encourage collaboration between
contributors.

Over time, we expect to build shared frameworks that the drivers will
make use of, but how that framework needs to look like to fill the needs
is still unclear, and the best way to gain that knowledge is to give the
disparate implementations a shared location.

There has been some controversy around expectations for userspace
stacks being open. The clear preference is to see that happen, and any
driver and platform stack that is delivered like that will be given
preferential treatment, and at some point in the future it might
become the requirement. Until then, the bare minimum we need is an
open low-level userspace such that the driver and HW interfaces can be
exercised if someone is modifying the driver, even if the full details
of the workload are not always available.

Bootstrapping this with myself and Greg as maintainers (since the current
drivers will be moving out of drivers/misc). Looking forward to expanding
that group over time.



[snip]


+
+Hardware offload accelerator subsystem
+======================================
+
+This is a brief overview of the subsystem (grouping) of hardware
+accelerators kept under drivers/accel
+
+Types of hardware supported
+---------------------------
+
+  The general types of hardware supported are hardware devices that has
+  general interactions of sending commands and buffers to the hardware,
+  returning completions and possible filled buffers back, together
+  with the usual driver pieces around hardware control, setup, error
+  handling, etc.
+
+  Drivers that fit into other subsystems are expected to be merged
+  there, and use the appropriate userspace interfaces of said functional
+  areas. We don't expect to see drivers for network, storage, graphics
+  and similar hardware implemented by drivers here.
+
+Expectations for contributions
+--
+
+ - Platforms and hardware that has fully open stacks, from Firmware to
+   Userspace, are always going to be given preferential treatment. These
+   platforms give the best insight for behavior and interaction of all
+   layers, including ability to improve implementation across the stack
+   over time.
+
+ - If a platform is partially proprietary, it is still expected that the
+   portions that interact with the driver can be shared in a form that allows
+   for exercising the hardware/driver and evolution of the interface over
+   time. This could be separated into a shared library and test/sample
+   programs, for example.
+
+ - Over time, there is an expectation to converge drivers over to shared
+   frameworks and interfaces. Until then, the general rule is that no
+   more than one driver per vendor will be acceptable. For vendors that
+   aren't participating in the work towards shared frameworks over time,
+   we reserve the right to phase out support for the hardware.

How exactly do generic drivers for interconnect protocols, such as
cxl/ocxl, fit in here?


cxl and ocxl are not drivers for a specific device, they are generic 
drivers which can be used with any device implementing the CAPI or 
OpenCAPI protocol respectively - many of which will be FPGA boards 
flashed with customer-designed accelerator cores for specific workloads, 
some will be accelerators using ASICs or using FPGA images supplied by 
vendors, some will be driven from userspace, others using the cxl/ocxl 
kernel API, etc.



I have the same reservation as Andrew. While my first reaction was to 
think that cxl and ocxl should be part of the accel subsystem, they 
hardly seem to fit the stated goals.
Furthermore, there are implications there, as all the distros currently 
shipping cxl and ocxl as modules on powerpc would need to have their 
config modified to enable CONFIG_ACCEL.


  Fred

Re: [PATCH 18/19] KVM: PPC: Book3S HV: add passthrough support

2019-01-28 Thread Cédric Le Goater

On 1/28/19 7:13 AM, Paul Mackerras wrote:
> On Wed, Jan 23, 2019 at 12:07:19PM +0100, Cédric Le Goater wrote:
>> On 1/23/19 11:30 AM, Paul Mackerras wrote:
>>> On Wed, Jan 23, 2019 at 05:45:24PM +1100, Benjamin Herrenschmidt wrote:
>>>> On Tue, 2019-01-22 at 16:26 +1100, Paul Mackerras wrote:
>>>>> On Mon, Jan 07, 2019 at 08:10:05PM +0100, Cédric Le Goater wrote:
>>>>>> Clear the ESB pages from the VMA of the IRQ being pass through to the
>>>>>> guest and let the fault handler repopulate the VMA when the ESB pages
>>>>>> are accessed for an EOI or for a trigger.
>>>>>
>>>>> Why do we want to do this?
>>>>>
>>>>> I don't see any possible advantage to removing the PTEs from the
>>>>> userspace mapping.  You'll need to explain further.
>>>>
>>>> Afaik bcs we change the mapping to point to the real HW irq ESB page
>>>> instead of the "IPI" that was there at VM init time.
>>
>> yes exactly. You need to clean up the pages each time.
>>  
>>> So that makes it sound like there is a whole lot going on that hasn't
>>> even been hinted at in the patch descriptions...  It sounds like we
>>> need a good description of how all this works and fits together
>>> somewhere under Documentation/.
>>
>> OK. I have started doing so for the models merged in QEMU but not yet 
>> for KVM. I will work on it.
>>
>>> In any case we need much more informative patch descriptions.  I
>>> realize that it's all currently in Cedric's head, but I bet that in
>>> two or three years' time when we come to try to debug something, it
>>> won't be in anyone's head...
>>
>> I agree. 
>>
>>
>> So, storing the ESB VMA under the KVM device is not shocking anyone ?  
> 
> Actually, now that I think of it, why can't userspace (QEMU) manage
> this using mmap()?  Based on what Ben has said, I assume there would
> be a pair of pages for each interrupt that a PCI pass-through device
> has. 

Yes. there is a pair of ESB pages per IRQ number.

> Would we end up with too many VMAs if we just used mmap() to
> change the mappings from the software-generated pages to the
> hardware-generated interrupt pages?

The sPAPR IRQ number space is 0x8000 wide now. The first 4K are 
dedicated to CPU IPIs and the remaining 4K are for devices. We can 
extend the last range if needed as these are for MSIs. Dynamic 
extensions under KVM should work also.

This is to say that we have 8K x 2 (trigger+EOI) pages. This is a
lot of mmap(), too much. Also the KVM model needs to be compatible
with the QEMU emulated one and it was simpler to have one overall
memory region for the IPI ESBs, one for the END ESBs (if we support
that one day) and one for the TIMA.

> Are the necessary pages for a PCI
> passthrough device contiguous in both host real space 

They should as they are the PHB4 ESBs.

> and guest real space ? 

also. That's how we organized the mapping. 

> If so we'd only need one mmap() for all the device's interrupt
> pages.

Ah. So we would want to make a special case for the passthrough 
device and have a mmap() and a memory region for its ESBs. Hmm.

Wouldn't that require to hot plug a memory region under the guest ? 
which means that we need to provision an address space/container 
region for these regions. What are the benefits?

Is clearing the PTEs and repopulating the VMA unsafe ? 

Thanks, 

C.

[RFC PATCH] powerpc: fix get_arch_dma_ops() for NTB devices

2019-01-28 Thread Alexander Fomichev

get_dma_ops() falls back to the arch-dependent get_arch_dma_ops(), which
historically returns NULL on PowerPC. Therefore dma_set_mask() fails.
This affects Switchtec (and probably other) NTB devices, which then fail
to initialize.
The proposed patch should fix the issue.

---
 arch/powerpc/include/asm/dma-mapping.h | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index ebf6680..cb6ac96 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -70,14 +70,11 @@ extern struct dma_map_ops dma_iommu_ops;
 #endif
 extern const struct dma_map_ops dma_nommu_ops;
 
+extern const struct dma_map_ops *get_pci_dma_ops(void);
+
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
-   /* We don't handle the NULL dev case for ISA for now. We could
-* do it via an out of line call but it is not needed for now. The
-* only ISA DMA device we support is the floppy and we have a hack
-* in the floppy driver directly to get a device for us.
-*/
-   return NULL;
+   return get_pci_dma_ops();
 }
 
 /*
-- 
2.7.4

[PATCH] configs: Get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED

2019-01-28 Thread Alexey Brodkin

This Kconfig option was removed during v4.19 development in
commit 771c035372a0 ("deprecate the '__deprecated' attribute warnings entirely and for good")
so there's no point to keep it in defconfigs any longer.

FWIW defconfigs were patched with:
--->8--
find . -name *_defconfig -exec sed -i '/CONFIG_ENABLE_WARN_DEPRECATED/d' {} \;
--->8--

Signed-off-by: Alexey Brodkin 
Cc: Jonathan Corbet 
Cc: Federico Vaga 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Florian Fainelli 
Cc: Ray Jui 
Cc: Scott Branden 
Cc: bcm-kernel-feedback-l...@broadcom.com
Cc: Eric Anholt 
Cc: Stefan Wahren 
Cc: "Uwe Kleine-Konig" 
Cc: Vladimir Zapolskiy 
Cc: Liviu Dudau 
Cc: Sudeep Holla 
Cc: Lorenzo Pieralisi 
Cc: Maxime Coquelin 
Cc: Alexandre Torgue 
Cc: Yoshinori Sato 
Cc: Geert Uytterhoeven 
Cc: Ley Foon Tan 
Cc: Jonas Bonn 
Cc: Stefan Kristiansson 
Cc: Stafford Horne 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Rich Felker 
Cc: "David S. Miller" 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: Miguel Ojeda 
Cc: Andrew Morton 
Cc: Alessia Mantegazza 
Cc: Kevin Hilman 
Cc: Eugeniy Paltsev 
Cc: Anders Roxell 
Cc: Linus Walleij 
Cc: Arnd Bergmann 
Cc: Patrice Chotard 
Cc: Krzysztof Kozlowski 
Cc: Bjorn Helgaas 
Cc: Paul Burton 
Cc: Adam Borowski 
Cc: linux-...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-rpi-ker...@lists.infradead.org
Cc: linux-st...@st-md-mailman.stormreply.com
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-m...@lists.linux-m68k.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
---
 Documentation/process/4.Coding.rst| 2 +-
 Documentation/translations/it_IT/process/4.Coding.rst | 2 +-
 arch/arc/configs/axs101_defconfig | 1 -
 arch/arc/configs/axs103_defconfig | 1 -
 arch/arc/configs/axs103_smp_defconfig | 1 -
 arch/arc/configs/haps_hs_defconfig| 1 -
 arch/arc/configs/haps_hs_smp_defconfig| 1 -
 arch/arc/configs/hsdk_defconfig   | 1 -
 arch/arc/configs/nps_defconfig| 1 -
 arch/arc/configs/nsim_700_defconfig   | 1 -
 arch/arc/configs/nsim_hs_defconfig| 1 -
 arch/arc/configs/nsim_hs_smp_defconfig| 1 -
 arch/arc/configs/nsimosci_defconfig   | 1 -
 arch/arc/configs/nsimosci_hs_defconfig| 1 -
 arch/arc/configs/nsimosci_hs_smp_defconfig| 1 -
 arch/arc/configs/tb10x_defconfig  | 1 -
 arch/arc/configs/vdk_hs38_defconfig   | 1 -
 arch/arc/configs/vdk_hs38_smp_defconfig   | 1 -
 arch/arm/configs/bcm2835_defconfig| 1 -
 arch/arm/configs/cns3420vb_defconfig  | 1 -
 arch/arm/configs/efm32_defconfig  | 1 -
 arch/arm/configs/eseries_pxa_defconfig| 1 -
 arch/arm/configs/gemini_defconfig | 1 -
 arch/arm/configs/lpc18xx_defconfig| 1 -
 arch/arm/configs/mini2440_defconfig   | 1 -
 arch/arm/configs/moxart_defconfig | 1 -
 arch/arm/configs/mps2_defconfig   | 1 -
 arch/arm/configs/nuc910_defconfig | 1 -
 arch/arm/configs/nuc950_defconfig | 1 -
 arch/arm/configs/nuc960_defconfig | 1 -
 arch/arm/configs/stm32_defconfig  | 1 -
 arch/h8300/configs/edosk2674_defconfig| 1 -
 arch/h8300/configs/h8300h-sim_defconfig   | 1 -
 arch/h8300/configs/h8s-sim_defconfig  | 1 -
 arch/m68k/configs/amcore_defconfig| 1 -
 arch/m68k/configs/stmark2_defconfig   | 1 -
 arch/nios2/configs/10m50_defconfig| 1 -
 arch/nios2/configs/3c120_defconfig| 1 -
 arch/openrisc/configs/or1ksim_defconfig   | 1 -
 arch/openrisc/configs/simple_smp_defconfig| 1 -
 arch/powerpc/configs/mpc512x_defconfig| 1 -
 arch/powerpc/configs/ppc6xx_defconfig | 1 -
 arch/sh/configs/apsh4a3a_defconfig| 1 -
 arch/sh/configs/edosk7705_defconfig   | 1 -
 arch/sh/configs/espt_defconfig| 1 -
 arch/sh/configs/sdk7786_defconfig | 1 -
 arch/sh/configs/sh2007_defconfig  | 1 -
 arch/sh/configs/sh7724_generic_defconfig  | 1 -
 arch/sh/configs/sh7763rdp_defconfig   | 1 -
 arch/sh/configs/sh7770_generic_defconfig  | 1 -
 arch/sh/configs/sh7785lcr_defconfig   | 1 -
 arch/sh/configs/ul2_defconfig

[PATCH] powerpc/kernel/time: Remove duplicate header

2019-01-28 Thread Brajeswar Ghosh

Remove linux/rtc.h which is included more than once

Signed-off-by: Brajeswar Ghosh 
---
 arch/powerpc/kernel/time.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 3646affae963..bc0503ef9c9c 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -57,7 +57,6 @@
 #include 
 #include 
 #include 
-#include <linux/rtc.h>
 #include 
 #include 
 #include 
-- 
2.17.1

Re: [PATCH] ucc_geth: Reset BQL queue when stopping device

2019-01-28 Thread Li Yang

On Mon, Jan 28, 2019 at 8:37 AM Mathias Thore
 wrote:
>
> Hi,
>
>
> This is what we observed: there was a storm on the medium so that our 
> controller could not do its TX, resulting in timeout. When timeout occurs, 
> the driver clears all descriptors from the TX queue. The function called in 
> this patch is used to reflect this clearing also in the BQL layer. Without 
> it, the controller would get stuck, unable to perform TX, even several 
> minutes after the storm had ended. Bringing the device down and then up again 
> would solve the problem, but this patch also solves it automatically.

The explanation makes sense.  So this should only be required in the
timeout scenario instead of other clean up scenarios like device
shutdown?  If so, it would probably be better to do it in
ucc_geth_timeout_work()?

>
>
> Some other drivers do the same, for example e1000e driver calls 
> netdev_reset_queue in its e1000_clean_tx_ring function. It is possible that 
> other drivers should do the same; I have no way of verifying this.
>
>
> Regards,
>
> Mathias
>
> --
>
>
> From: Christophe Leroy 
> Sent: Monday, January 28, 2019 10:48 AM
> To: Mathias Thore; leoyang...@nxp.com; net...@vger.kernel.org; 
> linuxppc-dev@lists.ozlabs.org; David Gounaris; Joakim Tjernlund
> Subject: Re: [PATCH] ucc_geth: Reset BQL queue when stopping device
>
>
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
>
>
> Hi,
>
> On 28/01/2019 at 10:07, Mathias Thore wrote:
> > After a timeout event caused by for example a broadcast storm, when
> > the MAC and PHY are reset, the BQL TX queue needs to be reset as
> > well. Otherwise, the device will exhibit severe performance issues
> > even after the storm has ended.
>
> What are the symptoms?
>
> Is this reset needed on any network driver in that case, or is it
> something particular for the ucc_geth ?
> For instance, the freescale fs_enet doesn't have that reset. Should it
> have it too ?
>
> Christophe
>
> >
> > Co-authored-by: David Gounaris 
> > Signed-off-by: Mathias Thore 
> > ---
> >   drivers/net/ethernet/freescale/ucc_geth.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
> > index c3d539e209ed..eb3e65e8868f 100644
> > --- a/drivers/net/ethernet/freescale/ucc_geth.c
> > +++ b/drivers/net/ethernet/freescale/ucc_geth.c
> > @@ -1879,6 +1879,8 @@ static void ucc_geth_free_tx(struct ucc_geth_private *ugeth)
> >   u16 i, j;
> >   u8 __iomem *bd;
> >
> > + netdev_reset_queue(ugeth->ndev);
> > +
> >   ug_info = ugeth->ug_info;
> >   uf_info = &ug_info->uf_info;
> >
> >
>
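
The stall Mathias describes can be pictured with a toy byte-accounting model. This is a deliberately simplified sketch: struct bql_model and the bql_* helpers are hypothetical and much simpler than the kernel's dynamic queue limits code. It only assumes that transmission is allowed while the bytes in flight stay below a limit, and that a reset clears the counters the way netdev_reset_queue() is described as doing in this thread.

```c
#include <stdbool.h>

/* Toy in-flight byte accounting; hypothetical, not the kernel's dql code. */
struct bql_model {
    unsigned int queued;      /* bytes reported as handed to hardware */
    unsigned int completed;   /* bytes reported as completed */
    unsigned int limit;       /* maximum bytes allowed in flight */
};

static bool bql_can_send(const struct bql_model *q)
{
    return q->queued - q->completed < q->limit;
}

static void bql_sent(struct bql_model *q, unsigned int bytes)
{
    q->queued += bytes;
}

/* Models the effect of resetting the queue accounting. */
static void bql_reset(struct bql_model *q)
{
    q->queued = 0;
    q->completed = 0;
}
```

If the driver throws away its TX descriptors after a timeout without reporting completions, the model stays at the limit and bql_can_send() keeps returning false; calling bql_reset() at that point lets transmission resume.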

[patch-next] KVM: PPC: Book3S HV: Use kzalloc_node

2019-01-28 Thread Christopher Diaz Riveros

Fixes coccinelle warning:

/arch/powerpc/kvm/book3s_hv.c:5345:3-15: WARNING: kzalloc_node should be used for sibling_subcore_state, instead of kmalloc_node/memset

Signed-off-by: Christopher Diaz Riveros 
---
 arch/powerpc/kvm/book3s_hv.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 5a066fc299e1..fc59bfa892c9 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -5342,14 +5342,11 @@ static int kvm_init_subcore_bitmap(void)
continue;
 
sibling_subcore_state =
-   kmalloc_node(sizeof(struct sibling_subcore_state),
+   kzalloc_node(sizeof(struct sibling_subcore_state),
GFP_KERNEL, node);
if (!sibling_subcore_state)
return -ENOMEM;
 
-   memset(sibling_subcore_state, 0,
-   sizeof(struct sibling_subcore_state));
-
for (j = 0; j < threads_per_core; j++) {
int cpu = first_cpu + j;
 
-- 
2.20.1

Re: [PATCH] powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask

2019-01-28 Thread Alexey Kardashevskiy




On 29/01/2019 04:31, Reza Arbab wrote:
> In htab_convert_pte_flags(), _PAGE_CACHE_CTL is used to check for the
> _PAGE_SAO flag:
> 
>   else if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_SAO)
>   rflags |= (HPTE_R_W | HPTE_R_I | HPTE_R_M);
> 
> But, it isn't defined to include that flag:
> 
>   #define _PAGE_CACHE_CTL (_PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)
> 
> This happens to work, but only because of the flag values:
> 
>   #define _PAGE_SAO             0x00010 /* Strong access order */
>   #define _PAGE_NON_IDEMPOTENT  0x00020 /* non idempotent memory */
>   #define _PAGE_TOLERANT        0x00030 /* tolerant memory, cache inhibited */
> 
> To prevent any issues if these particulars ever change, add _PAGE_SAO to
> the mask.


This does not feel right, doing

#define _PAGE_CACHE_CTL 0x30

would make more sense as SAO/NI/TOLERANT is enum so applying "|" to them
just seems wrong.


> 
> Suggested-by: Charles Johns 
> Signed-off-by: Reza Arbab 
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 2e6ada2..1d97a28 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -811,7 +811,7 @@ static inline void __set_pte_at(struct mm_struct *mm, 
> unsigned long addr,
>   return hash__set_pte_at(mm, addr, ptep, pte, percpu);
>  }
>  
> -#define _PAGE_CACHE_CTL  (_PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)
> +#define _PAGE_CACHE_CTL  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)
>  
>  #define pgprot_noncached pgprot_noncached
>  static inline pgprot_t pgprot_noncached(pgprot_t prot)
> 

-- 
Alexey

Re: [PATCH 18/19] KVM: PPC: Book3S HV: add passthrough support

2019-01-28 Thread Paul Mackerras

On Mon, Jan 28, 2019 at 07:26:00PM +0100, Cédric Le Goater wrote:
> On 1/28/19 7:13 AM, Paul Mackerras wrote:
> > Would we end up with too many VMAs if we just used mmap() to
> > change the mappings from the software-generated pages to the
> > hardware-generated interrupt pages?  
> The sPAPR IRQ number space is 0x8000 wide now. The first 4K are 
> dedicated to CPU IPIs and the remaining 4K are for devices. We can 

Confused.  You say the number space has 32768 entries but then imply
there are only 8K entries.  Do you mean that the API allows for 15-bit
IRQ numbers but we are only making use of 8192 of them?

> extend the last range if needed as these are for MSIs. Dynamic 
> extensions under KVM should work also.
> 
> This is to say that we have 8K x 2 (trigger+EOI) pages. This is a
> lot of mmap(), too much. Also the KVM model needs to be compatible

I wasn't suggesting an mmap per IRQ, I meant that the bulk of the
space would be covered by a single mmap, overlaid by subsequent mmaps
where we need to map real device interrupts.
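
As a rough userspace sketch of that layout (an illustration only, not
the KVM/XIVE code): one mmap() covers the whole range, and a later
mmap() with MAP_FIXED replaces a single page inside it. Anonymous
memory stands in here for both the software-generated pages and the
device pages; in the real case the overlay would come from a device
file descriptor. The helper names map_region() and overlay_page() are
made up for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Cover the whole range with one anonymous mapping of npages pages. */
static unsigned char *map_region(size_t npages, size_t pagesz)
{
	void *p = mmap(NULL, npages * pagesz, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Overlay page 'idx' of an existing region with a fresh mapping.
 * MAP_FIXED replaces whatever was mapped at that address; the rest
 * of the region is left untouched. Returns 0 on success, -1 on error.
 */
static int overlay_page(unsigned char *base, size_t idx, size_t pagesz)
{
	void *p = mmap(base + idx * pagesz, pagesz, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

	return p == MAP_FAILED ? -1 : 0;
}
```

Each overlay that is not contiguous with a neighbouring one of the same
kind costs an extra VMA, so the VMA count scales with the number of
distinct passthrough ranges rather than with the number of IRQs.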

> with the QEMU emulated one and it was simpler to have one overall
> memory region for the IPI ESBs, one for the END ESBs (if we support
> that one day) and one for the TIMA.
> 
> > Are the necessary pages for a PCI
> > passthrough device contiguous in both host real space 
> 
> They should as they are the PHB4 ESBs.
> 
> > and guest real space ? 
> 
> also. That's how we organized the mapping. 

"How we organized the mapping" is a significant design decision that I
haven't seen documented anywhere, and is really needed for
understanding what's going on.

> 
> > If so we'd only need one mmap() for all the device's interrupt
> > pages.
> 
> Ah. So we would want to make a special case for the passthrough 
> device and have a mmap() and a memory region for its ESBs. Hmm.
> 
> Wouldn't that require to hot plug a memory region under the guest ? 

No; the way that a memory region works is that userspace can do
whatever disparate mappings it likes within the region on the user
process side, and the corresponding region of guest real address space
follows that automatically.

> which means that we need to provision an address space/container 
> region for these regions. What are the benefits ? 
> 
> Is clearing the PTEs and repopulating the VMA unsafe ? 

Explicitly unmapping parts of the VMA seems like the wrong way to do
it.  If you have a device mmap where the device wants to change the
physical page underlying parts of the mapping, there should be a way
for it to do that explicitly (but I don't know off the top of my head
what the interface to do that is).

However, I still haven't seen a clear and concise explanation of what
is being changed, when, and why we need to do that.

Paul.

Re: [PATCH 18/19] KVM: PPC: Book3S HV: add passthrough support

2019-01-28 Thread Paul Mackerras

On Mon, Jan 28, 2019 at 07:26:00PM +0100, Cédric Le Goater wrote:
> 
> Is clearing the PTEs and repopulating the VMA unsafe ? 

Actually, now that I come to think of it, there could be any number of
VMAs (well, up to almost 64k of them), since once you have a file
descriptor you can call mmap on it multiple times.

The more I think about it, the more I think that getting userspace to
manage the mappings with mmap() and munmap() is the right idea if it
can be made to work.

Paul.

Re: [RFC PATCH 1/2] powerpc/powernv: Add support for CXL mode switch that need PHB reset

2019-01-28 Thread Vaibhav Jain

Thanks for reviewing this patch Christophe,

christophe lombard  writes:

>> 
>>  pe = pnv_ioda_get_pe(dev);
>>  if (!pe)
>> -return -ENODEV;
>> +return -ENOENT;
>
> The return code of pnv_phb_to_cxl_mode() is also returned by an API in 
> the cxllib library. So, hoping that nobody tests the value !!

Agreed. I did peek into cxllib_switch_phb_mode() before sending the
patch and saw two conflicting cases. While switching to CXL_MODE_PCI we
make sure that we return kernel error codes and while switching to
CXL_MODE_CXL we return OPAL error codes.

I haven't seen how CX5 handles return values from this function, but I am
betting that it's the usual zero & non-zero return value check, which
then should work with the proposed change.

-- 
Vaibhav Jain 
Linux Technology Center, IBM India Pvt. Ltd.

[PATCH v3] cxl: Wrap iterations over afu slices inside 'afu_list_lock'

2019-01-28 Thread Vaibhav Jain

Within cxl module, iteration over array 'adapter->slices' may be racy
at few points as it might be simultaneously read during an EEH and its
contents being set to NULL while driver is being unloaded or unbound
from the adapter. This might result in a NULL pointer to 'struct afu'
being de-referenced during an EEH thereby causing a kernel oops.

This patch fixes this by making sure that all access to the array
'adapter->slices' is wrapped within the context of spin-lock
'adapter->afu_list_lock'.

Signed-off-by: Vaibhav Jain 
---
Changelog:

v3:
* Updated a slice loop in cxl_pci_error_detected() to ignore NULL
  slices [Fred]
* Added a NULL AFU check in cxl_pci_slot_reset() [Fred]

v2:
* Fixed a wrong comparison of non-null pointer [Fred]
* Moved a call to cxl_vphb_error_detected() within a branch that
  checks for not null AFU pointer in 'adapter->slices' [Fred]
* Removed a misleading comment in code.
---
 drivers/misc/cxl/guest.c |  2 ++
 drivers/misc/cxl/pci.c   | 39 ++-
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c
index 5d28d9e454f5..08f4a512afad 100644
--- a/drivers/misc/cxl/guest.c
+++ b/drivers/misc/cxl/guest.c
@@ -267,6 +267,7 @@ static int guest_reset(struct cxl *adapter)
int i, rc;
 
pr_devel("Adapter reset request\n");
+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
if ((afu = adapter->afu[i])) {
pci_error_handlers(afu, CXL_ERROR_DETECTED_EVENT,
@@ -283,6 +284,7 @@ static int guest_reset(struct cxl *adapter)
pci_error_handlers(afu, CXL_RESUME_EVENT, 0);
}
}
+   spin_unlock(&adapter->afu_list_lock);
return rc;
 }
 
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index c79ba1c699ad..300531d6136f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1805,7 +1805,7 @@ static pci_ers_result_t cxl_vphb_error_detected(struct 
cxl_afu *afu,
/* There should only be one entry, but go through the list
 * anyway
 */
-   if (afu->phb == NULL)
+   if (afu == NULL || afu->phb == NULL)
return result;
 
list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
@@ -1832,7 +1832,8 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 {
struct cxl *adapter = pci_get_drvdata(pdev);
struct cxl_afu *afu;
-   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET, afu_result;
+   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
+   pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET;
int i;
 
/* At this point, we could still have an interrupt pending.
@@ -1843,6 +1844,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 
/* If we're permanently dead, give up. */
if (state == pci_channel_io_perm_failure) {
+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
/*
@@ -1851,6 +1853,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 */
cxl_vphb_error_detected(afu, state);
}
+   spin_unlock(&adapter->afu_list_lock);
return PCI_ERS_RESULT_DISCONNECT;
}
 
@@ -1932,11 +1935,17 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 * * In slot_reset, free the old resources and allocate new ones.
 * * In resume, clear the flag to allow things to start.
 */
+
+   /* Make sure no one else changes the afu list */
+   spin_lock(&adapter->afu_list_lock);
+
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
 
-   afu_result = cxl_vphb_error_detected(afu, state);
+   if (afu == NULL)
+   continue;
 
+   afu_result = cxl_vphb_error_detected(afu, state);
cxl_context_detach_all(afu);
cxl_ops->afu_deactivate_mode(afu, afu->current_mode);
pci_deconfigure_afu(afu);
@@ -1948,6 +1957,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 (result == PCI_ERS_RESULT_NEED_RESET))
result = PCI_ERS_RESULT_NONE;
}
+   spin_unlock(&adapter->afu_list_lock);
 
/* should take the context lock here */
if (cxl_adapter_context_lock(adapter) != 0)
@@ -1980,14 +1990,18 @@ static pci_ers_result_t cxl_pci_slot_reset(struct 
pci_dev *pdev)
 */
cxl_adapter_context_unlock(adapter);
 
+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
 
+   if (afu == NULL)
+

Re: [PATCH v2] cxl: Wrap iterations over afu slices inside 'afu_list_lock'

2019-01-28 Thread Vaibhav Jain

Thanks for reviewing this patch Fred. I have addressed all your review
comments in v3 of this patch.

-- 
Vaibhav Jain 
Linux Technology Center, IBM India Pvt. Ltd.

Re: [PATCH v3] cxl: Wrap iterations over afu slices inside 'afu_list_lock'

2019-01-28 Thread Andrew Donnellan


On 29/1/19 5:15 pm, Vaibhav Jain wrote:

Within cxl module, iteration over array 'adapter->slices' may be racy


adapter->slices isn't an array, adapter->afu is the array.
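
For readers following along, the distinction can be sketched in plain
userspace C (an illustration under assumptions, not the driver code):
'slices' is a count, 'afu' is the array whose entries may be set to
NULL, and the walk is done with the list lock held. A pthread mutex
stands in for the kernel spin-lock, and MAX_SLICES, the struct layouts
and count_live_afus() are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_SLICES 4	/* hypothetical bound, for the sketch only */

struct afu {
	int id;
};

struct adapter {
	int slices;			/* number of slots: a count, not an array */
	struct afu *afu[MAX_SLICES];	/* the array; entries may be NULL */
	pthread_mutex_t afu_list_lock;	/* stand-in for the kernel spin-lock */
};

/* Walk adapter->afu[] with the lock held, skipping NULL slots. */
static int count_live_afus(struct adapter *ad)
{
	int i, live = 0;

	pthread_mutex_lock(&ad->afu_list_lock);
	for (i = 0; i < ad->slices; i++) {
		struct afu *afu = ad->afu[i];

		if (afu == NULL)
			continue;	/* slot already torn down */
		live++;
	}
	pthread_mutex_unlock(&ad->afu_list_lock);

	return live;
}
```

The same shape (take the lock, index afu[] up to slices, skip NULL)
is what the loops in the patch below follow.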


at few points as it might be simultaneously read during an EEH and its
contents being set to NULL while driver is being unloaded or unbound
from the adapter. This might result in a NULL pointer to 'struct afu'
being de-referenced during an EEH thereby causing a kernel oops.

This patch fixes this by making sure that all access to the array
'adapter->slices' is wrapped within the context of spin-lock
'adapter->afu_list_lock'.

Signed-off-by: Vaibhav Jain 


Does this need to go to stable? (I'm guessing we've been hitting actual 
Oopses?)


Acked-by: Andrew Donnellan 


---
Changelog:

v3:
* Updated a slice loop in cxl_pci_error_detected() to ignore NULL
   slices [Fred]
* Added a NULL AFU check in cxl_pci_slot_reset() [Fred]

v2:
* Fixed a wrong comparison of non-null pointer [Fred]
* Moved a call to cxl_vphb_error_detected() within a branch that
   checks for not null AFU pointer in 'adapter->slices' [Fred]
* Removed a misleading comment in code.
---
  drivers/misc/cxl/guest.c |  2 ++
  drivers/misc/cxl/pci.c   | 39 ++-
  2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c
index 5d28d9e454f5..08f4a512afad 100644
--- a/drivers/misc/cxl/guest.c
+++ b/drivers/misc/cxl/guest.c
@@ -267,6 +267,7 @@ static int guest_reset(struct cxl *adapter)
int i, rc;
  
  	pr_devel("Adapter reset request\n");

+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
if ((afu = adapter->afu[i])) {
pci_error_handlers(afu, CXL_ERROR_DETECTED_EVENT,
@@ -283,6 +284,7 @@ static int guest_reset(struct cxl *adapter)
pci_error_handlers(afu, CXL_RESUME_EVENT, 0);
}
}
+   spin_unlock(&adapter->afu_list_lock);
return rc;
  }
  
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c

index c79ba1c699ad..300531d6136f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1805,7 +1805,7 @@ static pci_ers_result_t cxl_vphb_error_detected(struct 
cxl_afu *afu,
/* There should only be one entry, but go through the list
 * anyway
 */
-   if (afu->phb == NULL)
+   if (afu == NULL || afu->phb == NULL)
return result;
  
  	list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {

@@ -1832,7 +1832,8 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
  {
struct cxl *adapter = pci_get_drvdata(pdev);
struct cxl_afu *afu;
-   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET, afu_result;
+   pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET;
+   pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET;
int i;
  
  	/* At this point, we could still have an interrupt pending.

@@ -1843,6 +1844,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
  
  	/* If we're permanently dead, give up. */

if (state == pci_channel_io_perm_failure) {
+   spin_lock(&adapter->afu_list_lock);
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
/*
@@ -1851,6 +1853,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 */
cxl_vphb_error_detected(afu, state);
}
+   spin_unlock(&adapter->afu_list_lock);
return PCI_ERS_RESULT_DISCONNECT;
}
  
@@ -1932,11 +1935,17 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,

 * * In slot_reset, free the old resources and allocate new ones.
 * * In resume, clear the flag to allow things to start.
 */
+
+   /* Make sure no one else changes the afu list */
+   spin_lock(&adapter->afu_list_lock);
+
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
  
-		afu_result = cxl_vphb_error_detected(afu, state);

+   if (afu == NULL)
+   continue;
  
+		afu_result = cxl_vphb_error_detected(afu, state);

cxl_context_detach_all(afu);
cxl_ops->afu_deactivate_mode(afu, afu->current_mode);
pci_deconfigure_afu(afu);
@@ -1948,6 +1957,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct 
pci_dev *pdev,
 (result == PCI_ERS_RESULT_NEED_RESET))
result = PCI_ERS_RESULT_NONE;
}
+   spin_unlock(&adapter->afu_list_lock);
  
  	/* should take the context lock here */

if (cxl_adapter_context_lock(adapter) != 0)
@@ -1980,14 +1990,18 @@ static pci_ers_result_t cxl_pci_slot_reset(struct 
pci_dev *pdev)
 */
cxl_adapter_context

Re: [RFC PATCH 1/2] powerpc/powernv: Add support for CXL mode switch that need PHB reset

2019-01-28 Thread Vaibhav Jain

Frederic Barrat  writes:
>> +opal_poll_events(NULL);
>
> Why is a call to opal_poll_events() needed?

Trying to make sure that the OPAL pollers are run on the current CPU and
that any OPAL timers are executed.

-- 
Vaibhav Jain 
Linux Technology Center, IBM India Pvt. Ltd.
