[PATCH v3] powerpc/eeh: avoid possible crash when edev->pdev changes

2024-06-17 Thread Ganesh Goudar
If a PCI device is removed during eeh_pe_report_edev(), edev->pdev will change and can cause a crash, hold the PCI rescan/remove lock while taking a copy of edev->pdev->bus. Signed-off-by: Ganesh Goudar --- v2: Hold rescan lock till we get the bus address. v3: Now that we are taking co
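The locking pattern this patch describes can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel code: `struct eeh_dev_model`, `struct pci_dev_model`, `struct pci_bus_model`, `rescan_remove_lock()` and `rescan_remove_unlock()` are hypothetical stand-ins, and the lock is modeled by a depth counter rather than a real mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct pci_bus_model { int number; };
struct pci_dev_model { struct pci_bus_model *bus; };
struct eeh_dev_model { struct pci_dev_model *pdev; };

/* Models the PCI rescan/remove lock as a depth counter (assumption:
 * a real implementation would use a proper mutex). */
static int lock_depth;
static void rescan_remove_lock(void)   { lock_depth++; }
static void rescan_remove_unlock(void) { lock_depth--; }

/*
 * Copy edev->pdev->bus while holding the lock, so that a concurrent
 * removal cannot change edev->pdev between the NULL check and the
 * dereference. Returns NULL if the device is already gone.
 */
static struct pci_bus_model *snapshot_bus(struct eeh_dev_model *edev)
{
	struct pci_bus_model *bus = NULL;

	assert(edev != NULL);

	rescan_remove_lock();
	if (edev->pdev)
		bus = edev->pdev->bus;
	rescan_remove_unlock();

	return bus;	/* callers use this copy, not edev->pdev */
}
```

The point of the sketch is only the ordering: the NULL check and the dereference both happen inside the locked region, and later code works from the copied pointer.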

[PATCH v2 0/1] Parallel EEH recovery between PHBs

2024-06-17 Thread Ganesh Goudar
y. On powernv the improvement is not so significant. Ganesh Goudar (1): powerpc/eeh: Enable PHBs to recovery in parallel arch/powerpc/include/asm/eeh_event.h | 7 arch/powerpc/include/asm/pci-bridge.h | 4 ++ arch/powerpc/kernel/eeh_driver.c | 27 +++- arch/powerpc/k

[PATCH v2 1/1] powerpc/eeh: Enable PHBs to recovery in parallel

2024-06-17 Thread Ganesh Goudar
. Signed-off-by: Ganesh Goudar --- v2: Include missing hunk, which modifies __eeh_send_failure_event. --- arch/powerpc/include/asm/eeh_event.h | 7 arch/powerpc/include/asm/pci-bridge.h | 4 ++ arch/powerpc/kernel/eeh_driver.c | 27 +++- arch/powerpc/kernel/eeh_event.c | 59

[PATCH v2] powerpc/eeh: avoid possible crash when edev->pdev changes

2024-06-13 Thread Ganesh Goudar
If a PCI device is removed during eeh_pe_report_edev(), edev->pdev will change and can cause a crash, hold the PCI rescan/remove lock while taking a copy of edev->pdev. Signed-off-by: Ganesh Goudar --- v2: Hold rescan lock till we get the bus address. --- arch/powerpc/kernel/eeh_pe

[PATCH] powerpc/eeh: avoid possible crash when edev->pdev changes

2024-05-27 Thread Ganesh Goudar
If a PCI device is removed during eeh_pe_report_edev(), edev->pdev will change and can cause a crash, hold the PCI rescan/remove lock while taking a copy of edev->pdev. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/eeh_pe.c | 2 ++ 1 file changed, 2 insertions(+) diff --git

[PATCH v2] powerpc/eeh: Permanently disable the removed device

2024-04-22 Thread Ganesh Goudar
nt if the state is not moved to permanent failure state. Signed-off-by: Ganesh Goudar --- V2: * Elaborate the commit message. * Fix formatting issues in commit message and comments. --- arch/powerpc/kernel/eeh.c| 11 ++- arch/powerpc/kernel/eeh_driver.c | 13 +++-- 2 files ch

[PATCH] powerpc/eeh: Permanently disable the removed device

2024-04-05 Thread Ganesh Goudar
like failover. Permanently disable the device if the presence check fails. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/eeh.c| 4 +++- arch/powerpc/kernel/eeh_driver.c | 8 +++- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/po

[PATCH 1/1] powerpc/eeh: Enable PHBs to recovery in parallel

2024-02-25 Thread Ganesh Goudar
. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/eeh_event.h | 7 + arch/powerpc/include/asm/pci-bridge.h | 4 +++ arch/powerpc/kernel/eeh_driver.c | 27 +-- arch/powerpc/kernel/eeh_event.c | 38 ++- arch/powerpc/kernel/eeh_pe.c

[PATCH 0/1] Parallel EEH recovery between PHBs

2024-02-25 Thread Ganesh Goudar
y. On powernv the improvement is not so significant. Ganesh Goudar (1): powerpc/eeh: Enable PHBs to recovery in parallel arch/powerpc/include/asm/eeh_event.h | 7 + arch/powerpc/include/asm/pci-bridge.h | 4 +++ arch/powerpc/kernel/eeh_driver.c | 27 +-- arch/powerpc/k

[RFC PATCH v2 3/3] powerpc/eeh: Asynchronous recovery

2023-07-24 Thread Ganesh Goudar
n the constraint, above, the driver handlers are called by traversing the tree of affected PEs from the top, stopping to call handlers (in parallel) when a PE with devices is discovered. When the calls for that PE are complete, traversal continues at each child PE. Signed-off-by: Ganesh Goudar ---
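The traversal order described in this snippet can be sketched as a small tree walk in userspace C. Everything here is an assumption made for illustration: `struct pe_model`, `pe_walk()`, `record_visit()` and `struct visit_log` are hypothetical names, and the handlers are called serially, whereas the patch describes calling them in parallel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical model of a PE tree node. */
struct pe_model {
	int id;
	int ndevices;				/* devices directly in this PE */
	int nchildren;
	struct pe_model *child[MAX_CHILDREN];
};

/* Callback standing in for "call the driver handlers for this PE". */
typedef void (*pe_handler_t)(struct pe_model *pe, void *ctx);

/*
 * Walk the tree from the top. When a PE with devices is found, run the
 * handler for it first; only once that returns does the walk continue
 * into each child PE. PEs without devices are simply passed through.
 */
static void pe_walk(struct pe_model *pe, pe_handler_t handler, void *ctx)
{
	assert(pe != NULL && handler != NULL);

	if (pe->ndevices > 0)
		handler(pe, ctx);

	for (int i = 0; i < pe->nchildren; i++)
		pe_walk(pe->child[i], handler, ctx);
}

/* Example handler: records the order in which PEs were handled. */
struct visit_log { int ids[16]; int count; };

static void record_visit(struct pe_model *pe, void *ctx)
{
	struct visit_log *log = ctx;

	if (log->count < 16)
		log->ids[log->count++] = pe->id;
}
```

With `record_visit()` as the handler, a parent PE that has devices always appears in the log before any of its descendants.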

[RFC PATCH v2 2/3] powerpc/eeh: Provide a unique ID for each EEH recovery

2023-07-24 Thread Ganesh Goudar
Based on the original work from Sam Bobroff. Give a unique ID to each recovery event, to ease log parsing and prepare for parallel recovery. Also add some new messages with a very simple format that may be useful to log-parsers. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm
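One way to picture a per-recovery ID is a shared counter plus a fixed log prefix. The sketch below is userspace C under stated assumptions: `next_recovery_id`, `recovery_id_alloc()`, `recovery_log()` and the `EEH(<id>): ` prefix are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical global counter; every recovery takes the next value. */
static atomic_uint next_recovery_id;

/* Returns an ID that no other caller receives until the counter wraps. */
static unsigned int recovery_id_alloc(void)
{
	return atomic_fetch_add(&next_recovery_id, 1);
}

/*
 * Format a log line with a fixed "EEH(<id>): " prefix so a log parser can
 * group messages by recovery. The prefix format is an assumption made for
 * this sketch, not the format used by the patch.
 */
static int recovery_log(char *buf, size_t len, unsigned int id, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	return snprintf(buf, len, "EEH(%u): %s", id, msg);
}
```

Using an atomic increment means two recoveries running at the same time still receive different IDs, which is what makes the prefix useful once recovery is no longer serialized.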

[RFC PATCH v2 1/3] powerpc/eeh: Synchronization for safety

2023-07-24 Thread Ganesh Goudar
blocking may be required. Care must be taken when ordering these locks against the PCI rescan/remove lock and the device locks to avoid deadlocking. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/eeh.h | 12 +- arch/powerpc/kernel/eeh.c| 112

[RFC PATCH v2 0/3] Asynchronous EEH recovery

2023-07-24 Thread Ganesh Goudar
, Please comment. Thanks. V2: * Since we now have event list per phb, Have per phb event list lock. * Appropriate names given to the locks. * Remove stale comments (few more to be removed). * Initialize event_id to 0 instead of 1. * And some cosmetic changes. Ganesh Goudar (3): powerpc/eeh

[RFC 2/3] powerpc/eeh: Provide a unique ID for each EEH recovery

2023-06-12 Thread Ganesh Goudar
Based on the original work from Sam Bobroff. Give a unique ID to each recovery event, to ease log parsing and prepare for parallel recovery. Also add some new messages with a very simple format that may be useful to log-parsers. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm

[RFC 3/3] powerpc/eeh: Asynchronous recovery

2023-06-12 Thread Ganesh Goudar
n the constraint, above, the driver handlers are called by traversing the tree of affected PEs from the top, stopping to call handlers (in parallel) when a PE with devices is discovered. When the calls for that PE are complete, traversal continues at each child PE. Signed-off-by: Ganesh Goudar ---

[RFC 1/3] powerpc/eeh: Synchronization for safety

2023-06-12 Thread Ganesh Goudar
blocking may be required. Care must be taken when ordering these locks against the PCI rescan/remove lock and the device locks to avoid deadlocking. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/eeh.h | 6 +- arch/powerpc/kernel/eeh.c| 112

[RFC 0/3] Asynchronous EEH recovery

2023-06-12 Thread Ganesh Goudar
, Please comment. Thanks. Ganesh Goudar (3): powerpc/eeh: Synchronization for safety powerpc/eeh: Provide a unique ID for each EEH recovery powerpc/eeh: Asynchronous recovery arch/powerpc/include/asm/eeh.h | 7 +- arch/powerpc/include/asm/eeh_event.h | 10

[PATCH] powerpc/eeh: Set channel state after notifying the drivers

2023-02-09 Thread Ganesh Goudar
ent failure)' To fix the issue, set channel state to permanent failure after notifying the drivers. Fixes: 38ddc011478e ("powerpc/eeh: Make permanently failed devices non-actionable") Suggested-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/eeh_driver.

[PATCH v3] powerpc/mce: log the error for all unrecoverable errors

2023-02-01 Thread Ganesh Goudar
NIP: [1e48] MCE: CPU24: Initiator CPU MCE: CPU24: Unknown RTAS: event: 5, Type: Platform Error (224), Severity: 3 Signed-off-by: Ganesh Goudar Reviewed-by: Mahesh Salgaonkar --- V3: Rephrasing the commit message. --- arch/powerpc/kernel/mce.c | 10 +++--- 1 file changed, 7

[PATCH v2] powerpc/mce: log the error for all unrecoverable errors

2023-01-27 Thread Ganesh Goudar
/Store (foreign/control memory) [Not recovered] MCE: CPU24: PID: 1589811 Comm: inject-ra-err NIP: [1e48] MCE: CPU24: Initiator CPU MCE: CPU24: Unknown RTAS: event: 5, Type: Platform Error (224), Severity: 3 Signed-off-by: Ganesh Goudar Reviewed-by: Mahesh Salgaonkar --- V2

[PATCH] powerpc/mce: log the error for all unrecoverable errors

2022-11-13 Thread Ganesh Goudar
machine_check_log_err() is not called for all unrecoverable errors, so some errors are never logged. Raise irq work in save_mce_event() for unrecoverable errors, so that the error is logged from the MCE event handling block in the timer handler. Signed-off-by: Ganesh Goudar --- arch/powerpc

[PATCH v3] powerpc/pseries/mce: Avoid instrumentation in realmode

2022-09-25 Thread Ganesh Goudar
KASAN instrumentation. Signed-off-by: Ganesh Goudar --- v2: Force inline few more functions. v3: Adding noinstr to few functions instead of __always_inline. --- arch/powerpc/include/asm/hw_irq.h| 8 arch/powerpc/include/asm/interrupt.h | 2 +- arch/powerpc/include/asm/rtas.h | 4

[PATCH v2] powerpc/pseries/mce: Avoid instrumentation in realmode

2022-09-04 Thread Ganesh Goudar
KASAN instrumentation. Signed-off-by: Ganesh Goudar --- v2: Force inline few more functions. --- arch/powerpc/include/asm/hw_irq.h| 8 arch/powerpc/include/asm/interrupt.h | 2 +- arch/powerpc/include/asm/rtas.h | 4 ++-- arch/powerpc/kernel/rtas.c | 4 ++-- 4 files

[PATCH] powerpc/pseries/mce: Avoid instrumentation in realmode

2022-08-29 Thread Ganesh Goudar
KASAN instrumentation. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/interrupt.h | 2 +- arch/powerpc/include/asm/rtas.h | 4 ++-- arch/powerpc/kernel/rtas.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/interrupt.h b

[RFC 3/3] powerpc/eeh: Asynchronous recovery

2022-08-15 Thread Ganesh Goudar
n the constraint, above, the driver handlers are called by traversing the tree of affected PEs from the top, stopping to call handlers (in parallel) when a PE with devices is discovered. When the calls for that PE are complete, traversal continues at each child PE. Signed-off-by: Ganesh Goudar ---

[RFC 1/3] powerpc/eeh: Synchronization for safety

2022-08-15 Thread Ganesh Goudar
blocking may be required. Care must be taken when ordering these locks against the PCI rescan/remove lock and the device locks to avoid deadlocking. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/eeh.h | 6 +- arch/powerpc/kernel/eeh.c| 112

[RFC 0/3] Asynchronous EEH recovery

2022-08-15 Thread Ganesh Goudar
in time taken in EEH recovery, Yet to be tested on powernv. These patches were originally posted as separate RFCs, I think posting them as single series would be more helpful, I know the patches are too big, I will try to logically divide in next iterations. Thanks Ganesh Goudar (3): powerpc

[RFC 2/3] powerpc/eeh: Provide a unique ID for each EEH recovery

2022-08-15 Thread Ganesh Goudar
Based on the original work from Sam Bobroff. Give a unique ID to each recovery event, to ease log parsing and prepare for parallel recovery. Also add some new messages with a very simple format that may be useful to log-parsers. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm

[PATCH v5] powerpc/mce: Avoid using irq_work_queue() in realmode

2022-01-20 Thread Ganesh Goudar
in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar --- V2: * Use arch_irq_work_raise to raise decrementer interrupt. * Avoid having atomic variable. V3: * Fix build error. Reported by kernel test bot

[PATCH v4] powerpc/mce: Avoid using irq_work_queue() in realmode

2022-01-17 Thread Ganesh Goudar
in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar --- V2: * Use arch_irq_work_raise to raise decrementer interrupt. * Avoid having atomic variable. V3: * Fix build error. Reported by kernel test bot

[PATCH v3 RESEND 3/3] powerpc/mce: Modify the real address error logging messages

2022-01-07 Thread Ganesh Goudar
pace. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index fd829f7f25a4..55ccc651d1b0 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.

[PATCH v3 RESEND 1/3] powerpc/pseries: Parse control memory access error

2022-01-07 Thread Ganesh Goudar
Add support to parse and log control memory access error for pseries. These changes are made according to PAPR v2.11 10.3.2.2.12. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 36 1 file changed, 32 insertions(+), 4 deletions(-) diff --git

[PATCH v3 RESEND 2/3] selftests/powerpc: Add test for real address error handling

2022-01-07 Thread Ganesh Goudar
receives SIGBUS. Signed-off-by: Ganesh Goudar --- tools/testing/selftests/powerpc/Makefile | 3 +- tools/testing/selftests/powerpc/mce/Makefile | 7 ++ .../selftests/powerpc/mce/inject-ra-err.c | 65 +++ tools/testing/selftests/powerpc/mce/vas-api.h | 1 + 4 files changed

[PATCH v3 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-24 Thread Ganesh Goudar
in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar --- V2: * Use arch_irq_work_raise to raise decrementer interrupt. * Avoid having atomic variable. V3: * Fix build error. Reported by kernel test bot

[PATCH v3 2/2] pseries/mce: Refactor the pseries mce handling code

2021-11-24 Thread Ganesh Goudar
to enabled. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 122 +++ 1 file changed, 49 insertions(+), 73 deletions(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 8613f9cc5798..62e1519b8355 100644

[PATCH v2 2/2] pseries/mce: Refactor the pseries mce handling code

2021-11-23 Thread Ganesh Goudar
to enabled. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 122 +++ 1 file changed, 49 insertions(+), 73 deletions(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 8613f9cc5798..62e1519b8355 100644

[PATCH v2 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-23 Thread Ganesh Goudar
in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar --- V2: * Use arch_irq_work_raise to raise decrementer interrupt. * Avoid having atomic variable. --- arch/powerpc/include/asm/machdep.h | 2

[PATCH 2/2] pseries/mce: Refactor the pseries mce handling code

2021-11-08 Thread Ganesh Goudar
to enabled. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 122 +++ 1 file changed, 49 insertions(+), 73 deletions(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 8613f9cc5798..62e1519b8355 100644

[PATCH 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-08 Thread Ganesh Goudar
in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/machdep.h | 2 + arch/powerpc/include/asm/mce.h | 2 + arch/powerpc/include/asm/paca.h | 1

[PATCH v2] powerpc/mce: Fix access error in mce handler

2021-09-08 Thread Ganesh Goudar
+0xbc/0xd0 [c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4 Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler") Signed-off-by: Ganesh Goudar --- v2: Change in commit message. --- arch/powerpc/kernel/mce.c | 16 ++-- 1 file ch

[PATCH v3 3/3] powerpc/mce: Modify the real address error logging messages

2021-09-06 Thread Ganesh Goudar
pace. Signed-off-by: Ganesh Goudar --- v3: No changes. v2: No changes. --- arch/powerpc/kernel/mce.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 9d1e39d42e3e..5baf69503349 100644 --- a/arch/powerpc/ker

[PATCH v3 2/3] selftests/powerpc: Add test for real address error handling

2021-09-06 Thread Ganesh Goudar
receives SIGBUS. Signed-off-by: Ganesh Goudar --- v3: Avoid using shell script to inject error. v2: Fix build error. --- tools/testing/selftests/powerpc/Makefile | 3 +- tools/testing/selftests/powerpc/mce/Makefile | 7 ++ .../selftests/powerpc/mce/inject-ra-err.c | 65

[PATCH v3 1/3] powerpc/pseries: Parse control memory access error

2021-09-06 Thread Ganesh Goudar
Add support to parse and log control memory access error for pseries. These changes are made according to PAPR v2.11 10.3.2.2.12. Signed-off-by: Ganesh Goudar --- v3: Modify the commit log to mention the document according to which changes are made. Define and use a macro to check if the

[PATCH] powerpc/mce: Fix access error in mce handler

2021-09-06 Thread Ganesh Goudar
] machine_check_queue_event+0xbc/0xd0 [c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4 Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 16 ++-- 1 file c

[PATCH] powerpc/mce: check if event info is valid

2021-08-06 Thread Ganesh Goudar
countered in L2, as event structure will be empty in L1. "Machine Check Exception, Unknown event version 0". Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm/mce.h | 2 +- arch/powerpc/kernel/mce.c | 7 +-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/ar

[PATCH v2 3/3] powerpc/mce: Modify the real address error logging messages

2021-08-05 Thread Ganesh Goudar
pace. Signed-off-by: Ganesh Goudar --- v2: No changes in this patch. --- arch/powerpc/kernel/mce.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 47a683cd00d2..f3ef480bb739 100644 --- a/arch/powerpc/ker

[PATCH v2 2/3] selftests/powerpc: Add test for real address error handling

2021-08-05 Thread Ganesh Goudar
receives SIGBUS. Signed-off-by: Ganesh Goudar --- v2: Fix build error. --- tools/testing/selftests/powerpc/Makefile | 3 +- tools/testing/selftests/powerpc/mce/Makefile | 6 +++ .../selftests/powerpc/mce/inject-ra-err.c | 42 +++ .../selftests/powerpc/mce/inject-ra-err.sh

[PATCH v2 1/3] powerpc/pseries: Parse control memory access error

2021-08-05 Thread Ganesh Goudar
Add support to parse and log control memory access error for pseries. Signed-off-by: Ganesh Goudar --- v2: No changes in this patch. --- arch/powerpc/platforms/pseries/ras.c | 21 + 1 file changed, 21 insertions(+) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch

[PATCH 1/3] powerpc/pseries: Parse control memory access error

2021-07-30 Thread Ganesh Goudar
Add support to parse and log control memory access error for pseries. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 21 + 1 file changed, 21 insertions(+) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c

[PATCH 2/3] selftests/powerpc: Add test for real address error handling

2021-07-30 Thread Ganesh Goudar
receives SIGBUS. Signed-off-by: Ganesh Goudar --- tools/testing/selftests/powerpc/Makefile | 3 +- tools/testing/selftests/powerpc/mce/Makefile | 6 +++ .../selftests/powerpc/mce/inject-ra-err.c | 42 +++ .../selftests/powerpc/mce/inject-ra-err.sh| 19 + 4 files

[PATCH 3/3] powerpc/mce: Modify the real address error logging messages

2021-07-30 Thread Ganesh Goudar
pace. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 47a683cd00d2..f3ef480bb739 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.

[PATCH] powerpc/pseries/mce: Fix a typo in error type assignment

2021-04-16 Thread Ganesh Goudar
The error type is ICACHE, not DCACHE, for case MCE_ERROR_TYPE_ICACHE. Signed-off-by: Ganesh Goudar --- arch/powerpc/platforms/pseries/ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index

[PATCH] powerpc/mce: save ignore_event flag unconditionally for UE

2021-04-06 Thread Ganesh Goudar
] memcpy+0x88/0x90 [ 512.972456] MCE: CPU1: Initiator CPU [ 512.972534] MCE: CPU1: Unknown Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 11f0cae086ed

[PATCH v5 2/2] powerpc/mce: Remove per cpu variables from MCE handlers

2021-01-28 Thread Ganesh Goudar
on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Signed-off-by: Ganesh Goudar --- v2: Dynamically allocate memory for machine check event info. v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid to

[PATCH v5 1/2] powerpc/mce: Reduce the size of event arrays

2021-01-28 Thread Ganesh Goudar
Maximum recursive depth of MCE is 4. Considering the maximum depth allowed, reduce the size of event to 10 from 100. This saves us ~19kB of memory and has no fatal consequences. Signed-off-by: Ganesh Goudar --- v4: This patch is a fragment of the original patch which is split into two. v5

[PATCH v4 2/2] powerpc/mce: Remove per cpu variables from MCE handlers

2021-01-22 Thread Ganesh Goudar
on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Signed-off-by: Ganesh Goudar --- v2: Dynamically allocate memory for machine check event info v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid to

[PATCH v4 1/2] powerpc/mce: Reduce the size of event arrays

2021-01-22 Thread Ganesh Goudar
Maximum recursive depth of MCE is 4. Considering the maximum depth allowed, reduce the size of event to 10 from 100. This saves us ~19kB of memory and has no fatal consequences. Signed-off-by: Ganesh Goudar --- v4: This patch is a fragment of the original patch which is split into two

[PATCH v3] powerpc/mce: Remove per cpu variables from MCE handlers

2021-01-15 Thread Ganesh Goudar
on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Maximum recursive depth of MCE is 4, Considering the maximum depth allowed reduce the size of event to 10 from 100. Signed-off-by: Ganesh Goudar --- v2: Dynamically

[PATCH v2] powerpc/mce: Remove per cpu variables from MCE handlers

2021-01-07 Thread Ganesh Goudar
on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Maximum recursive depth of MCE is 4, Considering the maximum depth allowed reduce the size of event to 10 from 100. Signed-off-by: Ganesh Goudar --- v2: Dynamically

[PATCH] powerpc/mce: Remove per cpu variables from MCE handlers

2020-12-04 Thread Ganesh Goudar
on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Maximum recursive depth of MCE is 4, Considering the maximum depth allowed reduce the size of event to 10 from 100. Signed-off-by: Ganesh Goudar --- arch/powerpc/include

[PATCH v5] lkdtm/powerpc: Add SLB multihit test

2020-11-30 Thread Ganesh Goudar
To check machine check handling, add support to inject slb multihit errors. Cc: Kees Cook Cc: Michal Suchánek Co-developed-by: Mahesh Salgaonkar Signed-off-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- v5: - Insert entries at SLB_NUM_BOLTED and SLB_NUM_BOLTED +1, remove index

[PATCH v4 2/2] lkdtm/powerpc: Add SLB multihit test

2020-10-08 Thread Ganesh Goudar
To check machine check handling, add support to inject slb multihit errors. Cc: Kees Cook Reviewed-by: Michal Suchánek Co-developed-by: Mahesh Salgaonkar Signed-off-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- drivers/misc/lkdtm/Makefile | 1 + drivers/misc/lkdtm

[PATCH v4 1/2] powerpc/mce: remove nmi_enter/exit from real mode handler

2020-10-08 Thread Ganesh Goudar
on pseries machine running in hash mmu mode. Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/mc

[PATCH v4 0/2] powerpc/mce: Fix mce handler and add selftest

2020-10-08 Thread Ganesh Goudar
nesting is supported. * Fix build errors and remove unused variables. * Integrate error injection code into LKDTM. * Add support to inject multihit in paca. Ganesh Goudar (2): powerpc/mce: remove nmi_enter/exit from real mode handler lkdtm/powerpc: Add SLB multihit test arch/powerpc/kernel

[PATCH v3 2/2] lkdtm/powerpc: Add SLB multihit test

2020-10-01 Thread Ganesh Goudar
To check machine check handling, add support to inject slb multihit errors. Reviewed-by: Michal Suchánek Co-developed-by: Mahesh Salgaonkar Signed-off-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- drivers/misc/lkdtm/Makefile | 1 + drivers/misc/lkdtm/core.c

[PATCH v3 1/2] powerpc/mce: remove nmi_enter/exit from real mode handler

2020-10-01 Thread Ganesh Goudar
on pseries machine running in hash mmu mode. Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kern

[PATCH v3 0/2] powerpc/mce: Fix mce handler and add selftest

2020-10-01 Thread Ganesh Goudar
support to inject multihit in paca. Ganesh Goudar (2): powerpc/mce: remove nmi_enter/exit from real mode handler lkdtm/powerpc: Add SLB multihit test arch/powerpc/kernel/mce.c | 10 +- drivers/misc/lkdtm/Makefile | 1 + drivers/misc/lkdtm/core.c | 3

[PATCH v2 0/3] powerpc/mce: Fix mce handler and add selftest

2020-09-25 Thread Ganesh Goudar
. * Fix build errors and remove unused variables. * Integrate error injection code into LKDTM. * Add support to inject multihit in paca. Ganesh Goudar (3): powerpc/mce: remove nmi_enter/exit from real mode handler lkdtm/powerpc: Add SLB multihit test selftests/lkdtm: Enable selftest for SLB

[PATCH v2 1/3] powerpc/mce: remove nmi_enter/exit from real mode handler

2020-09-25 Thread Ganesh Goudar
on pseries machine running in hash mmu mode. Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kern

[PATCH v2 2/3] lkdtm/powerpc: Add SLB multihit test

2020-09-25 Thread Ganesh Goudar
Add support to inject slb multihit errors, to test machine check handling. Based on work by Mahesh Salgaonkar and Michal Suchánek. Cc: Mahesh Salgaonkar Cc: Michal Suchánek Signed-off-by: Ganesh Goudar --- drivers/misc/lkdtm/Makefile | 4 ++ drivers/misc/lkdtm/core.c| 3 + drivers

[PATCH v2 3/3] selftests/lkdtm: Enable selftest for SLB multihit

2020-09-25 Thread Ganesh Goudar
Add PPC_SLB_MULTIHIT to lkdtm selftest framework. Signed-off-by: Ganesh Goudar --- tools/testing/selftests/lkdtm/tests.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt index 9d266e79c6a2..7eb3cf91c89e

[PATCH 2/3] powerpc/mce: Add debugfs interface to inject MCE

2020-09-16 Thread Ganesh Goudar
To test machine check handling, add debugfs interface to inject slb multihit errors. To inject slb multihit: #echo 1 > /sys/kernel/debug/powerpc/mce_error_inject/inject_slb_multihit Signed-off-by: Ganesh Goudar Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/Kconfig.debug |

[PATCH 0/3] powerpc/mce: Fix mce handler and add selftest

2020-09-16 Thread Ganesh Goudar
possible. Ganesh Goudar (3): powerpc/mce: remove nmi_enter/exit from real mode handler powerpc/mce: Add debugfs interface to inject MCE selftest/powerpc: Add slb multihit selftest arch/powerpc/Kconfig.debug| 9 ++ arch/powerpc/kernel/mce.c | 7

[PATCH 3/3] selftest/powerpc: Add slb multihit selftest

2020-09-16 Thread Ganesh Goudar
Add selftest to check if the system recovers from slb multihit errors. Signed-off-by: Ganesh Goudar --- tools/testing/selftests/powerpc/Makefile | 3 ++- tools/testing/selftests/powerpc/mces/Makefile| 6 ++ tools/testing/selftests/powerpc/mces/slb_multihit.sh | 9

[PATCH 1/3] powerpc/mce: remove nmi_enter/exit from real mode handler

2020-09-16 Thread Ganesh Goudar
on pseries machine running in hash mmu mode. Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/mc

[PATCH v4] powerpc/pseries: Avoid using addr_to_pfn in real mode

2020-07-23 Thread Ganesh Goudar
vert to use common event code") Signed-off-by: Ganesh Goudar --- V2: Leave bare metal code and save_mce_event as is. V3: Have separate functions for realmode and virtual mode handling. V4: Fix build warning, rephrase commit message. --- arch/powerpc/platforms/pse

[PATCH v3] powerpc/pseries: Avoid using addr_to_pfn in realmode

2020-07-20 Thread Ganesh Goudar
] 79291f24 790af00e 78e70020 7d095214 <7c69502a> 2fa3 419e011c 70690040 [ 485.128152] ---[ end trace d34b27e29ae0e340 ]--- Signed-off-by: Ganesh Goudar --- V2: Leave bare metal code and save_mce_event as is. V3: Have separate functions for realmode and virtual mode handling. --- arch/p

[PATCH v2] powerpc/pseries: Avoid using addr_to_pfn in realmode

2020-07-09 Thread Ganesh Goudar
fatal as it may try to access memory outside RMO region. To fix this use addr_to_pfn after switching to virtual mode. Signed-off-by: Ganesh Goudar --- V2: Leave bare metal code and save_mce_event as is. --- arch/powerpc/platforms/pseries/ras.c | 20 +++- 1 file changed, 11

[PATCH] powerpc/mce: Avoid using addr_to_pfn in realmode

2020-06-20 Thread Ganesh Goudar
fatal as it may try to access memory outside RMO region. To fix this move the use of addr_to_pfn to save_mce_event(), which runs in virtual mode. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce.c| 7 + arch/powerpc/kernel/mce_power.c | 39

[PATCH 2/2] powerpc/mce: Do not poison the memory using guest effective addr

2020-04-26 Thread Ganesh Goudar
thereby avoid poisoning the memory in host. Reviewed-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce_power.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c index

[PATCH 1/2] powerpc/mce: Add helper functions to remove duplicate code

2020-04-26 Thread Ganesh Goudar
mce_handle_ierror() and mce_handle_derror() have some duplicate code to recover from the recoverable MCE errors and to get the MCE error sub-type while generating MCE error info. Add helper functions to remove it. Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/mce_power.c | 136

[PATCH] powerpc/mce: Add MCE notification chain

2020-03-30 Thread Ganesh Goudar
From: Santosh S Introduce a notification chain that lets subscribers know about uncorrected memory errors (UE). This would help prospective users in the pmem or nvdimm subsystem to track bad blocks for better handling of persistent memory allocations. Signed-off-by: Santosh S Signed-off-by: Ganesh Goudar

[PATCH v4] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-03-26 Thread Ganesh Goudar
t for memcpy_mcsafe") Reviewed-by: Mahesh Salgaonkar Reviewed-by: Santosh S Signed-off-by: Ganesh Goudar --- V2: Fixes a trivial checkpatch error in commit msg. V3: Use proper subject prefix. V4: Rephrase the commit message. Define a common function to update nip with fixup address.

[PATCH v3] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-03-22 Thread Ganesh Goudar
eviewed-by: Santosh S Signed-off-by: Ganesh Goudar --- V2: Fixes a trivial checkpatch error in commit msg. V3: Use proper subject prefix. --- arch/powerpc/platforms/pseries/ras.c | 8 1 file changed, 8 insertions(+) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platfor

[ltcras] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-03-20 Thread Ganesh Goudar
eviewed-by: Santosh S Signed-off-by: Ganesh Goudar --- V2: Fixes a trivial checkpatch error in commit msg --- arch/powerpc/platforms/pseries/ras.c | 8 1 file changed, 8 insertions(+) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 5d

[PATCH v2] powerpc/pseries: Fix MCE handling on pseries

2020-03-20 Thread Ganesh Goudar
48 2f8b0063 380b0001 ---[ end trace 46fd63f36bbdd940 ]--- Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common event code") Reviewed-by: Mahesh Salgaonkar Reviewed-by: Nicholas Piggin Signed-off-by: Ganesh Goudar --- v2: Avoid asm code to switch to virtual mo

[PATCH] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-03-13 Thread Ganesh Goudar
If we hit UE at an instruction with a fixup entry, flag to ignore the event and set nip to continue execution at the fixup entry. For powernv these changes are already made by commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe") Signed-off-by: Ganesh Goudar --- ar
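The fixup handling described in this snippet can be modeled in userspace C as a table lookup followed by a redirect. This is a sketch under stated assumptions: `struct fixup_entry`, `struct regs_model`, `struct mce_err_model`, `find_fixup()` and `handle_ue()` are hypothetical names, and the linear search stands in for whatever exception-table lookup the kernel actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical exception-table entry: faulting instruction -> fixup address. */
struct fixup_entry { unsigned long insn; unsigned long fixup; };

/* Hypothetical subset of the saved register state and MCE error record. */
struct regs_model { unsigned long nip; };
struct mce_err_model { bool ignore_event; };

/* Linear search for an entry whose instruction address matches nip. */
static const struct fixup_entry *find_fixup(const struct fixup_entry *tbl,
					    size_t n, unsigned long nip)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == nip)
			return &tbl[i];
	return NULL;
}

/*
 * If the UE hit an instruction that has a fixup entry, mark the event as
 * ignorable and resume at the fixup address. Returns true when redirected.
 */
static bool handle_ue(struct regs_model *regs, struct mce_err_model *err,
		      const struct fixup_entry *tbl, size_t n)
{
	const struct fixup_entry *e;

	assert(regs != NULL && err != NULL);

	e = find_fixup(tbl, n, regs->nip);
	if (!e)
		return false;

	err->ignore_event = true;
	regs->nip = e->fixup;
	return true;
}
```

When no entry matches, the register state and the error record are left untouched, so the caller can fall back to its normal handling of the error.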

[PATCH] powerpc/pseries: Fix MCE handling on pseries

2020-03-13 Thread Ganesh Goudar
48 2f8b0063 380b0001 ---[ end trace 46fd63f36bbdd940 ]--- Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common event code") Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/exceptions-64s.S | 12 arch/powerpc/platforms/pseries/pseries.h

[PATCH v2] powerpc: dump kernel log before carrying out fadump or kdump

2019-09-04 Thread Ganesh Goudar
wed-by: Mahesh Salgaonkar Reviewed-by: Nicholas Piggin Signed-off-by: Ganesh Goudar --- V2: Rephrasing the commit message --- arch/powerpc/kernel/traps.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 11caa0291254..82f43535e686

[PATCH] powerpc: dump kernel log before carrying out fadump or kdump

2019-08-21 Thread Ganesh Goudar
vram, call kmsg_dump() before carrying out fadump or kdump. Fixes: 4388c9b3a6ee ("powerpc: Do not send system reset request through the oops path") Reviewed-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- arch/powerpc/kernel/traps.c | 1 + 1 file changed, 1 insertion(+) diff

[PATCH] powerpc/pseries: hwpoison the pages upon hitting UE

2019-04-15 Thread Ganesh Goudar
Add support to hwpoison the pages upon hitting machine check exception. This patch queues the address where UE is hit to percpu array and schedules work to plumb it into memory poison infrastructure. Reviewed-by: Mahesh Salgaonkar Signed-off-by: Ganesh Goudar --- arch/powerpc/include/asm