Re: [PATCH v5 3/3] powerpc/eeh: Use result of error_detected() in uevent

2025-08-07 Thread Sathyanarayanan Kuppuswamy
nel_io_frozen); edev->in_error = true; - pci_uevent_ers(pdev, PCI_ERS_RESULT_NONE); + pci_uevent_ers(pdev, rc); return rc; } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH] PCI/AER: Check for NULL aer_info before ratelimiting in pci_print_aer()

2025-08-04 Thread Sathyanarayanan Kuppuswamy
On 8/4/25 8:35 AM, Breno Leitao wrote: Hello Sathyanarayanan, On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote: On 8/4/25 2:17 AM, Breno Leitao wrote: Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called when dev->aer_info is NULL. Add a NULL ch

Re: [PATCH] PCI/AER: Check for NULL aer_info before ratelimiting in pci_print_aer()

2025-08-04 Thread Sathyanarayanan Kuppuswamy
dev->aer_info) + return 1; + switch (severity) { case AER_NONFATAL: return __ratelimit(&dev->aer_info->nonfatal_ratelimit); --- base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8 change-id: 20250801-aer_crash_2-b21cc2ef0d00 Best regards, -- Breno Le
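
The guard quoted above can be illustrated in userspace. This is only a sketch under stated assumptions: the struct names, the `rl_check()` helper and the `correctable_ratelimit` field are hypothetical stand-ins, and only the `aer_info` NULL check and the `nonfatal_ratelimit` case come from the quoted diff. It shows that a device without `aer_info` is never suppressed and that the pointer is never dereferenced in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the shape matters. */
enum sev_sketch { SEV_CORRECTABLE, SEV_NONFATAL, SEV_FATAL };

struct rl_state_sketch {
	int allow;			/* 1 = print, 0 = suppress */
};

struct aer_info_sketch {
	struct rl_state_sketch correctable_ratelimit;	/* assumed field name */
	struct rl_state_sketch nonfatal_ratelimit;
};

struct dev_sketch {
	struct aer_info_sketch *aer_info;	/* may legitimately be NULL */
};

/* Stand-in for __ratelimit(): reports whether printing is allowed. */
static int rl_check(const struct rl_state_sketch *rs)
{
	return rs->allow;
}

/* Returns 1 if the message should be printed, 0 if it is suppressed. */
static int aer_ratelimit_sketch(const struct dev_sketch *dev,
				enum sev_sketch severity)
{
	assert(dev != NULL);

	/* No per-device state: nothing to ratelimit against, so print. */
	if (!dev->aer_info)
		return 1;

	switch (severity) {
	case SEV_NONFATAL:
		return rl_check(&dev->aer_info->nonfatal_ratelimit);
	case SEV_CORRECTABLE:
		return rl_check(&dev->aer_info->correctable_ratelimit);
	default:
		return 1;	/* fatal errors are not ratelimited here */
	}
}
```

Returning 1 for the NULL case keeps the message visible rather than silently dropping it, which matches the `return 1` in the quoted hunk.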

Re: [PATCH v3 1/2] PCI/AER: Fix missing uevent on recovery when a reset is requested

2025-08-01 Thread Sathyanarayanan Kuppuswamy
Hi Lukas, On 7/31/25 10:44 PM, Lukas Wunner wrote: On Thu, Jul 31, 2025 at 10:04:38AM -0700, Sathyanarayanan Kuppuswamy wrote: On 7/31/25 6:01 AM, Lukas Wunner wrote: +++ b/drivers/pci/pcie/err.c @@ -165,6 +165,12 @@ static int report_resume(struct pci_dev *dev, void *data) return 0

Re: [PATCH v3 1/2] PCI/AER: Fix missing uevent on recovery when a reset is requested

2025-07-31 Thread Sathyanarayanan Kuppuswamy
h may be a Port, an RCEC, or an RCiEP @@ -272,7 +278,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, failed: pci_walk_bridge(bridge, pci_pm_runtime_put, NULL); - pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT); + pci_walk_bridge(bridge, report_disconnect, NULL);

Re: [PATCH 4/4 v3] ACPI: extlog: Trace CPER CXL Protocol Error Section

2025-06-03 Thread Sathyanarayanan Kuppuswamy
t cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data } #endif +void cxl_cper_ras_handle_prot_err(struct cxl_cper_prot_err_work_data *wd); + #endif /* _LINUX_CXL_EVENT_H */ -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH 3/4 v3] ACPI: extlog: Trace CPER PCI Express Error Section

2025-06-03 Thread Sathyanarayanan Kuppuswamy
struct aer_capability_regs *aer); int cper_severity_to_aer(int cper_severity); void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, int severity, struct aer_capability_regs *aer_regs); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH 2/4 v3] PCI/AER: Modify pci_print_aer() to take log level

2025-06-03 Thread Sathyanarayanan Kuppuswamy
ecover_queue(int domain, unsigned int bus, unsigned int devfn, int severity, struct aer_capability_regs *aer_regs); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v8 16/20] PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index

2025-05-22 Thread Sathyanarayanan Kuppuswamy
status); if (dpc_get_aer_uncorrect_severity(pdev, &info) && - aer_get_device_error_info(pdev, &info)) { - aer_print_error(pdev, &info); + aer_get_device_error_info(&info, 0)) { + aer_print_error(&info,

Re: [PATCH v8 17/20] PCI/AER: Simplify add_error_device()

2025-05-22 Thread Sathyanarayanan Kuppuswamy
} - return -ENOSPC; + int i = e_info->error_dev_num; + + if (i >= AER_MAX_MULTI_ERR_DEVICES) + return -ENOSPC; + + e_info->dev[i] = pci_dev_get(dev); + e_info->error_dev_num++; + + return 0; } /** -- Sathyanarayanan Kuppuswamy Linux Kernel Developer
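
The simplified control flow in the hunk above (check the bound first, then append) can be tried outside the kernel. The sketch below is an assumption-laden stand-in: `MAX_ERR_DEVICES_SKETCH` is a hypothetical cap rather than the kernel's `AER_MAX_MULTI_ERR_DEVICES` value, the device is an opaque pointer, and the reference counting done by `pci_dev_get()` is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ERR_DEVICES_SKETCH 5	/* hypothetical cap for illustration */

struct err_info_sketch {
	int error_dev_num;				/* entries in use */
	const void *dev[MAX_ERR_DEVICES_SKETCH];	/* opaque devices */
};

/* Append dev to the list; returns 0 on success, -ENOSPC when full. */
static int add_error_device_sketch(struct err_info_sketch *e_info,
				   const void *dev)
{
	int i = e_info->error_dev_num;

	assert(i >= 0);

	if (i >= MAX_ERR_DEVICES_SKETCH)
		return -ENOSPC;

	e_info->dev[i] = dev;	/* the kernel version takes a reference here */
	e_info->error_dev_num++;

	return 0;
}
```

Putting the failure check first leaves the success path unindented and makes the single early return the only error exit.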

Re: [PATCH v8 18/20] PCI/AER: Ratelimit correctable and non-fatal error logging

2025-05-22 Thread Sathyanarayanan Kuppuswamy
n error logged in its AER +* Capability. +* +* If we didn't find the Error Source device, at least log the +* Requester ID from the ERR_* Message received by the Root Port or +* RCEC, ratelimited by the RP or RCEC. +*/ + if (info->root_ratelimit_print || + (!found && aer_ratelimit(root, info->severity))) + aer_print_source(root, info, found); + if (found) aer_process_err_devices(info); } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v8 20/20] PCI/AER: Add sysfs attributes for log ratelimits

2025-05-22 Thread Sathyanarayanan Kuppuswamy
+ if (!pdev->aer_info) + return 0; + + return a->mode; +} + +const struct attribute_group aer_attr_group = { + .name = "aer", + .attrs = aer_attrs, + .is_visible = aer_attrs_are_visible, +}; + static void pci_dev_aer_stats_incr(struct pci_dev *pdev, struct aer_err_info *info) { -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v8 13/20] PCI/ERR: Add printk level to pcie_print_tlp_log()

2025-05-22 Thread Sathyanarayanan Kuppuswamy
@@ -130,6 +132,6 @@ void pcie_print_tlp_log(const struct pci_dev *dev, } } - pci_err(dev, "%sTLP Header%s: %s\n", pfx, + dev_printk(level, &dev->dev, "%sTLP Header%s: %s\n", pfx, log->flit ? " (Flit)" : "", buf); } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH 0/4] pci: implement "pci=aer_panic"

2025-05-21 Thread Sathyanarayanan Kuppuswamy
On 5/21/25 7:54 AM, Hans Zhang wrote: On 2025/5/21 00:09, Sathyanarayanan Kuppuswamy wrote: On 5/19/25 7:41 AM, Hans Zhang wrote: On 2025/5/19 22:21, Hans Zhang wrote: On 2025/5/17 02:10, Sathyanarayanan Kuppuswamy wrote: On 5/16/25 9:55 AM, Hans Zhang wrote: The following series

Re: [PATCH v7 15/17] PCI/AER: Ratelimit correctable and non-fatal error logging

2025-05-20 Thread Sathyanarayanan Kuppuswamy
v *pdev) status); if (dpc_get_aer_uncorrect_severity(pdev, &info) && aer_get_device_error_info(pdev, &info)) { + info.ratelimit = 1; /* ERR_FATAL; no ratelimit */ aer_print_error(pdev, &info); pci_aer_clear_nonfatal_status(pdev); pci_aer_clear_fatal_status(pdev); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v7 17/17] PCI/AER: Add sysfs attributes for log ratelimits

2025-05-20 Thread Sathyanarayanan Kuppuswamy
(dev); + + if (!pdev->aer_info) + return 0; + + return a->mode; +} + +const struct attribute_group aer_attr_group = { + .name = "aer", + .attrs = aer_attrs, + .is_visible = aer_attrs_are_visible, +}; + static void pci_dev_aer_stats_incr(struct pci_dev *pdev, struct aer_err_info *info) { -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v7 09/17] PCI/AER: Simplify pci_print_aer()

2025-05-20 Thread Sathyanarayanan Kuppuswamy
& ~mask), + trace_aer_event(pci_name(dev), (status & ~mask), aer_severity, tlp_header_valid, &aer->header_log); } EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL"); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v7 04/17] PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()

2025-05-20 Thread Sathyanarayanan Kuppuswamy

Re: [PATCH v7 03/17] PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()

2025-05-20 Thread Sathyanarayanan Kuppuswamy
er_isr_one_error(rpc->rpd, &e_src); return IRQ_HANDLED; } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation

2025-05-20 Thread Sathyanarayanan Kuppuswamy
On 5/20/25 12:48 PM, Bjorn Helgaas wrote: On Mon, May 19, 2025 at 10:01:09PM -0700, Sathyanarayanan Kuppuswamy wrote: On 5/19/25 2:35 PM, Bjorn Helgaas wrote: From: Jon Pan-Doh Add ratelimits section for rationale and defaults. +AER Ratelimits +-- + +Since error messages can be

Re: [PATCH v6 14/16] PCI/AER: Introduce ratelimit for error logs

2025-05-20 Thread Sathyanarayanan Kuppuswamy
On 5/20/25 11:31 AM, Bjorn Helgaas wrote: On Mon, May 19, 2025 at 09:59:29PM -0700, Sathyanarayanan Kuppuswamy wrote: On 5/19/25 2:35 PM, Bjorn Helgaas wrote: From: Jon Pan-Doh Spammy devices can flood kernel logs with AER errors and slow/stall execution. Add per-device ratelimits for AER

Re: [PATCH 0/4] pci: implement "pci=aer_panic"

2025-05-20 Thread Sathyanarayanan Kuppuswamy
On 5/19/25 7:41 AM, Hans Zhang wrote: On 2025/5/19 22:21, Hans Zhang wrote: On 2025/5/17 02:10, Sathyanarayanan Kuppuswamy wrote: On 5/16/25 9:55 AM, Hans Zhang wrote: The following series introduces a new kernel command-line option aer_panic to enhance error handling for PCIe Advanced

Re: [PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation

2025-05-19 Thread Sathyanarayanan Kuppuswamy
(10 events) over +DEFAULT_RATELIMIT_INTERVAL (5 seconds). + AER Statistics / Counters - -- Sathyanarayanan Kuppuswamy Linux Kernel Developer
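
To make the "10 events over 5 seconds" default concrete, here is a minimal fixed-window limiter. It is a simplified sketch, not the kernel's ratelimit implementation: the struct, the function name and the use of plain seconds for time are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical fixed-window limiter: at most `burst` events per window. */
struct window_rl_sketch {
	long interval;	/* window length, in seconds for this sketch */
	int burst;	/* events allowed per window */
	long begin;	/* start time of the current window */
	int printed;	/* events already allowed in this window */
	int started;	/* 0 until the first event opens a window */
};

/* Returns 1 if an event at time `now` may be logged, 0 if suppressed. */
static int window_rl_allow(struct window_rl_sketch *rl, long now)
{
	assert(rl->burst > 0 && rl->interval > 0);

	/* Open a fresh window on first use or once the old one has expired. */
	if (!rl->started || now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->started = 1;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return 1;
	}

	return 0;
}
```

With `interval = 5` and `burst = 10`, the first ten events in a window are logged, further events in that window are dropped, and logging resumes once five seconds have passed since the window opened.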

Re: [PATCH v6 16/16] PCI/AER: Add sysfs attributes for log ratelimits

2025-05-19 Thread Sathyanarayanan Kuppuswamy
de; +} + +const struct attribute_group aer_attr_group = { + .name = "aer", + .attrs = aer_attrs, + .is_visible = aer_attrs_are_visible, +}; + static void pci_dev_aer_stats_incr(struct pci_dev *pdev, struct aer_err_info *info) { -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 14/16] PCI/AER: Introduce ratelimit for error logs

2025-05-19 Thread Sathyanarayanan Kuppuswamy
mit = 1; /* no ratelimiting */ aer_print_error(pdev, &info); pci_aer_clear_nonfatal_status(pdev); pci_aer_clear_fatal_status(pdev); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 13/16] PCI/AER: Rename struct aer_stats to aer_report

2025-05-19 Thread Sathyanarayanan Kuppuswamy
_cap; /* AER capability offset */ - struct aer_stats *aer_stats; /* AER stats for this device */ + struct aer_report *aer_report; /* AER report for this device */ #endif #ifdef CONFIG_PCIEPORTBUS struct rcec_ea *rcec_ea; /* RCEC cached endpoint association */ -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 12/16] PCI/AER: Make all pci_print_aer() log levels depend on error type

2025-05-19 Thread Sathyanarayanan Kuppuswamy
alid is set, and info.level is always KERN_ERR in +* that case. +*/ if (tlp_header_valid) pcie_print_tlp_log(dev, &aer->header_log, dev_fmt(" ")); } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 11/16] PCI/AER: Check log level once and remember it

2025-05-19 Thread Sathyanarayanan Kuppuswamy
t_severity(struct pci_dev *dev, else info->severity = AER_NONFATAL; + info->level = KERN_WARNING; As Weinan pointed out, it should be KERN_ERR. return 1; } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 09/16] PCI/AER: Update statistics early in logging

2025-05-19 Thread Sathyanarayanan Kuppuswamy
ing[info->severity]); @@ -782,6 +783,8 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity, info.status = status; info.mask = mask; + pci_dev_aer_stats_incr(dev, &info); + layer = AER_GET_LAYER_ERROR(aer_severity, status); agent = AER_GET_AGENT(aer_severi

Re: [PATCH v6 10/16] PCI/AER: Combine trace_aer_event() with statistics updates

2025-05-19 Thread Sathyanarayanan Kuppuswamy
pcie_print_tlp_log(dev, &aer->header_log, dev_fmt(" ")); - - trace_aer_event(pci_name(dev), (status & ~mask), - aer_severity, tlp_header_valid, &aer->header_log); } EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL"); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 08/16] PCI/AER: Simplify pci_print_aer()

2025-05-19 Thread Sathyanarayanan Kuppuswamy
er_log, dev_fmt(" ")); - trace_aer_event(dev_name(&dev->dev), (status & ~mask), + trace_aer_event(pci_name(dev), (status & ~mask), aer_severity, tlp_header_valid, &aer->header_log); } EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL"); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 07/16] PCI/AER: Initialize aer_err_info before using it

2025-05-19 Thread Sathyanarayanan Kuppuswamy
.id = ERR_UNCOR_ID(e_src->id), + .severity = fatal ? AER_FATAL : AER_NONFATAL, + .multi_error_valid = multi ? 1 : 0, + }; if (find_source_device(pdev, &e_info)) { aer_print_source(pdev, &e_info, ""); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 06/16] PCI/AER: Move aer_print_source() earlier in file

2025-05-19 Thread Sathyanarayanan Kuppuswamy
or_valid ? "Multiple " : "", -aer_error_severity_string[info->severity], -pci_domain_nr(dev->bus), PCI_BUS_NUM(source), -PCI_SLOT(source), PCI_FUNC(source), details); -} - #ifdef CONFIG_ACPI_APEI_PCIEAER int cper_severity_to_aer(int cper_severity) { -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 05/16] PCI/AER: Rename aer_print_port_info() to aer_print_source()

2025-05-19 Thread Sathyanarayanan Kuppuswamy
aer_print_source(pdev, &e_info, ""); aer_process_err_devices(&e_info); } } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 04/16] PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc

2025-05-19 Thread Sathyanarayanan Kuppuswamy
ource), PCI_FUNC(source), details); } #ifdef CONFIG_ACPI_APEI_PCIEAER -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 03/16] PCI/AER: Consolidate Error Source ID logging in aer_print_port_info()

2025-05-19 Thread Sathyanarayanan Kuppuswamy
tus & PCI_ERR_ROOT_UNCOR_RCV) { @@ -1316,10 +1318,10 @@ static void aer_isr_one_error(struct aer_rpc *rpc, else e_info.multi_error_valid = 0; - aer_print_port_info(pdev, &e_info); - - if (find_source_device(pdev, &e_info)) +

Re: [PATCH v6 02/16] PCI/DPC: Log Error Source ID only when valid

2025-05-19 Thread Sathyanarayanan Kuppuswamy
t_reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_SW_TRIGGER) ? + "software trigger" : +"reserved error"); + break; + } /* show RP PIO error detail information */ if (pdev->dpc_rp_extensions && -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v6 01/16] PCI/DPC: Initialize aer_err_info before using it

2025-05-19 Thread Sathyanarayanan Kuppuswamy
struct aer_err_info info; + struct aer_err_info info = { 0 }; pci_read_config_word(pdev, cap + PCI_EXP_DPC_STATUS, &status); pci_read_config_word(pdev, cap + PCI_EXP_DPC_SOURCE_ID, &source); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH 3/4] PCI/AER: Expose AER panic state via pci_aer_panic_enabled()

2025-05-16 Thread Sathyanarayanan Kuppuswamy
I bridge quirks currently, right? If yes, just list what is currently supported. + */ +bool pci_aer_panic_enabled(void) +{ + return pcie_aer_panic; +} +EXPORT_SYMBOL(pci_aer_panic_enabled); + bool pci_aer_available(void) { return !pcie_aer_disable && pci_msi_enabled(

Re: [PATCH 0/4] pci: implement "pci=aer_panic"

2025-05-16 Thread Sathyanarayanan Kuppuswamy
ce2d0116e6 prerequisite-patch-id: 482ad0609459a7654a4100cdc9f9aa4b671be50b -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v4 2/5] PCI/ERR: Add support for resetting the slots in a platform specific way

2025-05-14 Thread Sathyanarayanan Kuppuswamy
ned int ignore_reset_delay:1; /* For entire hierarchy */ unsigned int no_ext_tags:1; /* No Extended Tags */ -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v4 1/5] PCI/ERR: Remove misleading TODO regarding kernel panic

2025-05-14 Thread Sathyanarayanan Kuppuswamy
+271,6 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT); - /* TODO: Should kernel panic here? */ pci_info(bridge, "device recovery failed\n"); return status; -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v4 3/3] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd

2025-03-02 Thread Sathyanarayanan Kuppuswamy
pcie_clear_device_status(dev); pci_aer_clear_nonfatal_status(dev); + pci_aer_clear_fatal_status(dev); Add some info about above change in the commit log. } pci_walk_bridge(bridge, pci_pm_runtime_put, NULL); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v4 2/3] PCI/DPC: Run recovery on device that detected the error

2025-03-02 Thread Sathyanarayanan Kuppuswamy
zen, dpc_reset_link); send_ost: @@ -216,6 +216,7 @@ static void edr_handle_event(acpi_handle handle, u32 event, void *data) } pci_dev_put(err_port); + pci_dev_put(err_dev); } void pci_acpi_add_edr_notifier(struct pci_dev *pdev) -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3 3/4] PCI/DPC: Run recovery on device that detected the error

2025-02-11 Thread Sathyanarayanan Kuppuswamy
7 @@ static void edr_handle_event(acpi_handle handle, u32 event, void *data) } pci_dev_put(err_port); + pci_dev_put(err_dev); } void pci_acpi_add_edr_notifier(struct pci_dev *pdev) -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3 1/4] PCI/DPC: Rename pdev to err_port for dpc_handler

2025-02-11 Thread Sathyanarayanan Kuppuswamy
ozen, dpc_reset_link); return IRQ_HANDLED; } -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 2/2] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd

2025-01-23 Thread Sathyanarayanan Kuppuswamy
On 1/23/25 5:45 PM, Shuai Xue wrote: On 2025/1/24 04:10, Sathyanarayanan Kuppuswamy wrote: Hi, On 11/12/24 5:54 AM, Shuai Xue wrote: The AER driver has historically avoided reading the configuration space of an endpoint or RCiEP that reported a fatal error, considering the link to that

Re: [PATCH v2 2/2] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd

2025-01-23 Thread Sathyanarayanan Kuppuswamy
); } @@ -259,6 +267,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, if (host->native_aer || pcie_ports_native) { pcie_clear_device_status(dev); pci_aer_clear_nonfatal_status(dev); + pci_aer_clear_fatal_status(dev); I think we clear fatal status in DPC driver, why do it again? } pci_walk_bridge(bridge, pci_pm_runtime_put, NULL); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 1/2] PCI/DPC: Run recovery on device that detected the error

2025-01-22 Thread Sathyanarayanan Kuppuswamy
} else { - pci_dbg(edev, "DPC port recovery failed\n"); - acpi_send_edr_status(pdev, edev, EDR_OST_FAILED); + pci_dbg(err_port, "DPC port recovery failed\n"); + acpi_send_edr_status(pdev, err_port, EDR_OST_FAILED); } - pci_dev_put(edev); + pci_dev_put(err_port); + pci_dev_put(err_dev); } void pci_acpi_add_edr_notifier(struct pci_dev *pdev) -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH 0/2] PCI/AER: Remove/unexport error reporting enable/disable

2023-07-10 Thread Sathyanarayanan Kuppuswamy
> Bjorn Helgaas (2): > PCI/AER: Drop unused pci_disable_pcie_error_reporting() > PCI/AER: Unexport pci_enable_pcie_error_reporting() > > drivers/pci/pcie/aer.c | 15 +-- > include/linux/aer.h | 11 --- > 2 files changed, 1 insertion(+), 25 deletions(-)

Re: [PATCH v5 2/3] PCI/AER: Disable AER interrupt on suspend

2023-05-11 Thread Sathyanarayanan Kuppuswamy
reset Root Port hierarchy, RCEC, or RCiEP > * @dev: pointer to Root Port, RCEC, or RCiEP > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = { > .service= PCIE_PORT_SERVICE_AER, > > .probe = aer_probe, > + .suspend= aer_suspend, > + .resume = aer_resume, > .remove = aer_remove, > }; > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

2023-04-24 Thread Sathyanarayanan Kuppuswamy
Hi, On 4/24/23 10:55 PM, Kai-Heng Feng wrote: > On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy > wrote: >> >> >> >> On 4/23/23 10:52 PM, Kai-Heng Feng wrote: >>> PCIe service that shares IRQ with PME may cause spurious wakeup on >>> sys

Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

2023-04-24 Thread Sathyanarayanan Kuppuswamy
= PCIE_PORT_SERVICE_AER, > > .probe = aer_probe, > + .suspend= aer_suspend, > + .resume = aer_resume, > .remove = aer_remove, > }; > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3 3/4] PCI/AER: Disable AER interrupt on suspend

2023-04-20 Thread Sathyanarayanan Kuppuswamy
t, RCEC, or RCiEP > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = { > .service= PCIE_PORT_SERVICE_AER, > > .probe = aer_probe, > + .suspend= aer_suspend, > + .resume = aer_resume, > .remove = aer_remove, > }; > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3 2/4] PCI/AER: Factor out interrupt toggling into helpers

2023-04-20 Thread Sathyanarayanan Kuppuswamy
t aer_root_reset(struct pci_dev > *dev) > pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, &reg32); > pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32); > > - /* Enable Root Port's interrupt in response to error messages */ >

Re: [PATCHv2 pci-next 1/2] PCI/AER: correctable error message as KERN_INFO

2023-03-17 Thread Sathyanarayanan Kuppuswamy
", > - aer_error_layer[layer], aer_agent_string[agent]); > > - if (aer_severity != AER_CORRECTABLE) > + if (aer_severity == AER_CORRECTABLE) { > + pci_info(dev, "aer_layer=%s, aer_agent=%s\n", > + aer_error_layer[layer], aer_agent_string[agent]); > + } else { > + pci_err(dev, "aer_layer=%s, aer_agent=%s\n", > + aer_error_layer[layer], aer_agent_string[agent]); > pci_err(dev, "aer_uncor_severity: 0x%08x\n", > aer->uncor_severity); > + } > > if (tlp_header_valid) > __print_tlp_header(dev, &aer->header_log); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH V1] PCI/AER: Configure ECRC only AER is native

2023-01-11 Thread Sathyanarayanan Kuppuswamy
On 1/11/23 8:59 PM, Vidya Sagar wrote: > > > On 1/12/2023 9:18 AM, Sathyanarayanan Kuppuswamy wrote: >> External email: Use caution opening links or attachments >> >> >> On 1/11/23 7:33 PM, Vidya Sagar wrote: >>> I think we still need bios op

Re: [PATCH V1] PCI/AER: Configure ECRC only AER is native

2023-01-11 Thread Sathyanarayanan Kuppuswamy
agree that "on" and "off" option makes sense. Since the kernel defaults ecrc setting to "bios", why again allow it as a command line option? -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH V1] PCI/AER: Configure ECRC only AER is native

2023-01-11 Thread Sathyanarayanan Kuppuswamy
Hi, On 1/11/23 3:10 PM, Bjorn Helgaas wrote: > On Wed, Jan 11, 2023 at 01:42:21PM -0800, Sathyanarayanan Kuppuswamy wrote: >> On 1/11/23 12:31 PM, Vidya Sagar wrote: >>> As the ECRC configuration bits are part of AER registers, configure >>> ECRC only if AER is n

Re: [PATCH V1] PCI/AER: Configure ECRC only AER is native

2023-01-11 Thread Sathyanarayanan Kuppuswamy
ing(struct pci_dev *dev) > { > + if (!pcie_aer_is_native(dev)) > + return; > + > switch (ecrc_policy) { > case ECRC_POLICY_DEFAULT: > return; -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [External] Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()

2022-09-27 Thread Sathyanarayanan Kuppuswamy
On 9/27/22 9:20 PM, Zhuo Chen wrote: > > > On 9/28/22 3:39 AM, Sathyanarayanan Kuppuswamy wrote: >> >> >> On 9/27/22 8:35 AM, Zhuo Chen wrote: >>> Status bits for ERR_NONFATAL errors only are cleared in >>> pci_aer_clear_nonfatal_status(), but we

Re: [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status()

2022-09-27 Thread Sathyanarayanan Kuppuswamy
int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev); > void pci_save_aer_state(struct pci_dev *dev); > void pci_restore_aer_state(struct pci_dev *dev); > @@ -57,10 +56,6 @@ static inline int pci_disable_pcie_error_reporting(struct > pci_dev *dev) > { > return -EINV

Re: [PATCH v2 4/9] scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()

2022-09-27 Thread Sathyanarayanan Kuppuswamy
@@ -4715,7 +4715,7 @@ lpfc_aer_cleanup_state(struct device *dev, struct > device_attribute *attr, > return -EINVAL; > > if (phba->hba_flag & HBA_AER_ENABLED) > - rc = pci_aer_clear_nonfatal_status(phba->pcidev); > + rc = pci_aer_clear_uncor

Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()

2022-09-27 Thread Sathyanarayanan Kuppuswamy
fatal_status(pdev); > + else /* Cleanup uncorrectable error status before getting to init */ > + pci_aer_clear_uncorrect_error_status(pdev); > > /* First enable the PCI device */ > ret = pcim_enable_device(pdev); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core

2022-09-27 Thread Sathyanarayanan Kuppuswamy
; +static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev) > +{ > + return -EINVAL; > +} > static inline void pci_save_aer_state(struct pci_dev *dev) {} > static inline void pci_restore_aer_state(struct pci_dev *dev) {} > #endif -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status

2022-09-27 Thread Sathyanarayanan Kuppuswamy
pdev, &info); > - pci_aer_clear_nonfatal_status(pdev); > - pci_aer_clear_fatal_status(pdev); > + pci_aer_clear_uncorrect_error_status(pdev); > } > } > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3] PCI/ERR: Use pcie_aer_is_native() to judge whether OS owns AER

2022-08-02 Thread Sathyanarayanan Kuppuswamy
tive_aer || pcie_ports_native)". > > Or we can change "if ((host->native_aer || pcie_ports_native) && aer)" into > "if (pcie_aer_is_native(root))". But in this way, argument NULL pointer check > should be added in pcie_aer_is_native(). Looking into it again, I think it is better to leave it as it is. Please ignore my comment. -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3] PCI/ERR: Use pcie_aer_is_native() to judge whether OS owns AER

2022-07-26 Thread Sathyanarayanan Kuppuswamy
@ -221,8 +221,7 @@ static int get_port_device_capability(struct pci_dev *dev) > } > > #ifdef CONFIG_PCIEAER > - if (dev->aer_cap && pci_aer_available() && > - (pcie_ports_native || host->native_aer)) { > + if (pcie_aer_is_native(dev) && pci_aer_available()) { > services |= PCIE_PORT_SERVICE_AER; > > /* -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2] PCI/ERR: Use pcie_aer_is_native() to judge whether OS owns AER

2022-07-26 Thread Sathyanarayanan Kuppuswamy
rs/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c > index 604feeb84ee4..98c18f4a01b2 100644 > --- a/drivers/pci/pcie/portdrv_core.c > +++ b/drivers/pci/pcie/portdrv_core.c > @@ -221,8 +221,7 @@ static int get_port_device_capability(struct pci_dev *dev) > } > > #

Re: [PATCH] PCI/ERR: Use pcie_aer_is_native() to judge whether OS owns AER

2022-07-25 Thread Sathyanarayanan Kuppuswamy
int get_port_device_capability(struct pci_dev *dev) > } > > #ifdef CONFIG_PCIEAER > - if (dev->aer_cap && pci_aer_available() && > - (pcie_ports_native || host->native_aer)) { > + if (pcie_aer_is_native(dev) && pci_aer_available()) { > services |= PCIE_PORT_SERVICE_AER; > > /* -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH] PCI/ERR: handle disconnected devices in report_error_detected

2022-06-07 Thread Sathyanarayanan Kuppuswamy
PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-05-11 Thread Sathyanarayanan Kuppuswamy
return IRQ_NONE; + mdelay(5000); -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v3] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-05-11 Thread Sathyanarayanan Kuppuswamy
elated. I think the actual change of interest is e167bfcaa4cd ("PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS") [1]. It looks like we did exactly what you propose before that commit. I can update this unless you disagree. [1] https://git.kernel.org/linus/e167bfcaa4cd Agree. Please

Re: [PATCH v3] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-05-11 Thread Sathyanarayanan Kuppuswamy
_err_source e_src = {}; pci_read_config_dword(rp, aer + PCI_ERR_ROOT_STATUS, &e_src.status); - if (!(e_src.status & (PCI_ERR_ROOT_UNCOR_RCV|PCI_ERR_ROOT_COR_RCV))) + if (!(e_src.status & AER_ERR_STATUS_MASK)) return IRQ_NONE; pci_read_config_d

Re: [PATCH v4 2/2] PCI/DPC: Disable DPC service when link is in L2/L3 ready, L2 and L3 state

2022-04-17 Thread Sathyanarayanan Kuppuswamy
nce AER is disabled in previous patch for a Link in L2/L3 Ready, L2 and L3, also disable DPC here as DPC depends on AER to work. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453 Reviewed-by: Mika Westerberg Signed-off-by: Kai-Heng Feng Reviewed-by: Kuppuswamy Sathyanarayanan -- Sathy

Re: [PATCH v3 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state

2022-03-30 Thread Sathyanarayanan Kuppuswamy
= aer_suspend, + .resume = aer_resume, + .runtime_suspend= aer_suspend, + .runtime_resume = aer_resume, + .remove = aer_remove, }; /** -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state

2022-03-20 Thread Sathyanarayanan Kuppuswamy
On 3/20/22 7:38 PM, Kai-Heng Feng wrote: On Sun, Mar 20, 2022 at 4:38 AM Sathyanarayanan Kuppuswamy wrote: On 1/26/22 6:54 PM, Kai-Heng Feng wrote: Commit 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint") enables ACS, and some platforms lose its NVMe after r

Re: [PATCH v2 2/2] PCI/DPC: Disable DPC service when link is in L2/L3 ready, L2 and L3 state

2022-03-19 Thread Sathyanarayanan Kuppuswamy
ume, + .runtime_suspend= dpc_suspend, + .runtime_resume = dpc_resume, + .remove = dpc_remove, }; int __init pcie_dpc_init(void) -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v2 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state

2022-03-19 Thread Sathyanarayanan Kuppuswamy
= aer_suspend, + .resume = aer_resume, + .runtime_suspend= aer_suspend, + .runtime_resume = aer_resume, + .remove = aer_remove, }; /** -- Sathyanarayanan Kuppuswamy Linux Kernel Developer

Re: [PATCH v1] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-03-18 Thread Sathyanarayanan Kuppuswamy
& AER_ERR_STATUS_MASK)) return IRQ_NONE; - pci_read_config_dword(rp, aer + PCI_ERR_ROOT_ERR_SRC, &e_src.id); pci_write_config_dword(rp, aer + PCI_ERR_ROOT_STATUS, e_src.status); + pci_read_config_dword(rp, aer + PCI_ERR_ROOT_ERR_SRC, &e_src.id) -- Sat

Re: [PATCH v2] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-03-15 Thread Sathyanarayanan Kuppuswamy
On 3/15/22 12:52 PM, Eric Badger wrote: On Tue, Mar 15, 2022 at 10:26:46AM -0700, Sathyanarayanan Kuppuswamy wrote: On 3/15/22 10:14 AM, Eric Badger wrote: # Prep injection data for a correctable error. $ cd /sys/kernel/debug/apei/einj $ echo 0x0040 > error_type $ echo

Re: [PATCH v2] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-03-15 Thread Sathyanarayanan Kuppuswamy
he last 3 steps with following? # Inject another error (within 5 seconds) $ echo 1 > error_inject # You will get a new IRQ with only multiple ERR_COR bit set pcieport : AER: Root Error Status 0002 -- Sathyanarayanan Kuppuswamy Linux Kernel Developer
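
The case described above, a Root Error Status of 0002 with only the multiple ERR_COR bit set, shows why the mask in the interrupt check matters. The sketch below is illustrative only: the macro names are hypothetical, the bit positions are assumed to follow the usual Root Error Status layout (0x1 ERR_COR received, 0x2 multiple ERR_COR, 0x4 ERR_FATAL/NONFATAL received, 0x8 multiple ERR_FATAL/NONFATAL), and the wider mask is an assumption about what `AER_ERR_STATUS_MASK` covers.

```c
#include <assert.h>	/* used by checks that exercise this sketch */
#include <stdint.h>

/* Assumed Root Error Status bits; names are hypothetical. */
#define ROOT_COR_RCV_SKETCH		0x1u
#define ROOT_MULTI_COR_RCV_SKETCH	0x2u
#define ROOT_UNCOR_RCV_SKETCH		0x4u
#define ROOT_MULTI_UNCOR_RCV_SKETCH	0x8u

/* Narrow check, as in the removed line of the quoted patch. */
#define NARROW_MASK_SKETCH \
	(ROOT_UNCOR_RCV_SKETCH | ROOT_COR_RCV_SKETCH)

/* Wider check that also covers the "multiple" bits (assumed). */
#define WIDE_MASK_SKETCH \
	(NARROW_MASK_SKETCH | ROOT_MULTI_COR_RCV_SKETCH | \
	 ROOT_MULTI_UNCOR_RCV_SKETCH)

/* Returns 1 if the handler should process this status, 0 for IRQ_NONE. */
static int irq_has_work_sketch(uint32_t root_status, uint32_t mask)
{
	return (root_status & mask) != 0;
}
```

Under these assumptions a status of 0x2 is ignored by the narrow mask and handled by the wide one, which is the behaviour difference the reproduction steps aim to expose.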