** Description changed:

  [Impact]
  In error records that contain multiple errors, every error after the
  first one is reported incorrectly. This causes garbage non-standard
  error trace events to be generated, and for AER and MC errors the AER
  and EDAC drivers take no kernel action to help recover from them.
  
  [Fix]
  The following patches in Linus's tree fix this issue:
  
  aaf2c2fb0f51 ACPI / APEI: clear error status before acknowledging the error
  c4335fdd3822 ACPI: APEI: fix the wrong iteration of generic error status block
  
  [Testing]
  Insert an e1000 PCIe card into the system and run the following
  command, which should generate PCIe correctable errors. Without the
  fix, you will see only the first error in each GHES report go to the
  AER driver, rather than all errors from the GHES reports.
  
  $ sudo setpci -s 0002:00:00.0 0x70c.l=0x00808000;sudo setpci -s 0002:00:00.0 CAP_EXP+0x10.B=0x4b;sleep 1;sudo setpci -s 0002:00:00.0 CAP_EXP+0x10.B=0x48
  
  Here "0002:00:00.0" is the root hub for the card.
  
+ JTAG was used to trigger multiple concurrent errors, and all errors
+ were observed to be parsed, instead of just the first one, as
+ mentioned in comment #3. The poster of comment #3 will do the
+ verification once the patch lands in -proposed.
+ 
  [Regression Potential]
  The two patches to the ACPI APEI driver were cleanly cherry-picked
  from Linus's tree and applied to Artful and Zesty. The patches were
  tested on the QDF2400 platform, where they were found to fix the issue
  without introducing any regressions.
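
For anyone repeating the [Testing] step on a system where the root hub
sits at a different PCI address, here is a minimal sketch that only
prints the setpci sequence for a given address, so it can be reviewed
before anything is written to hardware. The function name
print_aer_test_cmds and its single argument are hypothetical and were
introduced for illustration; the register offsets and values are copied
verbatim from the command in the description.

```shell
#!/bin/sh
# Dry-run helper: prints the setpci sequence from the [Testing] step
# for a given root hub address instead of executing it.
# print_aer_test_cmds is a hypothetical name, not part of pciutils.
print_aer_test_cmds() {
    root_hub="$1"   # PCI address of the root hub above the card

    # Offsets and values below are taken unchanged from the bug description.
    printf 'sudo setpci -s %s 0x70c.l=0x00808000\n' "$root_hub"
    printf 'sudo setpci -s %s CAP_EXP+0x10.B=0x4b\n' "$root_hub"
    printf 'sleep 1\n'
    printf 'sudo setpci -s %s CAP_EXP+0x10.B=0x48\n' "$root_hub"
}

# Print the sequence for the address used in the description.
print_aer_test_cmds "0002:00:00.0"
```

Once the printed commands look right for the target system, they can
be run by hand as root. Running them requires pciutils and the actual
hardware, so this sketch deliberately stops at printing.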
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1732990

Title:
  [Artful/Zesty] ACPI APEI error handling bug fixes

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress
Status in linux source package in Artful:
  In Progress