** Changed in: linux (Ubuntu Disco)
Status: In Progress => Fix Committed
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1857413
Title:
mce: ras: When inject 1bit ecc error, there is no mce log recorded
in the dmesg
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Disco:
Fix Committed
Bug description:
== SRU Justification ==
With the 5.0 Disco kernel, no MCE log is recorded in dmesg when a 1-bit
ECC error is injected.
== Fix ==
* 09cbd219 (RAS/CEC: Increment cec_entered under the mutex lock)
* de0e0624 (RAS/CEC: Check count_threshold unconditionally)
Commit de0e0624 is the actual fix for this issue. 09cbd219 fixes a race
condition, and applying it first lets de0e0624 cherry-pick cleanly.
Both commits have already landed in newer kernels.
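The title of de0e0624 says the threshold is now checked unconditionally.
The toy model below is one reading of that. It assumes (this report does
not confirm it) that the earlier code skipped the check when a page was
inserted for the first time. ToyCollector, add_elem and check_after_insert
are hypothetical names used for illustration; this is not the kernel code.

```python
# Toy model only: names and structure are hypothetical, not the kernel's code.
class ToyCollector:
    def __init__(self, count_threshold, check_after_insert):
        self.count_threshold = count_threshold
        self.check_after_insert = check_after_insert
        self.counts = {}  # page frame number -> corrected-error count

    def add_elem(self, pfn):
        """Record one corrected error; return True once the threshold is hit."""
        if pfn not in self.counts:
            self.counts[pfn] = 1
            if not self.check_after_insert:
                # Assumed pre-fix behaviour: a fresh entry skips the check.
                return False
        else:
            self.counts[pfn] += 1
        if self.counts[pfn] >= self.count_threshold:
            # Threshold reached: drop the entry and report it to the caller.
            del self.counts[pfn]
            return True
        return False


before = ToyCollector(count_threshold=1, check_after_insert=False)
after = ToyCollector(count_threshold=1, check_after_insert=True)
print(before.add_elem(0x7fcd66))  # False: first error is absorbed silently
print(after.add_elem(0x7fcd66))   # True: first error already hits the threshold
```

With a threshold of 1, only the variant that checks after insertion
reports on the first error.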
== Test ==
A test kernel can be found here:
https://people.canonical.com/~phlin/kernel/lp-1857413-ras-err-msg/
The bug reporter, fan jinke, verified that the patched kernel logs
the error correctly.
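One way to check for the expected entries is to scan saved dmesg output
for the "[Hardware Error]" marker that appears in the original report.
A minimal sketch follows; the function name hardware_error_lines is
hypothetical, and the sample lines are copied from the report.

```python
def hardware_error_lines(dmesg_text):
    """Return the dmesg lines that carry the '[Hardware Error]' marker."""
    return [line for line in dmesg_text.splitlines()
            if "[Hardware Error]" in line]


sample = """\
[ 1561.511210] mce: [Hardware Error]: Machine check events logged
[ 1561.511221] [Hardware Error]: Corrected error, no action required.
[ 1561.511388] [Hardware Error]: Error Addr: 0x000000077cd66940
"""

matches = hardware_error_lines(sample)
print(len(matches))  # 3
```

An empty result after injecting an error would match the symptom
described in this bug.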
== Regression Potential ==
Low. The changes are limited to the RAS Correctable Errors Collector, and
the fix has been verified to work as expected.
== Original Bug Report ==
Using the Linux kernel, when a 1-bit ECC error is injected, MCE log
entries are recorded in dmesg, like:
[ 1561.511210] mce: [Hardware Error]: Machine check events logged
[ 1561.511221] [Hardware Error]: Corrected error, no action required.
[ 1561.511311] [Hardware Error]: CPU:0 (18:0:2) MC16_STATUS[Over|CE|MiscV|-|AddrV|-|-|SyndV|-|CECC]: 0xdc2040000000011b
[ 1561.511388] [Hardware Error]: Error Addr: 0x000000077cd66940
[ 1561.511439] [Hardware Error]: IPID: 0x0000009600150f00, Syndrome: 0x000010ce0a400d01
[ 1561.511499] [Hardware Error]: Unified Memory Controller Extended Error Code: 0
[ 1561.511556] [Hardware Error]: Unified Memory Controller Error: DRAM ECC error.
[ 1561.511646] EDAC MC0: 1 CE on mc#0csrow#1channel#1 (csrow:1 channel:1 page:0x7fcd66 offset:0x940 grain:0 syndrome:0x10ce)
[ 1561.511648] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
*But there is no such log when using "Ubuntu 18.04.3 LTS"*
The related upstream commit is
de0e0624d86ff9fc512dedb297f8978698abf21a.
After merging this commit, the Ubuntu kernel records the MCE log in dmesg
as well.
---
ProblemType: Bug
AlsaDevices:
total 0
crw-rw----+ 1 root audio 116, 1 Dec 24 17:20 seq
crw-rw----+ 1 root audio 116, 33 Dec 24 17:20 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.10-0ubuntu27
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord':
'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 19.04
InstallationDate: Installed on 2019-12-24 (0 days ago)
InstallationMedia: Ubuntu-Server 19.04 "Disco Dingo" - Release amd64
(20190416.1)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
MachineType: Sugon HygonH210
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=linux
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.0.0-13-generic
root=UUID=43f8bc11-d850-4e79-9d14-1232ef50040f ro
ProcVersionSignature: Ubuntu 5.0.0-13.14-generic 5.0.6
RelatedPackageVersions:
linux-restricted-modules-5.0.0-13-generic N/A
linux-backports-modules-5.0.0-13-generic N/A
linux-firmware 1.178
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: disco
Uname: Linux 5.0.0-13-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 03/15/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 210ER119
dmi.board.asset.tag: Default string
dmi.board.name: HygonH210
dmi.board.vendor: Sugon
dmi.board.version: Default string
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 17
dmi.chassis.vendor: Sugon
dmi.chassis.version: Default string
dmi.modalias:
dmi:bvnAmericanMegatrendsInc.:bvr210ER119:bd03/15/2019:svnSugon:pnHygonH210:pvrDefaultstring:rvnSugon:rnHygonH210:rvrDefaultstring:cvnSugon:ct17:cvrDefaultstring:
dmi.product.family: Rack
dmi.product.name: HygonH210
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Sugon
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857413/+subscriptions