Test results on QDF2400, showing a recoverable DDR error, a correctable
vendor-specific error, a correctable ARM cache error, and a fatal
vendor-specific error. All functionality appears to be working properly.
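
The four records summarized above can be picked out of the dmesg output by
their {N} sequence tag and "event severity" line. A minimal sketch, assuming
the log has been saved as text with wrapped lines rejoined; the function name
and the sample string are illustrative only, not part of the test setup:

```python
import re

# Matches e.g. "{1}[Hardware Error]: event severity: recoverable"
SEVERITY = re.compile(r"\{(\d+)\}\[Hardware Error\]: event severity: (\w+)")

def severities(log_text):
    """Map each GHES record sequence number to its reported severity."""
    return {int(seq): sev for seq, sev in SEVERITY.findall(log_text)}

# Severity lines copied from the log in this message.
sample = """\
[  224.677330] {1}[Hardware Error]: event severity: recoverable
[  251.685336] {2}[Hardware Error]: event severity: corrected
[  357.701496] {3}[Hardware Error]: event severity: info
[  403.866103] {4}[Hardware Error]: event severity: fatal
"""

print(severities(sample))
```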

ubuntu@null-8cfdf006a3ef:~$ uname -a
Linux null-8cfdf006a3ef 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu SMP Tue Jul 25 19:12:22 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux


ubuntu@null-8cfdf006a3ef:~$ dmesg | grep -i -E 'hest|ghes|edac|hardware'
[    0.000000] ACPI: HEST 0x0000000008A60000 000288 (v01 QCOM   QDF2400  00000001 INTL 20150515)
[    0.538984] HEST: Table parsing has been initialized.
[    3.854385] EDAC MC: Ver: 3.0.0
[    5.537078] ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.
[    5.545952] ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.
[    5.554123] ghes_edac: So, the end result of using this driver varies from vendor to vendor.
[    5.562555] ghes_edac: If you find incorrect reports, please contact your hardware vendor
[    5.570727] ghes_edac: to correct its BIOS.
[    5.574905] ghes_edac: This system has 6 DIMM sockets.
[    5.580205] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    5.589763] EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    5.599319] EDAC MC2: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    5.608867] EDAC MC3: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    5.618416] EDAC MC4: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    5.628018] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[    6.573372] qcom-emac QCOM8070:00 eth0: hardware id 64.1, hardware version 1.3.0
[  224.669058] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[  224.677330] {1}[Hardware Error]: event severity: recoverable
[  224.682992] {1}[Hardware Error]:  precise tstamp: 2017-07-26 15:58:19
[  224.689437] {1}[Hardware Error]:  Error 0, type: recoverable
[  224.695097] {1}[Hardware Error]:   section_type: memory error
[  224.700846] {1}[Hardware Error]:   error_status: 0x00000000000c0400
[  224.707113] {1}[Hardware Error]:   physical_address: 0x0000000000204e10
[  224.713726] {1}[Hardware Error]:   physical_address_mask: 0x00000fffffffffff
[  224.720776] {1}[Hardware Error]:   node: 0 card: 1 module: 0 rank: 0 bank: 0 device: 0 row: 4 column: 306
[  224.730427] {1}[Hardware Error]:   error_type: 3, multi-bit ECC
[  224.736356] EDAC MC0: 1 UE Multi-bit ECC on unknown label (node:0 card:1 module:0 rank:0 bank:0 row:4 col:306 page:0x204 offset:0xe10 grain:-4096 - status(0x00000000000c0400): Storage error in DRAM memory)
[  224.736358] [Firmware Warn]: GHES: Invalid address in generic error data: 0x204e10
[  251.685322] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[  251.685324] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[  251.685336] {2}[Hardware Error]: event severity: corrected
[  251.685341] {2}[Hardware Error]:  precise tstamp: 2017-07-26 15:58:30
[  251.685342] {2}[Hardware Error]:  Error 0, type: corrected
[  251.685348] {2}[Hardware Error]:   section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[  251.685349] {2}[Hardware Error]:   section length: 0x238
[  251.685355] {2}[Hardware Error]:   00000000: 4d415201 4d492031 453a4d45 435f4343  .RAM1 IMEM:ECC_C
[  251.685358] {2}[Hardware Error]:   00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
[  251.685361] {2}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000  ................
[  251.685364] {2}[Hardware Error]:   00000030: 00000000 00000000 01010000 01010000  ................
[  251.685367] {2}[Hardware Error]:   00000040: 00000000 00000000 00000005 00000000  ................
[  251.685369] {2}[Hardware Error]:   00000050: 01010000 00000000 00000001 00010100  ................
[  251.685372] {2}[Hardware Error]:   00000060: 00000000 00000000 00000000 00000000  ................
[  251.685375] {2}[Hardware Error]:   00000070: 00000000 00000000 00000000 00000000  ................
[  251.685378] {2}[Hardware Error]:   00000080: 00000000 00000000 00000000 00000000  ................
[  251.685381] {2}[Hardware Error]:   00000090: 00000000 00000000 00000000 00000000  ................
[  251.685384] {2}[Hardware Error]:   000000a0: 00000000 00000000 00000000 00000000  ................
[  251.685387] {2}[Hardware Error]:   000000b0: 00000000 00000000 00000000 00000000  ................
[  251.685389] {2}[Hardware Error]:   000000c0: 00000000 00000000 00000000 00000000  ................
[  251.685392] {2}[Hardware Error]:   000000d0: 00000000 00000000 00000000 00000000  ................
[  251.685395] {2}[Hardware Error]:   000000e0: 00000000 00000000 00000000 00000000  ................
[  251.685398] {2}[Hardware Error]:   000000f0: 00000000 00000000 00000000 00000000  ................
[  251.685402] {2}[Hardware Error]:   00000100: 00000000 00000000 00000000 00000000  ................
[  251.685405] {2}[Hardware Error]:   00000110: 00000000 00000000 00000000 00000000  ................
[  251.685408] {2}[Hardware Error]:   00000120: 00000000 00000000 00000000 00000000  ................
[  251.685410] {2}[Hardware Error]:   00000130: 00000000 00000000 00000000 00000000  ................
[  251.685413] {2}[Hardware Error]:   00000140: 00000000 00000000 00000000 00000000  ................
[  251.685416] {2}[Hardware Error]:   00000150: 00000000 00000000 00000000 00000000  ................
[  251.685419] {2}[Hardware Error]:   00000160: 00000000 00000000 00000000 00000000  ................
[  251.685423] {2}[Hardware Error]:   00000170: 00000000 00000000 00000000 00000000  ................
[  251.685426] {2}[Hardware Error]:   00000180: 00000000 00000000 00000000 00000000  ................
[  251.685429] {2}[Hardware Error]:   00000190: 00000000 00000000 00000000 00000000  ................
[  251.685432] {2}[Hardware Error]:   000001a0: 00000000 00000000 00000000 00000000  ................
[  251.685434] {2}[Hardware Error]:   000001b0: 00000000 00000000 00000000 00000000  ................
[  251.685437] {2}[Hardware Error]:   000001c0: 00000000 00000000 00000000 00000000  ................
[  251.685440] {2}[Hardware Error]:   000001d0: 00000000 00000000 00000000 00000000  ................
[  251.685443] {2}[Hardware Error]:   000001e0: 00000000 00000000 00000000 00000000  ................
[  251.685446] {2}[Hardware Error]:   000001f0: 00000000 00000000 00000000 00000000  ................
[  251.685449] {2}[Hardware Error]:   00000200: 00000000 00000000 00000000 00000000  ................
[  251.685451] {2}[Hardware Error]:   00000210: 00000000 00000000 00000000 00000000  ................
[  251.685454] {2}[Hardware Error]:   00000220: 00000000 00000000 00000000 00000000  ................
[  251.685457] {2}[Hardware Error]:   00000230: 00000000 00000000                    ........
[  357.701494] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[  357.701496] {3}[Hardware Error]: event severity: info
[  357.701508] {3}[Hardware Error]:  precise tstamp: 2017-07-26 16:00:12
[  357.701510] {3}[Hardware Error]:  Error 0, type: info
[  357.701513] {3}[Hardware Error]:   section_type: ARM processor error
[  357.701515] {3}[Hardware Error]:   MIDR: 0x00000000510f8000
[  357.701518] {3}[Hardware Error]:   Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
[  357.701520] {3}[Hardware Error]:   error affinity level: 2
[  357.701522] {3}[Hardware Error]:   running state: 0x1
[  357.701524] {3}[Hardware Error]:   Power State Coordination Interface state: 0
[  357.701527] {3}[Hardware Error]:   Error info structure 0:
[  357.701529] {3}[Hardware Error]:   num errors: 1
[  357.701531] {3}[Hardware Error]:    first error captured
[  357.701533] {3}[Hardware Error]:    last error captured
[  357.701535] {3}[Hardware Error]:    error_type: 0, cache error
[  357.701538] {3}[Hardware Error]:    error_info: 0x0000000000c20058
ubuntu@null-8cfdf006a3ef:~$
ubuntu@null-8cfdf006a3ef:~$ [  403.857832] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[  403.866103] {4}[Hardware Error]: event severity: fatal
[  403.871244] {4}[Hardware Error]:  precise tstamp: 2017-07-26 16:01:18
[  403.877690] {4}[Hardware Error]:  Error 0, type: fatal
[  403.882831] {4}[Hardware Error]:   section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[  403.891445] {4}[Hardware Error]:   section length: 0x238
[  403.896762] {4}[Hardware Error]:   00000000: 4d415201 4d492031 453a4d45 555f4343  .RAM1 IMEM:ECC_U
[  403.905721] {4}[Hardware Error]:   00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
[  403.914682] {4}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000  ................
[  403.923644] {4}[Hardware Error]:   00000030: 00000000 00000000 01010000 01010000  ................
[  403.932605] {4}[Hardware Error]:   00000040: 00000000 00000000 00000005 00000000  ................
[  403.941566] {4}[Hardware Error]:   00000050: 02020000 00000000 00000001 00c6c600  ................
[  403.950531] {4}[Hardware Error]:   00000060: 00000000 00000000 00000000 00000000  ................
[  403.959489] {4}[Hardware Error]:   00000070: 00000000 00000000 00000000 00000000  ................
[  403.968450] {4}[Hardware Error]:   00000080: 00000000 00000000 00000000 00000000  ................
[  403.977413] {4}[Hardware Error]:   00000090: 00000000 00000000 00000000 00000000  ................
[  403.986374] {4}[Hardware Error]:   000000a0: 00000000 00000000 00000000 00000000  ................
[  403.995339] {4}[Hardware Error]:   000000b0: 00000000 00000000 00000000 00000000  ................
[  404.004302] {4}[Hardware Error]:   000000c0: 00000000 00000000 00000000 00000000  ................
[  404.013263] {4}[Hardware Error]:   000000d0: 00000000 00000000 00000000 00000000  ................
[  404.022223] {4}[Hardware Error]:   000000e0: 00000000 00000000 00000000 00000000  ................
[  404.031183] {4}[Hardware Error]:   000000f0: 00000000 00000000 00000000 00000000  ................
[  404.040143] {4}[Hardware Error]:   00000100: 00000000 00000000 00000000 00000000  ................
[  404.049104] {4}[Hardware Error]:   00000110: 00000000 00000000 00000000 00000000  ................
[  404.058064] {4}[Hardware Error]:   00000120: 00000000 00000000 00000000 00000000  ................
[  404.067025] {4}[Hardware Error]:   00000130: 00000000 00000000 00000000 00000000  ................
[  404.075986] {4}[Hardware Error]:   00000140: 00000000 00000000 00000000 00000000  ................
[  404.084946] {4}[Hardware Error]:   00000150: 00000000 00000000 00000000 00000000  ................
[  404.093907] {4}[Hardware Error]:   00000160: 00000000 00000000 00000000 00000000  ................
[  404.102867] {4}[Hardware Error]:   00000170: 00000000 00000000 00000000 00000000  ................
[  404.111828] {4}[Hardware Error]:   00000180: 00000000 00000000 00000000 00000000  ................
[  404.120788] {4}[Hardware Error]:   00000190: 00000000 00000000 00000000 00000000  ................
[  404.129752] {4}[Hardware Error]:   000001a0: 00000000 00000000 00000000 00000000  ................
[  404.138710] {4}[Hardware Error]:   000001b0: 00000000 00000000 00000000 00000000  ................
[  404.147673] {4}[Hardware Error]:   000001c0: 00000000 00000000 00000000 00000000  ................
[  404.156632] {4}[Hardware Error]:   000001d0: 00000000 00000000 00000000 00000000  ................
[  404.165593] {4}[Hardware Error]:   000001e0: 00000000 00000000 00000000 00000000  ................
[  404.174555] {4}[Hardware Error]:   000001f0: 00000000 00000000 00000000 00000000  ................
[  404.183516] {4}[Hardware Error]:   00000200: 00000000 00000000 00000000 00000000  ................
[  404.192476] {4}[Hardware Error]:   00000210: 00000000 00000000 00000000 00000000  ................
[  404.201438] {4}[Hardware Error]:   00000220: 00000000 00000000 00000000 00000000  ................
[  404.210398] {4}[Hardware Error]:   00000230: 00000000 00000000                    ........
[  404.218665] Kernel panic - not syncing: Fatal hardware error!
[  404.224406] CPU: 0 PID: 217 Comm: kworker/0:1 Not tainted 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu
[  404.233876] Hardware name: Qualcomm Qualcomm Centriq(TM) 2400 Development Platform/ABW|SYS|CVR,1DPC|V3           , BIOS XBL.DF.2.0.R1-00512 QDF2400_REL CR
[  404.247695] Workqueue: kacpi_notify acpi_os_execute_deferred
[  404.253347] Call trace:
[  404.255790] [<ffff1e8f9e08b078>] dump_backtrace+0x0/0x2b0
[  404.261182] [<ffff1e8f9e08b34c>] show_stack+0x24/0x30
[  404.266230] [<ffff1e8f9e4da5e0>] dump_stack+0x9c/0xbc
[  404.271276] [<ffff1e8f9e208620>] panic+0x140/0x2b0
[  404.276061] [<ffff1e8f9e5ef8e0>] ghes_proc+0x1d8/0x568
[  404.281191] [<ffff1e8f9e5efcb4>] ghes_notify_sci+0x44/0x70
[  404.286670] [<ffff1e8f9e0f6424>] notifier_call_chain+0x5c/0xa0
[  404.292495] [<ffff1e8f9e0f6970>] __blocking_notifier_call_chain+0x58/0xa0
[  404.299274] [<ffff1e8f9e0f69f4>] blocking_notifier_call_chain+0x3c/0x50
[  404.305883] [<ffff1e8f9e5ea09c>] acpi_hed_notify+0x24/0x30
[  404.311361] [<ffff1e8f9e5b1710>] acpi_device_notify+0x30/0x40
[  404.317101] [<ffff1e8f9e5c8204>] acpi_ev_notify_dispatch+0x4c/0x70
[  404.323274] [<ffff1e8f9e5ac2e4>] acpi_os_execute_deferred+0x24/0x38
[  404.329535] [<ffff1e8f9e0ed330>] process_one_work+0x158/0x478
[  404.335273] [<ffff1e8f9e0ed6a0>] worker_thread+0x50/0x4a8
[  404.340665] [<ffff1e8f9e0f47a8>] kthread+0x108/0x138
[  404.345622] [<ffff1e8f9e0838a0>] ret_from_fork+0x10/0x30
[  404.350934] SMP: stopping secondary CPUs
[  404.356117] Starting crashdump kernel...
[  404.360034] Bye!
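
The vendor-specific sections (records {2} and {4}) are printed as raw hex
dumps because the kernel does not recognize the section type GUID. The ASCII
column is easier to relate to the hex column once the byte order is taken into
account. A minimal sketch, assuming each group is a 32-bit word printed in
little-endian order (which is what the ASCII column in this log implies); the
helper names are illustrative only:

```python
import struct

def dump_words_to_bytes(words):
    """Convert hex-dump words (e.g. '4d415201') back into memory-order bytes."""
    # Assumption: every word is a little-endian 32-bit value.
    return b"".join(struct.pack("<I", int(word, 16)) for word in words)

def printable(raw):
    """Render bytes like the dump's ASCII column: non-printables become '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# First two rows of the corrected record {2}, copied from the log above.
corrected = "4d415201 4d492031 453a4d45 435f4343 53515f45 44525f42 00000000 00000000"
# First row of the fatal record {4}; only the last word differs.
fatal_row0 = "4d415201 4d492031 453a4d45 555f4343"

print(printable(dump_words_to_bytes(corrected.split())))
print(printable(dump_words_to_bytes(fatal_row0.split())))
```

Read this way, the two records carry the strings "RAM1 IMEM:ECC_CE_QSB_RD" and
"RAM1 IMEM:ECC_UE_QSB_RD" respectively. The meaning of those strings is
defined by the vendor and is not interpreted here.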

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1696570

Title:
  [SRU][Zesty] Add UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Zesty:
  New

Bug description:
  [Impact]
  Adds UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64.

  [Test]
  Run mce-test for testing RAS features.

  [Fix]
  In the maintainer's (Will Deacon's) tree:
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/ras-apei

  [V17,01/11] acpi: apei: read ack upon ghes record consumption
  [V17,02/11] ras: acpi/apei: cper: add support for generic data v3 structure
  [V17,03/11] cper: add timestamp print to CPER status printing
  [V17,04/11] efi: parse ARM processor error
  [V17,05/11] arm64: exception: handle Synchronous External Abort
  [V17,06/11] acpi: apei: handle SEA notification type for ARMv8
  [V17,07/11] acpi: apei: panic OS with fatal error status block
  [V17,08/11] efi: print unrecognized CPER section
  [V17,09/11] ras: acpi / apei: generate trace event for unrecognized CPER section
  [V17,10/11] trace, ras: add ARM processor error trace event
  [V17,11/11] arm/arm64: KVM: add guest SEA support

  [Regression Potential]
  Patches deal with updates for RAS features on ARM64, with minor impact on
  generic code. The kernel was boot tested on ARM64, AMD64 and Power8, and no
  regressions were found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1696570/+subscriptions
