** Changed in: linux (Ubuntu)
       Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684054

Title:
  [LTCTest][Opal][FW860.20] HMI recoverable errors failed to recover and
  system goes to dump state.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Zesty:
  New

Bug description:
  == Comment: #0 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-17 06:08:41 ==
  ---Problem Description---
  HMI Recoverable error injection tests leads to system checkstop followed by
  system dump with ubuntu 17.04 os and kernel 4.10.0-19-generic ppc64le

  Contact Information = ppaid...@in.ibm.com

  ---uname output---
  #21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = PowerNV 8284-22A

  ---System Hang---
  System is in dumping state. after dump finishes system will IPL to OS again.

  ---Debugger---
  A debugger is not configured

  == Comment: #3 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-17 06:12:51 ==
  # uname -a
  #21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
  # cat /etc/os-release
  NAME="Ubuntu"
  VERSION="17.04 (Zesty Zapus)"
  ID=ubuntu
  ID_LIKE=debian
  PRETTY_NAME="Ubuntu 17.04"
  VERSION_ID="17.04"
  HOME_URL="https://www.ubuntu.com/"
  SUPPORT_URL="https://help.ubuntu.com/"
  BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
  PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
  VERSION_CODENAME=zesty
  UBUNTU_CODENAME=zesty
  root@p8wookie:~#

  == Comment: #4 - Kevin W. Rudd <ru...@us.ibm.com> - 2017-04-17 11:10:22 ==

  == Comment: #5 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-17 13:34:03 ==
  it looks like below commit is a culprit:

  =======================================
  commit 2337d207288f163e10bd8d4d7eeb0c1c75046a0c
  Author: Nicholas Piggin <npig...@gmail.com>
  Date:   Fri Jan 27 14:24:33 2017 +1000

      powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts

      The branch from hmi_exception_early to hmi_exception_realmode must use
      a "relocatable-style" branch, because it is branching from unrelocated
      exception code to beyond __end_interrupts.

      Signed-off-by: Nicholas Piggin <npig...@gmail.com>
      Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
  =======================================

  With the above commit changes now hmi_exception_realmode() is called using
  bctrl which ends up messing up TOC (r2) value and further access using new
  r2 results into unpredictable behaviour.

  ----------------------------------------
  c000000000025f50 <hmi_exception_realmode>:
  c000000000025f50:  3a 01 4c 3c  addis  r2,r12,314
  c000000000025f54:  b0 01 42 38  addi   r2,r2,432
  c000000000025f58:  a6 02 08 7c  mflr   r0
  -----------------------------------------

  With above commit the hmi_exception_early() code jumps to c000000000025f50
  (hmi_exception_realmode+0x0) which then sets up new value for r2. If we
  revert above commit the code jumps to c000000000025f58
  (hmi_exception_realmode+0x8) and hmi handler works fine.

  After reverting above patch I don't see this issue anymore. I have rebuilt
  the ubuntu kernel after reverting above patch and you can find the kernel
  rpm at:

  Can you please retry your tests with above kernel and see if issue still
  persists.

  == Comment: #6 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-17 23:02:31 ==
  Spoke to Michael Ellerman this morning.
  He helped me to identify the root cause and a fix patch below:

  diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
  index 857bf7c5b946..7cfeb8768587 100644
  --- a/arch/powerpc/kernel/exceptions-64s.S
  +++ b/arch/powerpc/kernel/exceptions-64s.S
  @@ -982,7 +982,7 @@ TRAMP_REAL_BEGIN(hmi_exception_early)
          EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
          EXCEPTION_PROLOG_COMMON_3(0xe60)
          addi    r3,r1,STACK_FRAME_OVERHEAD
  -       BRANCH_LINK_TO_FAR(r4, hmi_exception_realmode)
  +       BRANCH_LINK_TO_FAR(r12, hmi_exception_realmode)
          /* Windup the stack. */
          /* Move original HSRR0 and HSRR1 into the respective regs */
          ld      r9,_MSR(r1)

  == Comment: #7 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 01:52:03 ==

  == Comment: #8 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 01:53:57 ==
  Hi Mahesh

  Tested all the HMI Recoverable errors on the below patched kernel, attached
  the corresponding execution logs. All tests are working fine.

  #21 SMP Mon Apr 17 12:58:30 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux

  Thanks

  == Comment: #9 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-18 06:07:56 ==
  (In reply to comment #8)
  > Hi Mahesh
  > Tested all the HMI Recoverable errors on the below patched kernel, attached
  > the corresponding executing logs. All tests are working fine.
  >
  > Linux p8wookie 4.10.0-19.bz153487-generic #21 SMP Mon Apr 17 12:58:30 EDT
  > 2017 ppc64le ppc64le ppc64le GNU/Linux
  >
  >
  > Thanks

  Thanks. Michael has posted a fix for this upstream.
  http://patchwork.ozlabs.org/patch/751647/

  I will rebuild the new ubuntu kernel with above patch.
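
  Note on the r4 => r12 change discussed in comments #5 and #6: the
  following is a minimal Python sketch, not kernel code, of why the register
  passed to BRANCH_LINK_TO_FAR matters. Under the ppc64le ELFv2 ABI, a
  function entered at its global entry point derives the TOC pointer (r2)
  from r12, so r12 is expected to hold the function's own address. The
  sketch models only the addis/addi pair from the disassembly in comment #5.
  The helper names and the stale r12 value are hypothetical and chosen for
  illustration only.

```python
# Illustrative model only: mimics the two-instruction global entry prologue
# seen in the disassembly of hmi_exception_realmode. Not kernel code.
MASK64 = (1 << 64) - 1


def sign_extend_16(value):
    """Interpret a 16-bit immediate as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value


def toc_from_global_entry(r12, addis_imm, addi_imm):
    """Return the r2 produced by 'addis r2,r12,IMM' then 'addi r2,r2,IMM'."""
    r2 = (r12 + (sign_extend_16(addis_imm) << 16)) & MASK64
    return (r2 + sign_extend_16(addi_imm)) & MASK64


# Address of hmi_exception_realmode, taken from the disassembly.
FUNC_ADDR = 0xC000000000025F50

# Case 1: r12 holds the function address (what loading it via r12 provides).
good_toc = toc_from_global_entry(FUNC_ADDR, 314, 432)

# Case 2: r12 holds some unrelated leftover value (hypothetical number),
# as can happen when the target address was loaded into r4 instead.
STALE_R12 = 0x0000000000001234
bad_toc = toc_from_global_entry(STALE_R12, 314, 432)

print(f"r2 with r12 = function address: {good_toc:#018x}")
print(f"r2 with stale r12:              {bad_toc:#018x}")
```

  With a stale r12 the computed r2 no longer points at the kernel TOC, which
  is consistent with the unpredictable behaviour described in comment #5.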

  == Comment: #12 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 09:27:59 ==
  (In reply to comment #11)
  > >
  > > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
  >
  > I have built new kernel with above patch and you can find it below path
  >
  >:/home2/mahesh/u2/bz153487v2/linux-image-4.10.0-19.bz153487v2-
  > generic_4.10.0-19.bz153487v2.21_ppc64el.deb

  Tested with this new patched kernel, all tests are working fine.

  Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18 07:43:13 EDT
  2017 ppc64le ppc64le ppc64le GNU/Linux

  Will attach the full execution logs here.

  == Comment: #13 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 09:29:43 ==

  == Comment: #14 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-19 03:52:18 ==
  (In reply to comment #12)
  > (In reply to comment #11)
  > > >
  > > > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
  >
  > Thanks for testing. We need to mirror this to ubuntu for fix patch inclusion
  >
  > Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18 07:43:13 EDT
  > 2017 ppc64le ppc64le ppc64le GNU/Linux
  >
  > Will attach the full execution logs here.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1684054/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp