** Changed in: linux (Ubuntu)
       Status: New => Triaged

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Bionic)
       Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
     Assignee: (unassigned) => Joseph Salisbury (jsalisbury)

** Changed in: linux (Ubuntu)
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Joseph Salisbury (jsalisbury)

** Changed in: linux (Ubuntu)
       Status: Triaged => In Progress

** Changed in: linux (Ubuntu Bionic)
       Status: Triaged => In Progress

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1777857

Title:
  [LTCTest][OPAL][OP920] INFO: rcu_sched self-detected stall on CPU

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress

Bug description:
  == Comment: #0 - PAVAMAN SUBRAMANIYAM <pavsu...@in.ibm.com> - 2018-05-23 01:15:30 ==
  Install a P9 Open Power Hardware with the latest OP920 Firmware images provided in the following link:

  http://pfd.austin.ibm.com/releasenotes/openpower9/OP920/OP920_1808A/OP920_1808N_RelNote_Main.html

  root@witherspoon:~# cat /etc/os-release
  ID="openbmc-phosphor"
  NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)"
  VERSION="ibm-v2.1"
  VERSION_ID="ibm-v2.1-438-g0030304-r12-0-g5ee4fb0"
  PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) ibm-v2.1"
  BUILD_ID="ibm-v2.1-438-g0030304-r12"

  root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION
  IBM-witherspoon-ibm-OP9-v2.0-2.14
      op-build-v2.0-11-gb248194-dirty
      buildroot-2018.02.1-6-ga8d1126
      skiboot-v6.0.1
      hostboot-8ab6717d-pfc036fa
      occ-77bb5e6
      linux-4.16.8-openpower2-pb532d68
      petitboot-v1.7.1-p1188545
      machine-xml-7cd20a6
      hostboot-binaries-276bb70
      capp-ucode-p9-dd2-v4
      sbe-a596975
      hcode-b8173e8

  Seeing the following messages in the dmesg logs.

  [ 16.377405] ipmi_si: Unable to find any System Interface(s)
  [ 17.384118] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
  [ 1372.711730] INFO: rcu_sched self-detected stall on CPU
  [ 1372.711787] 32-....: (5249 ticks this GP) idle=182/140000000000001/0 softirq=1093/1093 fqs=2623
  [ 1372.711863] (t=5250 jiffies g=22430 c=22429 q=953)
  [ 1372.711921] Task dump for CPU 32:
  [ 1372.711922] kworker/32:1 R running task 0 1123 2 0x00000804
  [ 1372.711930] Workqueue: events rtc_timer_do_work
  [ 1372.711931] Call Trace:
  [ 1372.711934] [c000003fd2b97350] [c00000000014a8f8] sched_show_task.part.16+0xd8/0x110 (unreliable)
  [ 1372.711939] [c000003fd2b973c0] [c0000000001aa8bc] rcu_dump_cpu_stacks+0xd4/0x138
  [ 1372.711942] [c000003fd2b97410] [c0000000001a9988] rcu_check_callbacks+0x8e8/0xb40
  [ 1372.711945] [c000003fd2b97540] [c0000000001b7c28] update_process_times+0x48/0x90
  [ 1372.711948] [c000003fd2b97570] [c0000000001cf974] tick_sched_handle.isra.5+0x34/0xd0
  [ 1372.711950] [c000003fd2b975a0] [c0000000001cfa70] tick_sched_timer+0x60/0xe0
  [ 1372.711953] [c000003fd2b975e0] [c0000000001b87d4] __hrtimer_run_queues+0x144/0x370
  [ 1372.711956] [c000003fd2b97660] [c0000000001b972c] hrtimer_interrupt+0xfc/0x350
  [ 1372.711959] [c000003fd2b97730] [c0000000000249f0] __timer_interrupt+0x90/0x260
  [ 1372.711962] [c000003fd2b97780] [c000000000024e08] timer_interrupt+0x98/0xe0
  [ 1372.711965] [c000003fd2b977b0] [c000000000009054] decrementer_common+0x114/0x120
  [ 1372.711970] --- interrupt: 901 at opal_get_rtc_time+0x98/0x110
                     LR = opal_return+0x14/0x48
  [ 1372.711972] [c000003fd2b97aa0] [c000000000a457b8] opal_get_rtc_time+0x98/0x110 (unreliable)
  [ 1372.711975] [c000003fd2b97ae0] [c000000000a3f98c] __rtc_read_time+0x7c/0x180
  [ 1372.711977] [c000003fd2b97b60] [c000000000a41738] rtc_timer_do_work+0x78/0x250
  [ 1372.711980] [c000003fd2b97c90] [c000000000134378] process_one_work+0x298/0x5a0
  [ 1372.711982] [c000003fd2b97d20] [c000000000134718] worker_thread+0x98/0x630
  [ 1372.711985] [c000003fd2b97dc0] [c00000000013d348] kthread+0x1a8/0x1b0
  [ 1372.711988] [c000003fd2b97e30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84

  == Comment: #1 - PAVAMAN SUBRAMANIYAM <pavsu...@in.ibm.com> - 2018-05-23 01:31:06 ==

  == Comment: #2 - Application Cdeadmin <cdead...@us.ibm.com> - 2018-05-23 01:33:40 ==
  cde00 (cdead...@us.ibm.com) added native attachment /tmp/AIXOS07311082/dmesg.txt on 2018-05-23 01:33:33

  == Comment: #3 - Application Cdeadmin <cdead...@us.ibm.com> - 2018-05-24 16:45:41 ==
  ==== State: Open by: jayeshp on 24 May 2018 16:42:57 ====

  #=#=# 2018-05-24 16:42:54 (CDT) #=#=#
  New Fix_Potential = [P920.10W]
  #=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#

  == Comment: #4 - Stewart Smith <sesm...@au1.ibm.com> - 2018-05-30 21:15:15 ==
  This'll be a missing backport of some kernel fixes in the RTC driver.

  It's at least this commit:
  commit 682e6b4da5cbe8e9a53f979a58c2a9d7dc997175
  Author: Nicholas Piggin <npig...@gmail.com>
  Date:   Tue Apr 10 21:49:32 2018 +1000

      rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops

      The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
      OPAL_BUSY_EVENT from firmware, which causes large scheduling
      latencies, up to 50 seconds have been observed here when RTC stops
      responding (BMC reboot can do it).

      Fix this by converting it to the standard form OPAL_BUSY loop that
      sleeps.

      Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
      Cc: sta...@vger.kernel.org # v3.2+
      Signed-off-by: Nicholas Piggin <npig...@gmail.com>
      Acked-by: Alexandre Belloni <alexandre.bell...@bootlin.com>
      Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
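
  For readers unfamiliar with the pattern, the "OPAL_BUSY loop that
  sleeps" named in the commit message above can be pictured with a small
  userspace model. This is only a sketch in Python, not the kernel
  change (which is C and is not reproduced here). The return-code
  values, the 10 ms delay, and the names read_rtc_sleeping and
  make_fake_firmware are all assumptions made up for illustration. The
  point it shows is that each busy answer is followed by a sleep, so the
  caller yields the CPU instead of spinning on the firmware call.

```python
import time

# Hypothetical stand-ins for the firmware return codes. The numeric
# values are illustrative only and are not taken from the kernel headers.
OPAL_SUCCESS = 0
OPAL_BUSY = 1
OPAL_BUSY_EVENT = 2

BUSY_DELAY_S = 0.01  # assumed back-off between retries (10 ms)


def read_rtc_sleeping(firmware_call, poll_events, sleep=time.sleep):
    """Retry firmware_call until it stops reporting busy.

    Sleeps after every busy answer so other work can run, and polls for
    firmware events when the answer was OPAL_BUSY_EVENT.
    Returns (final_return_code, number_of_sleeps).
    """
    sleeps = 0
    rc = OPAL_BUSY  # force at least one attempt
    while rc in (OPAL_BUSY, OPAL_BUSY_EVENT):
        rc = firmware_call()
        if rc == OPAL_BUSY_EVENT:
            sleep(BUSY_DELAY_S)
            sleeps += 1
            poll_events()  # let the firmware make progress
        elif rc == OPAL_BUSY:
            sleep(BUSY_DELAY_S)
            sleeps += 1
    return rc, sleeps


def make_fake_firmware(responses):
    """Build a fake firmware call that replays the given return codes."""
    replay = iter(responses)
    return lambda: next(replay)
```

  In this model a firmware that answers busy twice before succeeding
  costs two short sleeps rather than an uninterrupted spin, which is the
  behaviour the stall trace above suggests was missing in
  opal_get_rtc_time.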