------- Comment From anjutsudha...@in.ibm.com 2018-05-02 05:19 EDT-------
(In reply to comment #39)
> Verified with test kernel : Issue is resolved.
> ===============
>
> But hitting another trace :
>
> [ 294.764782] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 315.685952] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 317.385747] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 335.030061] hrtimer: interrupt took 1494725987 ns
> [ 386.576484] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 403.964295] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 414.884012] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 431.700329] perf: Dynamic interrupt throttling disabled, can hang your system!
> [ 471.108095] INFO: rcu_sched self-detected stall on CPU
> [ 471.108214] 116-....: (5250 ticks this GP) idle=c9a/140000000000002/0 softirq=6343/6344 fqs=2625
> [ 471.108351] (t=5251 jiffies g=8835 c=8834 q=1160)
> [ 471.108508] Task dump for CPU 116:
> [ 471.108518] perf_fuzzer R running task 0 5428 5267 0x0004a006
> [ 471.108549] Call Trace:
> [ 471.108582] [c0002038e74231c0] [c000000000149e98] sched_show_task.part.16+0xd8/0x110 (unreliable)
> [ 471.108627] [c0002038e7423230] [c0000000001a9e5c] rcu_dump_cpu_stacks+0xd4/0x138
> [ 471.108664] [c0002038e7423280] [c0000000001a8f28] rcu_check_callbacks+0x8e8/0xb40
> [ 471.108698] [c0002038e74233b0] [c0000000001b71c8] update_process_times+0x48/0x90
> [ 471.108731] [c0002038e74233e0] [c0000000001cef14] tick_sched_handle.isra.5+0x34/0xd0
> [ 471.108760] [c0002038e7423410] [c0000000001cf010] tick_sched_timer+0x60/0xe0
> [ 471.108795] [c0002038e7423450] [c0000000001b7d74] __hrtimer_run_queues+0x144/0x370
> [ 471.108830] [c0002038e74234d0] [c0000000001b8ccc] hrtimer_interrupt+0xfc/0x350
> [ 471.108867] [c0002038e74235a0] [c0000000000248f0] __timer_interrupt+0x90/0x260
> [ 471.108903] [c0002038e74235f0] [c000000000024d08] timer_interrupt+0x98/0xe0
> [ 471.108943] [c0002038e7423620] [c00000000000b998] fast_exception_return+0x148/0x16c
> [ 471.108990] --- interrupt: 901 at arch_local_irq_restore+0x84/0x90
>     LR = __do_softirq+0xd8/0x3e4
> [ 471.109017] [c0002038e7423910] [c0000000001b8d60] hrtimer_interrupt+0x190/0x350 (unreliable)
> [ 471.109054] [c0002038e7423930] [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [ 471.109089] [c0002038e7423a10] [c000000000115928] irq_exit+0xe8/0x120
> [ 471.109124] [c0002038e7423a30] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> [ 471.109164] [c0002038e7423a60] [c00000000000b998] fast_exception_return+0x148/0x16c
> [ 471.109211] --- interrupt: 901 at mutex_unlock+0x18/0x50
>     LR = perf_event_for_each_child+0xb0/0xf0
> [ 471.109236] [c0002038e7423d50] [c0000000002b9e70] perf_event_for_each_child+0x60/0xf0 (unreliable)
> [ 471.109279] [c0002038e7423d90] [c0000000002c4da8] perf_event_task_enable+0x78/0xe0
> [ 471.109309] [c0002038e7423dd0] [c00000000012d4e4] SyS_prctl+0x364/0x6a0
> [ 471.109345] [c0002038e7423e30] [c00000000000b184] system_call+0x58/0x6c
> [ 477.935937] watchdog: BUG: soft lockup - CPU#116 stuck for 23s! [perf_fuzzer:5428]
> [ 477.936042] Modules linked in: xt_CHECKSUM(E) iptable_mangle(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) kvm_hv(E) kvm(E) at24(E) ofpart(E) ipmi_powernv(E) ipmi_devintf(E) ipmi_msghandler(E) uio_pdrv_genirq(E) uio(E) cmdlinepart(E) powernv_flash(E) mtd(E) ibmpowernv(E) opal_prd(E) vmx_crypto(E) sch_fq_codel(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) ip_tables(E) x_tables(E) autofs4(E) btrfs(E) zstd_compress(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E)
> [ 477.936571] async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) raid1(E) raid0(E) multipath(E) linear(E) ast(E) mlx5_core(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ttm(E) crct10dif_vpmsum(E) ahci(E) mlxfw(E) crc32c_vpmsum(E) drm(E) tg3(E) libahci(E) devlink(E)
> [ 477.936833] CPU: 116 PID: 5428 Comm: perf_fuzzer Tainted: G E 4.15.0-20-generic #21
> [ 477.936850] NIP: c000000000016e84 LR: c000000000cffbc8 CTR: c000000000024480
> [ 477.936870] REGS: c0002038e7423690 TRAP: 0901 Tainted: G E (4.15.0-20-generic)
> [ 477.936879] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48000244 XER: 20040000
> [ 477.936970] CFAR: c000000000016e30 SOFTE: 1
> GPR00: c000000000cffbc8 c0002038e7423910 c0000000016eae00 0000000000000001
> GPR04: c000203994800400 0000000000000000 0000000001f3f3f9 c0002038e7388a00
> GPR08: 0000203993650000 b000000000001033 0000000000000008 0000000000000005
> GPR12: c000000000024480 c000000007a6fc00
> [ 477.937161] NIP [c000000000016e84] arch_local_irq_restore+0x84/0x90
> [ 477.937183] LR [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [ 477.937191] Call Trace:
> [ 477.937221] [c0002038e7423910] [c0000000001b8d60] hrtimer_interrupt+0x190/0x350 (unreliable)
> [ 477.937259] [c0002038e7423930] [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [ 477.937295] [c0002038e7423a10] [c000000000115928] irq_exit+0xe8/0x120
> [ 477.937331] [c0002038e7423a30] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> [ 477.937370] [c0002038e7423a60] [c00000000000b998] fast_exception_return+0x148/0x16c
> [ 477.937416] --- interrupt: 901 at mutex_unlock+0x18/0x50
>     LR = perf_event_for_each_child+0xb0/0xf0
> [ 477.937440] [c0002038e7423d50] [c0000000002b9e70] perf_event_for_each_child+0x60/0xf0 (unreliable)
> [ 477.937482] [c0002038e7423d90] [c0000000002c4da8] perf_event_task_enable+0x78/0xe0
> [ 477.937512] [c0002038e7423dd0] [c00000000012d4e4] SyS_prctl+0x364/0x6a0
> [ 477.937548] [c0002038e7423e30] [c00000000000b184] system_call+0x58/0x6c
> [ 477.937564] Instruction dump:
> [ 477.937585] 61298000 7d210164 38210020 e8010010 7c0803a6 4e800020 60420000 4bff3885
> [ 477.937662] 60000000 4bffffe4 60420000 e92d0020 <7d210164> 4bffffac 60420000 3c4c016d
> [ 529.938748] watchdog: BUG: soft lockup - CPU#77 stuck for 22s! [systemd-udevd:1476]
> [ 529.938821] Modules linked in: xt_CHECKSUM(E) iptable_mangle(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) kvm_hv(E) kvm(E) at24(E) ofpart(E) ipmi_powernv(E) ipmi_devintf(E) ipmi_msghandler(E) uio_pdrv_genirq(E) uio(E) cmdlinepart(E) powernv_flash(E) mtd(E) ibmpowernv(E) opal_prd(E) vmx_crypto(E) sch_fq_codel(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) ip_tables(E) x_tables(E) autofs4(E) btrfs(E) zstd_compress(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E)
> [ 529.938868] async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) raid1(E) raid0(E) multipath(E) linear(E) ast(E) mlx5_core(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ttm(E) crct10dif_vpmsum(E) ahci(E) mlxfw(E) crc32c_vpmsum(E) drm(E) tg3(E) libahci(E) devlink(E)
> [ 529.938890] CPU: 77 PID: 1476 Comm: systemd-udevd Tainted: G EL 4.15.0-20-generic #21
> [ 529.938893] NIP: c0000000001d6784 LR: c0000000001d672c CTR: c000000000044510
> [ 529.938895] REGS: c00020397068f840 TRAP: 0901 Tainted: G EL (4.15.0-20-generic)
> [ 529.938896] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 44044824 XER: 00000000
> [ 529.938906] CFAR: c0000000001d6790 SOFTE: 1
> GPR00: c0000000001d672c c00020397068fac0 c0000000016eae00 0000000000000074
> GPR04: 0000000000000074 0000000000000074 0000000000000000 c00000000171dd78
> GPR08: 0000000000000074 0000000000000001 c00020399482c580 c000203993e69800
> GPR12: c000000000044510 c000000007a54f00
> [ 529.938922] NIP [c0000000001d6784] smp_call_function_many+0x3b4/0x450
> [ 529.938924] LR [c0000000001d672c] smp_call_function_many+0x35c/0x450
> [ 529.938925] Call Trace:
> [ 529.938927] [c00020397068fac0] [c0000000001d672c] smp_call_function_many+0x35c/0x450 (unreliable)
> [ 529.938932] [c00020397068fb30] [c000000000076108] serialize_against_pte_lookup+0x38/0x50
> [ 529.938936] [c00020397068fb50] [c000000000078030] radix__pmdp_huge_get_and_clear+0x60/0x80
> [ 529.938939] [c00020397068fb80] [c00000000034e91c] pmdp_huge_clear_flush+0x3c/0xb0
> [ 529.938944] [c00020397068fbc0] [c0000000003a2970] do_huge_pmd_wp_page+0x670/0x1100
> [ 529.938946] [c00020397068fc40] [c00000000033ea98] __handle_mm_fault+0xb98/0xe10
> [ 529.938949] [c00020397068fd20] [c00000000033ee38] handle_mm_fault+0x128/0x210
> [ 529.938951] [c00020397068fd60] [c00000000006a24c] __do_page_fault+0x21c/0xa70
> [ 529.938955] [c00020397068fe30] [c00000000000a544] handle_page_fault+0x18/0x38
> [ 529.938955] Instruction dump:
> [ 529.938957] 7d4a4a14 812a0018 71290001 41820034 4800001c 60000000 60000000 60000000
> [ 529.938962] 60000000 60000000 60420000 7c210b78 <7c421378> 812a0018 71290001 4082fff0
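
For picking the stall and lockup reports out of a long capture like the one quoted above, a small filter may help. This is only a sketch: the function name filter_stalls and the file name dmesg.txt are made up for illustration and are not part of the bug report.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not from the report):
# read a kernel log on stdin and print only the lines that report an
# RCU stall, a soft lockup, or a slow hrtimer interrupt.
filter_stalls() {
    grep -E 'self-detected stall|soft lockup|hrtimer: interrupt took'
}

# Example use against a saved capture (dmesg.txt is an assumed file name):
#   dmesg > dmesg.txt
#   filter_stalls < dmesg.txt
```

The three patterns are taken from the messages in the quoted log; other kernels may word these reports differently, so the list would likely need adjusting.
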

Hi Shriya,

So the call trace is not a case of
https://bugzilla.linux.ibm.com/show_bug.cgi?id=162160 ?

And I hope the counts for thread-imc are also verified.

Thanks,
Anju

** Bug watch added: bugzilla.linux.ibm.com/ #162160
   https://bugzilla.linux.ibm.com/show_bug.cgi?id=162160

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1752002

Title:
  [P9,POwer NV][WSP][DD2.1][Ubuntu 1804][Perf fuzzer] : Call trace is
  seen while running perf fuzzer (perf:)

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged

Bug description:
  == Comment: #0 - Shriya R. Kulkarni <shriy...@in.ibm.com> - 2018-02-02 01:21:36 ==

  Problem Description :
  =============
  A WARN_ON message is seen while running the perf fuzzer tests.

  Machine details :
  ==========
  Hardware : Witherspoon (wsp12) + DD2.1
  OS : Ubuntu 1804
  uname -a : 4.13.0-32-generic #35~lp1746225
  ( Kernel from the bug : https://bugzilla.linux.ibm.com/show_bug.cgi?id=164107#c7 )

  Steps to reproduce :
  ============
  Build Kernel :
  --------------------
  To avoid the kernel crash caused by the perf fuzzer, use the kernel mentioned in the link :
  https://bugzilla.linux.ibm.com/show_bug.cgi?id=164107#c7

  #!/bin/bash
  set -x
  # Fetch the perf_event_tests sources and the perf_regs.h header they need
  git clone https://github.com/deater/perf_event_tests.git
  cd perf_event_tests/include
  mkdir asm
  cd asm
  wget http://9.114.13.132/repo/shriya/perf_regs.h
  # Build the support library, then the fuzzer
  cd ../../lib
  make
  sleep 10
  cd ../fuzzer
  make
  sleep 10
  # Disable the NMI watchdog, set the perf paranoid level and cap the sample rate
  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 2 > /proc/sys/kernel/perf_event_paranoid
  echo 100000 > /proc/sys/kernel/perf_event_max_sample_rate
  ./perf_fuzzer -r 1492143527

  Call trace :
  =======
  [ 329.228031] ------------[ cut here ]------------
  [ 329.228039] WARNING: CPU: 43 PID: 9088 at /home/jsalisbury/bugs/lp1746225/ubuntu-artful/kernel/events/core.c:3038 perf_pmu_sched_task+0x170/0x180
  [ 329.228040] Modules linked in: ofpart at24 uio_pdrv_genirq uio cmdlinepart powernv_flash mtd ipmi_powernv vmx_crypto ipmi_devintf ipmi_msghandler ibmpowernv opal_prd crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 crc32c_vpmsum lpfc ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt mlx5_core fb_sys_fops ttm tg3 nvmet_fc drm ahci nvmet nvme_fc libahci nvme_fabrics mlxfw nvme_core devlink scsi_transport_fc
  [ 329.228068] CPU: 43 PID: 9088 Comm: perf_fuzzer Not tainted 4.13.0-32-generic #35~lp1746225
  [ 329.228070] task: c000003f776ac900 task.stack: c000003f77728000
  [ 329.228071] NIP: c000000000299b70 LR: c0000000002a4534 CTR: c00000000029bb80
  [ 329.228073] REGS: c000003f7772b760 TRAP: 0700 Not tainted (4.13.0-32-generic)
  [ 329.228073] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
  [ 329.228079] CR: 24008822 XER: 00000000
  [ 329.228080] CFAR: c000000000299a70 SOFTE: 0
  GPR00: c0000000002a4534 c000003f7772b9e0 c000000001606200 c000003fef858908
  GPR04: c000003f776ac900 0000000000000001 ffffffffffffffff 0000003fee730000
  GPR08: 0000000000000000 0000000000000000 c0000000011220d8 0000000000000002
  GPR12: c00000000029bb80 c000000007a3d900 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 c000003f776ad090 c000000000c71354
  GPR24: c000003fef716780 0000003fee730000 c000003fe69d4200 c000003f776ad330
  GPR28: c0000000011220d8 0000000000000001 c0000000014c6108 c000003fef858900
  [ 329.228098] NIP [c000000000299b70] perf_pmu_sched_task+0x170/0x180
  [ 329.228100] LR [c0000000002a4534] __perf_event_task_sched_in+0xc4/0x230
  [ 329.228101] Call Trace:
  [ 329.228102] [c000003f7772b9e0] [c0000000002a0678] perf_iterate_sb+0x158/0x2a0 (unreliable)
  [ 329.228105] [c000003f7772ba30] [c0000000002a4534] __perf_event_task_sched_in+0xc4/0x230
  [ 329.228107] [c000003f7772bab0] [c0000000001396dc] finish_task_switch+0x21c/0x310
  [ 329.228109] [c000003f7772bb60] [c000000000c71354] __schedule+0x304/0xb80
  [ 329.228111] [c000003f7772bc40] [c000000000c71c10] schedule+0x40/0xc0
  [ 329.228113] [c000003f7772bc60] [c0000000001033f4] do_wait+0x254/0x2e0
  [ 329.228115] [c000003f7772bcd0] [c000000000104ac0] kernel_wait4+0xa0/0x1a0
  [ 329.228117] [c000003f7772bd70] [c000000000104c24] SyS_wait4+0x64/0xc0
  [ 329.228121] [c000003f7772be30] [c00000000000b184] system_call+0x58/0x6c
  [ 329.228121] Instruction dump:
  [ 329.228123] 3beafea0 7faa4800 409eff18 e8010060 eb610028 ebc10040 7c0803a6 38210050
  [ 329.228127] eb81ffe0 eba1ffe8 ebe1fff8 4e800020 <0fe00000> 4bffffbc 60000000 60420000
  [ 329.228131] ---[ end trace 8c46856d314c1811 ]---
  [ 375.755943] hrtimer: interrupt took 31601 ns

  == Comment: #4 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-05 06:34:09 ==

  == Comment: #5 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-05 06:36:12 ==
  We have a similar issue reported on a different distro, where Anju provided the patch. Patch attached above.
  Will check with her whether that patch was accepted upstream.

  == Comment: #14 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-23 01:52:50 ==

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1752002/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp