------- Comment From anjutsudha...@in.ibm.com 2018-05-02 05:19 EDT-------
(In reply to comment #39)
> Verified with test kernel : Issue is resolved.
> ===============
>
>
>
> But hitting another trace :
>
> [  294.764782] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  315.685952] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  317.385747] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  335.030061] hrtimer: interrupt took 1494725987 ns
> [  386.576484] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  403.964295] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  414.884012] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  431.700329] perf: Dynamic interrupt throttling disabled, can hang your
> system!
> [  471.108095] INFO: rcu_sched self-detected stall on CPU
> [  471.108214]        116-....: (5250 ticks this GP) 
> idle=c9a/140000000000002/0
> softirq=6343/6344 fqs=2625
> [  471.108351]         (t=5251 jiffies g=8835 c=8834 q=1160)
> [  471.108508] Task dump for CPU 116:
> [  471.108518] perf_fuzzer     R  running task        0  5428   5267
> 0x0004a006
> [  471.108549] Call Trace:
> [  471.108582] [c0002038e74231c0] [c000000000149e98]
> sched_show_task.part.16+0xd8/0x110 (unreliable)
> [  471.108627] [c0002038e7423230] [c0000000001a9e5c]
> rcu_dump_cpu_stacks+0xd4/0x138
> [  471.108664] [c0002038e7423280] [c0000000001a8f28]
> rcu_check_callbacks+0x8e8/0xb40
> [  471.108698] [c0002038e74233b0] [c0000000001b71c8]
> update_process_times+0x48/0x90
> [  471.108731] [c0002038e74233e0] [c0000000001cef14]
> tick_sched_handle.isra.5+0x34/0xd0
> [  471.108760] [c0002038e7423410] [c0000000001cf010]
> tick_sched_timer+0x60/0xe0
> [  471.108795] [c0002038e7423450] [c0000000001b7d74]
> __hrtimer_run_queues+0x144/0x370
> [  471.108830] [c0002038e74234d0] [c0000000001b8ccc]
> hrtimer_interrupt+0xfc/0x350
> [  471.108867] [c0002038e74235a0] [c0000000000248f0]
> __timer_interrupt+0x90/0x260
> [  471.108903] [c0002038e74235f0] [c000000000024d08]
> timer_interrupt+0x98/0xe0
> [  471.108943] [c0002038e7423620] [c00000000000b998]
> fast_exception_return+0x148/0x16c
> [  471.108990] --- interrupt: 901 at arch_local_irq_restore+0x84/0x90
>                    LR = __do_softirq+0xd8/0x3e4
> [  471.109017] [c0002038e7423910] [c0000000001b8d60]
> hrtimer_interrupt+0x190/0x350 (unreliable)
> [  471.109054] [c0002038e7423930] [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [  471.109089] [c0002038e7423a10] [c000000000115928] irq_exit+0xe8/0x120
> [  471.109124] [c0002038e7423a30] [c000000000024d0c]
> timer_interrupt+0x9c/0xe0
> [  471.109164] [c0002038e7423a60] [c00000000000b998]
> fast_exception_return+0x148/0x16c
> [  471.109211] --- interrupt: 901 at mutex_unlock+0x18/0x50
>                    LR = perf_event_for_each_child+0xb0/0xf0
> [  471.109236] [c0002038e7423d50] [c0000000002b9e70]
> perf_event_for_each_child+0x60/0xf0 (unreliable)
> [  471.109279] [c0002038e7423d90] [c0000000002c4da8]
> perf_event_task_enable+0x78/0xe0
> [  471.109309] [c0002038e7423dd0] [c00000000012d4e4] SyS_prctl+0x364/0x6a0
> [  471.109345] [c0002038e7423e30] [c00000000000b184] system_call+0x58/0x6c
> [  477.935937] watchdog: BUG: soft lockup - CPU#116 stuck for 23s!
> [perf_fuzzer:5428]
> [  477.936042] Modules linked in: xt_CHECKSUM(E) iptable_mangle(E)
> ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E)
> nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E)
> nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) bridge(E)
> stp(E) llc(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E)
> iptable_filter(E) kvm_hv(E) kvm(E) at24(E) ofpart(E) ipmi_powernv(E)
> ipmi_devintf(E) ipmi_msghandler(E) uio_pdrv_genirq(E) uio(E) cmdlinepart(E)
> powernv_flash(E) mtd(E) ibmpowernv(E) opal_prd(E) vmx_crypto(E)
> sch_fq_codel(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E)
> iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E)
> ip_tables(E) x_tables(E) autofs4(E) btrfs(E) zstd_compress(E) raid10(E)
> raid456(E) async_raid6_recov(E) async_memcpy(E)
> [  477.936571]  async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E)
> libcrc32c(E) raid1(E) raid0(E) multipath(E) linear(E) ast(E) mlx5_core(E)
> i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
> fb_sys_fops(E) ttm(E) crct10dif_vpmsum(E) ahci(E) mlxfw(E) crc32c_vpmsum(E)
> drm(E) tg3(E) libahci(E) devlink(E)
> [  477.936833] CPU: 116 PID: 5428 Comm: perf_fuzzer Tainted: G            E
> 4.15.0-20-generic #21
> [  477.936850] NIP:  c000000000016e84 LR: c000000000cffbc8 CTR:
> c000000000024480
> [  477.936870] REGS: c0002038e7423690 TRAP: 0901   Tainted: G            E
> (4.15.0-20-generic)
> [  477.936879] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:
> 48000244  XER: 20040000
> [  477.936970] CFAR: c000000000016e30 SOFTE: 1
>                GPR00: c000000000cffbc8 c0002038e7423910 c0000000016eae00
> 0000000000000001
>                GPR04: c000203994800400 0000000000000000 0000000001f3f3f9
> c0002038e7388a00
>                GPR08: 0000203993650000 b000000000001033 0000000000000008
> 0000000000000005
>                GPR12: c000000000024480 c000000007a6fc00
> [  477.937161] NIP [c000000000016e84] arch_local_irq_restore+0x84/0x90
> [  477.937183] LR [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [  477.937191] Call Trace:
> [  477.937221] [c0002038e7423910] [c0000000001b8d60]
> hrtimer_interrupt+0x190/0x350 (unreliable)
> [  477.937259] [c0002038e7423930] [c000000000cffbc8] __do_softirq+0xd8/0x3e4
> [  477.937295] [c0002038e7423a10] [c000000000115928] irq_exit+0xe8/0x120
> [  477.937331] [c0002038e7423a30] [c000000000024d0c]
> timer_interrupt+0x9c/0xe0
> [  477.937370] [c0002038e7423a60] [c00000000000b998]
> fast_exception_return+0x148/0x16c
> [  477.937416] --- interrupt: 901 at mutex_unlock+0x18/0x50
>                    LR = perf_event_for_each_child+0xb0/0xf0
> [  477.937440] [c0002038e7423d50] [c0000000002b9e70]
> perf_event_for_each_child+0x60/0xf0 (unreliable)
> [  477.937482] [c0002038e7423d90] [c0000000002c4da8]
> perf_event_task_enable+0x78/0xe0
> [  477.937512] [c0002038e7423dd0] [c00000000012d4e4] SyS_prctl+0x364/0x6a0
> [  477.937548] [c0002038e7423e30] [c00000000000b184] system_call+0x58/0x6c
> [  477.937564] Instruction dump:
> [  477.937585] 61298000 7d210164 38210020 e8010010 7c0803a6 4e800020
> 60420000 4bff3885
> [  477.937662] 60000000 4bffffe4 60420000 e92d0020 <7d210164> 4bffffac
> 60420000 3c4c016d
> [  529.938748] watchdog: BUG: soft lockup - CPU#77 stuck for 22s!
> [systemd-udevd:1476]
> [  529.938821] Modules linked in: xt_CHECKSUM(E) iptable_mangle(E)
> ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E)
> nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E)
> nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) bridge(E)
> stp(E) llc(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E)
> iptable_filter(E) kvm_hv(E) kvm(E) at24(E) ofpart(E) ipmi_powernv(E)
> ipmi_devintf(E) ipmi_msghandler(E) uio_pdrv_genirq(E) uio(E) cmdlinepart(E)
> powernv_flash(E) mtd(E) ibmpowernv(E) opal_prd(E) vmx_crypto(E)
> sch_fq_codel(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E)
> iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E)
> ip_tables(E) x_tables(E) autofs4(E) btrfs(E) zstd_compress(E) raid10(E)
> raid456(E) async_raid6_recov(E) async_memcpy(E)
> [  529.938868]  async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E)
> libcrc32c(E) raid1(E) raid0(E) multipath(E) linear(E) ast(E) mlx5_core(E)
> i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
> fb_sys_fops(E) ttm(E) crct10dif_vpmsum(E) ahci(E) mlxfw(E) crc32c_vpmsum(E)
> drm(E) tg3(E) libahci(E) devlink(E)
> [  529.938890] CPU: 77 PID: 1476 Comm: systemd-udevd Tainted: G
> EL   4.15.0-20-generic #21
> [  529.938893] NIP:  c0000000001d6784 LR: c0000000001d672c CTR:
> c000000000044510
> [  529.938895] REGS: c00020397068f840 TRAP: 0901   Tainted: G            EL
> (4.15.0-20-generic)
> [  529.938896] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR:
> 44044824  XER: 00000000
> [  529.938906] CFAR: c0000000001d6790 SOFTE: 1
>                GPR00: c0000000001d672c c00020397068fac0 c0000000016eae00
> 0000000000000074
>                GPR04: 0000000000000074 0000000000000074 0000000000000000
> c00000000171dd78
>                GPR08: 0000000000000074 0000000000000001 c00020399482c580
> c000203993e69800
>                GPR12: c000000000044510 c000000007a54f00
> [  529.938922] NIP [c0000000001d6784] smp_call_function_many+0x3b4/0x450
> [  529.938924] LR [c0000000001d672c] smp_call_function_many+0x35c/0x450
> [  529.938925] Call Trace:
> [  529.938927] [c00020397068fac0] [c0000000001d672c]
> smp_call_function_many+0x35c/0x450 (unreliable)
> [  529.938932] [c00020397068fb30] [c000000000076108]
> serialize_against_pte_lookup+0x38/0x50
> [  529.938936] [c00020397068fb50] [c000000000078030]
> radix__pmdp_huge_get_and_clear+0x60/0x80
> [  529.938939] [c00020397068fb80] [c00000000034e91c]
> pmdp_huge_clear_flush+0x3c/0xb0
> [  529.938944] [c00020397068fbc0] [c0000000003a2970]
> do_huge_pmd_wp_page+0x670/0x1100
> [  529.938946] [c00020397068fc40] [c00000000033ea98]
> __handle_mm_fault+0xb98/0xe10
> [  529.938949] [c00020397068fd20] [c00000000033ee38]
> handle_mm_fault+0x128/0x210
> [  529.938951] [c00020397068fd60] [c00000000006a24c]
> __do_page_fault+0x21c/0xa70
> [  529.938955] [c00020397068fe30] [c00000000000a544]
> handle_page_fault+0x18/0x38
> [  529.938955] Instruction dump:
> [  529.938957] 7d4a4a14 812a0018 71290001 41820034 4800001c 60000000
> 60000000 60000000
> [  529.938962] 60000000 60000000 60420000 7c210b78 <7c421378> 812a0018
> 71290001 4082fff0

Hi Shriya,

So the call trace is not seen in the case of
https://bugzilla.linux.ibm.com/show_bug.cgi?id=162160  ?

And I hope the counts for thread-imc are also verified.

Thanks,
Anju

** Bug watch added: bugzilla.linux.ibm.com/ #162160
   https://bugzilla.linux.ibm.com/show_bug.cgi?id=162160

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1752002

Title:
  [P9,POwer NV][WSP][DD2.1][Ubuntu 1804][Perf fuzzer] : Call trace is
  seen while running perf fuzzer (perf:)

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged

Bug description:
  == Comment: #0 - Shriya R. Kulkarni <shriy...@in.ibm.com> - 2018-02-02 
01:21:36 ==
  Problem Description :
  =============

  A WARN_ON message is seen while running the perf fuzzer tests.

  Machine details :
  ==========
  Hardware : Witherspoon (wsp12) + DD2.1
  OS : Ubuntu 1804
  uname -a : 4.13.0-32-generic #35~lp1746225 ( Kernel from the bug : 
https://bugzilla.linux.ibm.com/show_bug.cgi?id=164107#c7 )

  
  Steps to reproduce :
  ============
  Build Kernel :
  --------------------
  To avoid the kernel crash caused by the perf fuzzer, use the kernel mentioned at:
https://bugzilla.linux.ibm.com/show_bug.cgi?id=164107#c7

  #! /bin/bash
  set -x
  git clone https://github.com/deater/perf_event_tests.git
  cd perf_event_tests/include
  mkdir asm
  cd asm
  wget http://9.114.13.132/repo/shriya/perf_regs.h
  cd ../../lib
  make
  sleep 10
  cd ../fuzzer
  make
  sleep 10

  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 2 > /proc/sys/kernel/perf_event_paranoid
  echo 100000 > /proc/sys/kernel/perf_event_max_sample_rate
  ./perf_fuzzer -r 1492143527
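
  The three echo lines above change system-wide settings that stay in effect
  until they are rewritten or the machine reboots. A minimal sketch, assuming
  one wants to put the previous values back after the fuzzer run. The helper
  names save_knobs and restore_knobs and the PROC_ROOT variable are
  hypothetical and are not part of perf_event_tests:

```shell
#!/bin/bash
# Hypothetical helpers: remember the current values of the knobs that the
# reproduction script overwrites, and write them back afterwards.
# PROC_ROOT is an assumption made so the helpers can be pointed at a scratch
# directory for a dry run; it defaults to the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"
KNOBS="nmi_watchdog perf_event_paranoid perf_event_max_sample_rate"

save_knobs() {
    # $1: file that receives "name value" pairs, one per line
    local out="$1" k
    : > "$out"
    for k in $KNOBS; do
        # Skip knobs that do not exist on this kernel/architecture
        if [ -r "$PROC_ROOT/sys/kernel/$k" ]; then
            echo "$k $(cat "$PROC_ROOT/sys/kernel/$k")" >> "$out"
        fi
    done
}

restore_knobs() {
    # $1: file written earlier by save_knobs
    local name value
    while read -r name value; do
        echo "$value" > "$PROC_ROOT/sys/kernel/$name"
    done < "$1"
}
```

  Possible usage, as root: call "save_knobs /tmp/knobs.before" before the echo
  lines and "restore_knobs /tmp/knobs.before" once perf_fuzzer has exited.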

  
  Call trace :
  =======
  [  329.228031] ------------[ cut here ]------------
  [  329.228039] WARNING: CPU: 43 PID: 9088 at 
/home/jsalisbury/bugs/lp1746225/ubuntu-artful/kernel/events/core.c:3038 
perf_pmu_sched_task+0x170/0x180
  [  329.228040] Modules linked in: ofpart at24 uio_pdrv_genirq uio cmdlinepart 
powernv_flash mtd ipmi_powernv vmx_crypto ipmi_devintf ipmi_msghandler 
ibmpowernv opal_prd crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 
crc32c_vpmsum lpfc ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect 
sysimgblt mlx5_core fb_sys_fops ttm tg3 nvmet_fc drm ahci nvmet nvme_fc libahci 
nvme_fabrics mlxfw nvme_core devlink scsi_transport_fc
  [  329.228068] CPU: 43 PID: 9088 Comm: perf_fuzzer Not tainted 
4.13.0-32-generic #35~lp1746225
  [  329.228070] task: c000003f776ac900 task.stack: c000003f77728000
  [  329.228071] NIP: c000000000299b70 LR: c0000000002a4534 CTR: 
c00000000029bb80
  [  329.228073] REGS: c000003f7772b760 TRAP: 0700   Not tainted  
(4.13.0-32-generic)
  [  329.228073] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
  [  329.228079]   CR: 24008822  XER: 00000000
  [  329.228080] CFAR: c000000000299a70 SOFTE: 0 
                 GPR00: c0000000002a4534 c000003f7772b9e0 c000000001606200 
c000003fef858908 
                 GPR04: c000003f776ac900 0000000000000001 ffffffffffffffff 
0000003fee730000 
                 GPR08: 0000000000000000 0000000000000000 c0000000011220d8 
0000000000000002 
                 GPR12: c00000000029bb80 c000000007a3d900 0000000000000000 
0000000000000000 
                 GPR16: 0000000000000000 0000000000000000 0000000000000000 
0000000000000000 
                 GPR20: 0000000000000000 0000000000000000 c000003f776ad090 
c000000000c71354 
                 GPR24: c000003fef716780 0000003fee730000 c000003fe69d4200 
c000003f776ad330 
                 GPR28: c0000000011220d8 0000000000000001 c0000000014c6108 
c000003fef858900 
  [  329.228098] NIP [c000000000299b70] perf_pmu_sched_task+0x170/0x180
  [  329.228100] LR [c0000000002a4534] __perf_event_task_sched_in+0xc4/0x230
  [  329.228101] Call Trace:
  [  329.228102] [c000003f7772b9e0] [c0000000002a0678] 
perf_iterate_sb+0x158/0x2a0 (unreliable)
  [  329.228105] [c000003f7772ba30] [c0000000002a4534] 
__perf_event_task_sched_in+0xc4/0x230
  [  329.228107] [c000003f7772bab0] [c0000000001396dc] 
finish_task_switch+0x21c/0x310
  [  329.228109] [c000003f7772bb60] [c000000000c71354] __schedule+0x304/0xb80
  [  329.228111] [c000003f7772bc40] [c000000000c71c10] schedule+0x40/0xc0
  [  329.228113] [c000003f7772bc60] [c0000000001033f4] do_wait+0x254/0x2e0
  [  329.228115] [c000003f7772bcd0] [c000000000104ac0] kernel_wait4+0xa0/0x1a0
  [  329.228117] [c000003f7772bd70] [c000000000104c24] SyS_wait4+0x64/0xc0
  [  329.228121] [c000003f7772be30] [c00000000000b184] system_call+0x58/0x6c
  [  329.228121] Instruction dump:
  [  329.228123] 3beafea0 7faa4800 409eff18 e8010060 eb610028 ebc10040 7c0803a6 
38210050 
  [  329.228127] eb81ffe0 eba1ffe8 ebe1fff8 4e800020 <0fe00000> 4bffffbc 
60000000 60420000 
  [  329.228131] ---[ end trace 8c46856d314c1811 ]---
  [  375.755943] hrtimer: interrupt took 31601 ns
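
  Traces like the one above are long, and most of each one is register and
  module dumps. A minimal sketch, assuming the kernel log is available on
  stdin (for example from dmesg), that keeps only the headline lines of the
  kinds seen in this bug. The function name kernel_headlines is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print only the headline lines (WARNING, soft lockup,
# RCU self-detected stall) from a kernel log read on stdin.
kernel_headlines() {
    grep -E 'WARNING: CPU|BUG: soft lockup|self-detected stall'
}
```

  Possible usage: "dmesg | kernel_headlines". The patterns match only the
  message types quoted in this report and are not an exhaustive list of
  kernel error messages.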

  == Comment: #4 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-05
  06:34:09 ==

  
  == Comment: #5 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-05 
06:36:12 ==
  A similar issue was reported on a different distro, for which Anju provided a
patch (attached above).
  Will check with her whether that patch was accepted upstream.

  == Comment: #14 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2018-02-23
  01:52:50 ==

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1752002/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
