Hi Bowen Li any chance you can move to latest stable and check if you are still able to reproduce it?
Thank you Alfredo > On 30 Oct 2017, at 08:15, Bowen Li <[email protected]> wrote: > > Hi all, > Recently,I run bro cluster with PF_RING 6.4.1 as the package capturing > framework. However, I found a strange scene: sometimes when the cluster > restarts, the kernel module of PF_RING will block. The problem seems to > generate when operates /proc to apply read-write lock, causing kernel CPU > soft lockup and then the whole server is stuck. I want to know whether the > deadlock is caused by the resource competition when the cluster progress > applies read-write lock at the same time, or other reasons. > Here is the log in /var/log/message: > > Oct 22 01:26:10 TEST kernel: BUG: soft lockup - CPU#16 stuck for 22s! > [bro:17375] > Oct 22 01:26:10 TEST kernel: Modules linked in: binfmt_misc xt_CHECKSUM > iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 > nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT > tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables > iptable_filter intel_powerclamp coretemp intel_rapl kvm_intel iTCO_wdt kvm > mei_me iTCO_vendor_support mxm_wmi mei lpc_ich ipmi_ssif sb_edac crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper > edac_core ipmi_si cryptd ioTESTma ipmi_msghandler shpchp wmi sg dca i2c_i801 > pcspkr mfd_core nfsd auth_rpcgss nfs_acl lockd grace pf_ring(OE) sunrpc > ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 > syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm > crct10dif_pclmul crct10dif_common > Oct 22 01:26:10 TEST kernel: crc32c_intel drm isci e1000e libsas ahci > scsi_transport_sas libahci ptp i2c_core libata pps_core ntb > Oct 22 01:26:10 TEST kernel: CPU: 16 PID: 17375 Comm: bro Tainted: G > OE ------------ 3.10.0-327.13.1.el7.x86_64 #1 > Oct 22 01:26:10 TEST kernel: Hardware name: Supermicro > X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015 > Oct 22 01:26:10 TEST kernel: task: ffff880fdad73980 ti: ffff8800287b8000 > task.ti: ffff8800287b8000 > Oct 22 01:26:10 TEST kernel: RIP: 0010:[<ffffffff81301909>] > [<ffffffff81301909>] __write_lock_failed+0x9/0x20 > Oct 22 01:26:10 TEST kernel: RSP: 0018:ffff8800287bbe88 EFLAGS: 00000297 > Oct 22 01:26:10 TEST kernel: RAX: 0000000000000000 RBX: 0000000092e79345 RCX: > 0000000000000000 > Oct 22 01:26:10 TEST kernel: RDX: 0000000000000000 RSI: ffff88104aabb000 RDI: > ffffffffa03c6324 > Oct 22 01:26:10 TEST kernel: RBP: ffff8800287bbe88 R08: 0000000000017600 R09: > 0000000000000080 > Oct 22 01:26:10 TEST kernel: R10: 0000000000000000 R11: 000000000000001b R12: > 00007f21f432e000 > Oct 22 01:26:10 TEST kernel: R13: 0000000000000032 R14: ffff8800000000a8 R15: > 0000000000000000 > Oct 22 01:26:10 TEST kernel: FS: 00007f21f4e56880(0000) > GS:ffff88085fd00000(0000) knlGS:0000000000000000 > Oct 22 01:26:10 TEST kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Oct 22 01:26:10 TEST kernel: CR2: 00007f21f432eee0 CR3: 0000000648323000 CR4: > 00000000000407e0 > Oct 22 01:26:10 TEST kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > Oct 22 01:26:10 TEST kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > Oct 22 01:26:10 TEST kernel: Stack: > Oct 22 01:26:10 TEST kernel: ffff8800287bbe98 ffffffff8163ce37 > ffff8800287bbeb8 ffffffffa03afe1b > Oct 22 01:26:10 TEST kernel: ffff88104aabb000 ffff88101a631800 > ffff8800287bbee8 ffffffffa03b6903 > Oct 22 01:26:10 TEST kernel: 0000000000000003 0000000000000300 > 0000000000000000 ffffffff81a25e00 > Oct 22 01:26:10 TEST kernel: Call Trace: > Oct 22 01:26:10 TEST kernel: [<ffffffff8163ce37>] _raw_write_lock+0x17/0x20 > Oct 22 01:26:10 TEST kernel: [<ffffffffa03afe1b>] ring_proc_add+0x1b/0xb0 > [pf_ring] > Oct 22 01:26:10 TEST kernel: [<ffffffffa03b6903>] ring_create+0x2e3/0x4a0 > [pf_ring] > Oct 22 01:26:10 TEST kernel: [<ffffffff815103e0>] __sock_create+0x110/0x260 > Oct 22 01:26:10 TEST kernel: [<ffffffff81511ab1>] SyS_socket+0x61/0xf0 > Oct 22 01:26:10 TEST kernel: [<ffffffff8163d948>] ? page_fault+0x28/0x30 > Oct 22 01:26:10 TEST kernel: [<ffffffff81645ec9>] > system_call_fastpath+0x16/0x1b > Oct 22 01:26:10 TEST kernel: Code: 66 90 48 89 01 31 c0 66 66 90 c3 b8 f2 ff > ff ff 66 66 90 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 f0 ff > 07 f3 90 <83> 3f 01 75 f9 f0 ff 0f 75 f1 5d c3 66 66 2e 0f 1f 84 00 00 00 > > Info in /proc: > > PF_RING Version : 6.4.1 > ((detached:eded7ce625e60620639050575b1fba0aa0412374) > Total rings : 16 > > Standard (non ZC) Options > Ring slots : 65534 > Slot version : 16 > Capture TX : No [RX only] > IP Defragment : No > Socket Mode : Standard > Total plugins : 0 > Cluster Fragment Queue : 0 > Cluster Fragment Discard : 0 > > cpu and memory info: > > Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz * 2 > > total used free shared buff/cache > available > Mem: 62G 33G 25G 784M 4.0G > 27G > Swap: 31G 0B 31G > > Any insight would be helpful. > _______________________________________________ > Ntop-misc mailing list > [email protected] > http://listgateway.unipi.it/mailman/listinfo/ntop-misc
signature.asc
Description: Message signed with OpenPGP
_______________________________________________ Ntop-misc mailing list [email protected] http://listgateway.unipi.it/mailman/listinfo/ntop-misc
