Hi Bowen Li
any chance you can move to latest stable and check if you are still able to 
reproduce it?

Thank you
Alfredo

> On 30 Oct 2017, at 08:15, Bowen Li <[email protected]> wrote:
> 
> Hi all,
>     Recently,I run bro cluster with PF_RING 6.4.1 as the package capturing 
> framework. However, I found a strange scene: sometimes when the cluster 
> restarts, the kernel module of PF_RING will block. The problem seems to 
> generate when operates /proc to apply read-write lock, causing kernel CPU 
> soft lockup and then the whole server is stuck. I want to know whether the 
> deadlock is caused by the resource competition when the cluster progress 
> applies read-write lock at the same time, or other reasons.
>     Here is the log in /var/log/message:
> 
> Oct 22 01:26:10 TEST kernel: BUG: soft lockup - CPU#16 stuck for 22s! 
> [bro:17375]
> Oct 22 01:26:10 TEST kernel: Modules linked in: binfmt_misc xt_CHECKSUM 
> iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 
> nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
> tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
> iptable_filter intel_powerclamp coretemp intel_rapl kvm_intel iTCO_wdt kvm 
> mei_me iTCO_vendor_support mxm_wmi mei lpc_ich ipmi_ssif sb_edac crc32_pclmul 
> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper 
> edac_core ipmi_si cryptd ioTESTma ipmi_msghandler shpchp wmi sg dca i2c_i801 
> pcspkr mfd_core nfsd auth_rpcgss nfs_acl lockd grace pf_ring(OE) sunrpc 
> ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 
> syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm 
> crct10dif_pclmul crct10dif_common
> Oct 22 01:26:10 TEST kernel: crc32c_intel drm isci e1000e libsas ahci 
> scsi_transport_sas libahci ptp i2c_core libata pps_core ntb
> Oct 22 01:26:10 TEST kernel: CPU: 16 PID: 17375 Comm: bro Tainted: G          
>  OE  ------------   3.10.0-327.13.1.el7.x86_64 #1
> Oct 22 01:26:10 TEST kernel: Hardware name: Supermicro 
> X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
> Oct 22 01:26:10 TEST kernel: task: ffff880fdad73980 ti: ffff8800287b8000 
> task.ti: ffff8800287b8000
> Oct 22 01:26:10 TEST kernel: RIP: 0010:[<ffffffff81301909>]  
> [<ffffffff81301909>] __write_lock_failed+0x9/0x20
> Oct 22 01:26:10 TEST kernel: RSP: 0018:ffff8800287bbe88  EFLAGS: 00000297
> Oct 22 01:26:10 TEST kernel: RAX: 0000000000000000 RBX: 0000000092e79345 RCX: 
> 0000000000000000
> Oct 22 01:26:10 TEST kernel: RDX: 0000000000000000 RSI: ffff88104aabb000 RDI: 
> ffffffffa03c6324
> Oct 22 01:26:10 TEST kernel: RBP: ffff8800287bbe88 R08: 0000000000017600 R09: 
> 0000000000000080
> Oct 22 01:26:10 TEST kernel: R10: 0000000000000000 R11: 000000000000001b R12: 
> 00007f21f432e000
> Oct 22 01:26:10 TEST kernel: R13: 0000000000000032 R14: ffff8800000000a8 R15: 
> 0000000000000000
> Oct 22 01:26:10 TEST kernel: FS:  00007f21f4e56880(0000) 
> GS:ffff88085fd00000(0000) knlGS:0000000000000000
> Oct 22 01:26:10 TEST kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Oct 22 01:26:10 TEST kernel: CR2: 00007f21f432eee0 CR3: 0000000648323000 CR4: 
> 00000000000407e0
> Oct 22 01:26:10 TEST kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> Oct 22 01:26:10 TEST kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> Oct 22 01:26:10 TEST kernel: Stack:
> Oct 22 01:26:10 TEST kernel: ffff8800287bbe98 ffffffff8163ce37 
> ffff8800287bbeb8 ffffffffa03afe1b
> Oct 22 01:26:10 TEST kernel: ffff88104aabb000 ffff88101a631800 
> ffff8800287bbee8 ffffffffa03b6903
> Oct 22 01:26:10 TEST kernel: 0000000000000003 0000000000000300 
> 0000000000000000 ffffffff81a25e00
> Oct 22 01:26:10 TEST kernel: Call Trace:
> Oct 22 01:26:10 TEST kernel: [<ffffffff8163ce37>] _raw_write_lock+0x17/0x20
> Oct 22 01:26:10 TEST kernel: [<ffffffffa03afe1b>] ring_proc_add+0x1b/0xb0 
> [pf_ring]
> Oct 22 01:26:10 TEST kernel: [<ffffffffa03b6903>] ring_create+0x2e3/0x4a0 
> [pf_ring]
> Oct 22 01:26:10 TEST kernel: [<ffffffff815103e0>] __sock_create+0x110/0x260
> Oct 22 01:26:10 TEST kernel: [<ffffffff81511ab1>] SyS_socket+0x61/0xf0
> Oct 22 01:26:10 TEST kernel: [<ffffffff8163d948>] ? page_fault+0x28/0x30
> Oct 22 01:26:10 TEST kernel: [<ffffffff81645ec9>] 
> system_call_fastpath+0x16/0x1b
> Oct 22 01:26:10 TEST kernel: Code: 66 90 48 89 01 31 c0 66 66 90 c3 b8 f2 ff 
> ff ff 66 66 90 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 f0 ff 
> 07 f3 90 <83> 3f 01 75 f9 f0 ff 0f 75 f1 5d c3 66 66 2e 0f 1f 84 00 00 00
> 
>     Info in /proc:
> 
> PF_RING Version          : 6.4.1 
> ((detached:eded7ce625e60620639050575b1fba0aa0412374)
> Total rings              : 16
> 
> Standard (non ZC) Options
> Ring slots               : 65534
> Slot version             : 16
> Capture TX               : No [RX only]
> IP Defragment            : No
> Socket Mode              : Standard
> Total plugins            : 0
> Cluster Fragment Queue   : 0
> Cluster Fragment Discard : 0
> 
>    cpu and memory info:
> 
> Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz * 2
> 
>               total        used        free      shared  buff/cache   
> available
> Mem:            62G         33G         25G        784M        4.0G         
> 27G
> Swap:           31G          0B         31G
> 
>     Any insight would be helpful.
> _______________________________________________
> Ntop-misc mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-misc

Attachment: signature.asc
Description: Message signed with OpenPGP

_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc

Reply via email to