Hi David,
your template is huge. Can you please omit "--flow-templ…" (just for
troubleshooting) and report whether you see any change in load?

Thanks,
Luca

> On 27 Jun 2018, at 08:43, David Notivol <[email protected]> wrote:
> 
> Hi,
> And now:
> - 1.log = scenario in your point 1, including top, zbalance output, and 
> nprobe stats.
> 
> On Wed, 27 Jun 2018 at 17:41, David Notivol <[email protected]> wrote:
> Hi Alfredo,
> 
> Sorry, I forgot to attach the files as you said. I sent them a while ago, but 
> it seems the mail size is over the limit and they got held for approval. I'm 
> trying again now, deleting some info from my first email and pasting one file 
> at a time.
>> - 0.log = top output for the scenario in my first email.
> 
> 
> On Wed, 27 Jun 2018 at 14:30, Alfredo Cardigliano <[email protected]> wrote:
> Hi David
> 
>> On 27 Jun 2018, at 14:20, David Notivol <[email protected]> wrote:
>> 
>> Hi Alfredo,
>> Thanks for your recommendations.
>> 
>> I tested using core affinity as you suggested, and the input drops in 
>> zbalance disappeared. The output drops persist, but the absolute drop counts 
>> are lower than before.
>> Actually, I had already tested core affinity, but I hadn't taken the physical 
>> cores into account. Now I've put zbalance on one physical core, and 10 nprobe 
>> instances on cores that don't share zbalance's physical core.
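>>
>> (For reference, a minimal pinning sketch using standard Linux tools; the core
>> ids below are illustrative, not necessarily the exact commands used here:)
>>
>>   # list the logical-CPU to physical-core mapping (CPU and CORE columns)
>>   lscpu -e
>>   # pin zbalance_ipc to logical CPU 0
>>   taskset -c 0 zbalance_ipc -i p2p1,p2p2 -c 1 -n 16 ...
>>   # pin each nprobe instance to its own logical CPU, avoiding CPU 0's physical core
>>   taskset -c 2 nprobe --interface=zc:1@0 ...
>>   taskset -c 3 nprobe --interface=zc:1@1 ...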
>> 
>> About your point 2: when using zc drivers, how could I run several nprobe 
>> instances to share the load? I'm testing with one instance: -i zc:p2p1,zc:p2p2
> 
> You can keep using zbalance_ipc (-i zc:p2p1,zc:p2p2), or you can use RSS 
> (running nprobe on -i zc:p2p1@<id>,zc:p2p2@<id>).
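> 
> For example (queue/RSS ids are illustrative only):
> 
>   # option A: zbalance_ipc fan-out, one nprobe per cluster queue
>   nprobe -i zc:1@0 ...
>   nprobe -i zc:1@1 ...
>   # option B: RSS, one nprobe per hardware queue pair (no zbalance_ipc),
>   # assuming RSS is enabled in the i40e-zc driver
>   nprobe -i zc:p2p1@0,zc:p2p2@0 ...
>   nprobe -i zc:p2p1@1,zc:p2p2@1 ...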
> 
>> Attached you can find:
>> - 0.log = top output for the scenario in my previous email.
>> - 1.log = scenario in your point 1, including top, zbalance output, and 
>> nprobe stats.
> 
> 
> I do not see the attachments; did you forget to enclose them?
> 
> Alfredo
> 
>> 
>> On Wed, 27 Jun 2018 at 12:13, Alfredo Cardigliano <[email protected]> wrote:
>> Hi David
>> it seems that you have packet loss on both zbalance and nprobe.
>> I recommend that you:
>> 1. set the core affinity for both zbalance_ipc and the nprobe instances, 
>> trying to use a different core for each (at the very least, do not share the 
>> zbalance_ipc physical core with the nprobe instances)
>> 2. did you try using zc drivers for capturing traffic from the interfaces? 
>> (zc:p2p1,zc:p2p2)
>> Please also provide the top output (press 1 to see all cores) with the 
>> current configuration; I guess the kernel is using some of the available CPU 
>> with this configuration.
>> 
>> Alfredo
>> 
>>> On 26 Jun 2018, at 16:31, David Notivol <[email protected]> wrote:
>>> 
>>> Hi Alfredo,
>>> Thanks for replying.
>>> This is an excerpt of the zbalance and nprobe statistics:
>>> 
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:265] =========================
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:266] Absolute Stats: Recv 
>>> 1'285'430'239 pkts (1'116'181'903 drops) - Forwarded 1'266'272'285 pkts 
>>> (19'157'949 drops)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:305]                 p2p1,p2p2 RX 
>>> 1285430267 pkts Dropped 1116181981 pkts (46.5 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 0 RX 77050882 
>>> pkts Dropped 1127883 pkts (1.4 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 1 RX 70722562 
>>> pkts Dropped 756409 pkts (1.1 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 2 RX 76092418 
>>> pkts Dropped 1017335 pkts (1.3 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 3 RX 75088386 
>>> pkts Dropped 896678 pkts (1.2 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 4 RX 91991042 
>>> pkts Dropped 2114739 pkts (2.2 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 5 RX 81384450 
>>> pkts Dropped 1269385 pkts (1.5 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 6 RX 84310018 
>>> pkts Dropped 1801848 pkts (2.1 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 7 RX 84554242 
>>> pkts Dropped 1487329 pkts (1.7 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 8 RX 84090370 
>>> pkts Dropped 1482864 pkts (1.7 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 9 RX 73642498 
>>> pkts Dropped 732237 pkts (1.0 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 10 RX 76481026 
>>> pkts Dropped 1000496 pkts (1.3 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 11 RX 72496642 
>>> pkts Dropped 929049 pkts (1.3 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 12 RX 79386626 
>>> pkts Dropped 1122169 pkts (1.4 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 13 RX 79418370 
>>> pkts Dropped 1187172 pkts (1.5 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 14 RX 80284162 
>>> pkts Dropped 1195559 pkts (1.5 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319]                 Q 15 RX 79143426 
>>> pkts Dropped 1036797 pkts (1.3 %)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:338] Actual Stats: Recv 369'127.51 pps 
>>> (555'069.74 drops) - Forwarded 369'129.51 pps (0.00 drops)
>>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:348] =========================
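>>> 
>>> (Note: the 46.5 % on the interface line above appears to be computed as
>>> Dropped / (RX + Dropped), i.e. 1116181981 / (1285430267 + 1116181981) ≈ 46.5 %,
>>> so nearly half of the wire traffic is dropped before reaching the queues.
>>> A quick recomputation from the log, assuming this line format:)
>>> 
>>>   awk '/p2p1,p2p2 RX/ { printf "%.1f%% dropped at capture\n", 100*$9/($6+$9) }' /var/tmp/zbalance.log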
>>> 
>>> 
>>> # cat /proc/net/pf_ring/stats/*
>>> ClusterId:         1
>>> TotQueues:         16
>>> Applications:      1
>>> App0Queues:        16
>>> Duration:          0:00:41:18:386
>>> Packets:           1191477340
>>> Forwarded:         1174033613
>>> Processed:         1173893301
>>> IFPackets:         1191477364
>>> IFDropped:         1036448041
>>> 
>>> Duration: 0:00:41:15:587
>>> Bytes:    42626434538
>>> Packets:  71510530
>>> Dropped:  845465
>>> 
> 
>  [removed to make the mail smaller] 
>>> 
>>> On Tue, 26 Jun 2018 at 16:25, Alfredo Cardigliano <[email protected]> wrote:
>>> Hi David
>>> please also provide statistics from zbalance_ipc (output or log file) 
>>> and nprobe (you can get live stats from /proc/net/pf_ring/stats/)
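>>> 
>>> For example:
>>> 
>>>   # refresh the counters every few seconds
>>>   watch -n 5 'cat /proc/net/pf_ring/stats/*'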
>>> 
>>> Thank you
>>> Alfredo
>>> 
>>>> On 26 Jun 2018, at 15:32, David Notivol <[email protected]> wrote:
>>>> 
>>>> Hello list,
>>>> 
>>>> We're using nProbe to export flow information to Kafka. We're listening 
>>>> on two 10Gb interfaces that we merge with zbalance_ipc and split into 
>>>> 16 queues, feeding 16 nprobe instances.
>>>> 
>>>> The problem is that we are seeing about 40% packet drops reported by 
>>>> zbalance_ipc, so it looks like nprobe is not able to read and process 
>>>> all the traffic. The CPU usage is really high, and the load average is 
>>>> over 25-30.
>>>> 
>>>> Merging both interfaces, we're getting up to 5.5 Gbps and 1.2 million 
>>>> packets per second; we're using the i40e_zc driver.
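>>>> (For scale: 1.2 M pps split across 16 queues is roughly 75 k pps per 
>>>> nprobe instance, assuming an even spread.)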
>>>> 
>>>> Do you have any advice to improve this performance?
>>>> Does it make sense that we're seeing packet drops with this amount of 
>>>> traffic, i.e. are we reaching the server's limits? Or is there any 
>>>> configuration we could tune to improve it?
>>>> 
>>>> Thanks in advance.
>>>> 
>>>> 
>>>> 
>>>> -- System:
>>>> 
>>>> nProbe:          nProbe v.8.5.180625 (r6185)
>>>> System RAM: 64GB
>>>> System CPU:  Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 12 logical cores 
>>>> (6 physical cores, 2 threads per core)
>>>> System OS:    CentOS Linux release 7.4.1708 (Core)
>>>> Linux Kernel:   3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 
>>>> 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>> 
>>>> -- zbalance configuration:
>>>> 
>>>> zbalance_ipc -i p2p1,p2p2 -c 1 -n 16 -m 4 -a -p -l /var/tmp/zbalance.log 
>>>> -v -w
>>>> 
>>>> -- nProbe configuration:
>>>> 
>>>> --interface=zc:1@0
>>>> --pid-file=/var/run/nprobe-zc1-00.pid
>>>> --dump-stats=/var/log/nprobe/zc1-00_flows_stats.txt
>>>> --kafka "192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092;topic"
>>>> --collector=none
>>>> --idle-timeout=60
>>>> --snaplen=128
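>>>> 
>>>> (The remaining 15 instances are assumed to differ only in the queue id, 
>>>> pid file and stats file; an illustrative launcher sketch, not necessarily 
>>>> the exact setup in use:)
>>>> 
>>>>   for q in $(seq 0 15); do
>>>>     id=$(printf "%02d" "$q")
>>>>     nprobe --interface=zc:1@${q} \
>>>>            --pid-file=/var/run/nprobe-zc1-${id}.pid \
>>>>            --dump-stats=/var/log/nprobe/zc1-${id}_flows_stats.txt \
>>>>            --kafka "192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092;topic" \
>>>>            --collector=none --idle-timeout=60 --snaplen=128 &
>>>>   done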
>>>> 
> 
> [removed to make the mail smaller] 
> 
> 
> 
> -- 
> Regards,
> David Notivol
> [email protected]
> <1.log>
_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
