Hi Alfredo,
Sorry, you're right: I forgot to attach the files. I sent them a while ago,
but it seems the mail size is over the limit and the message got held for
approval. I'm trying again now, deleting some info from my first email and
pasting one file at a time.
- 0.log = top output for the scenario in my first email.
On Wed, 27 Jun 2018 at 14:30, Alfredo Cardigliano (<
[email protected]>) wrote:
> Hi David
>
> On 27 Jun 2018, at 14:20, David Notivol <[email protected]> wrote:
>
> Hi Alfredo,
> Thanks for your recommendations.
>
> I tested setting core affinity as you suggested, and the input drops in
> zbalance disappeared. The output drops persist, but the absolute drops are
> lower than before.
> I had actually tried core affinity before, but I hadn't taken the physical
> cores into account. Now I pin zbalance to one physical core, and the 10
> nprobe instances don't share that physical core with zbalance.
>
> Regarding your point 2: when using the zc drivers, how could I run several
> nprobe instances to share the load? I'm testing with a single instance: -i
> zc:p2p1,zc:p2p2
>
>
> You can keep using zbalance_ipc (-i zc:p2p1,zc:p2p2), or you can use RSS
> (running nprobe with -i zc:p2p1@<id>,zc:p2p2@<id>).
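>
> As a sketch of the RSS option (hypothetical values; the RSS= module
> parameter is the usual way to enable multiple hardware queues on PF_RING
> ZC drivers, but verify the exact syntax against your driver version):
>
>   # reload the ZC driver with 16 RSS queues per port (hypothetical count)
>   insmod ./i40e.ko RSS=16,16
>   # then run one nprobe instance per queue id, same queue on both ports
>   nprobe -i zc:p2p1@0,zc:p2p2@0
>   nprobe -i zc:p2p1@1,zc:p2p2@1
>   # and so on, up to queue 15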
>
> Attached you can find:
> - 0.log = top output for the scenario in my previous email.
> - 1.log = scenario in your point 1, including top, zbalance output, and
> nprobe stats.
>
>
>
> I do not see the attachments; did you forget to enclose them?
>
> Alfredo
>
>
> On Wed, 27 Jun 2018 at 12:13, Alfredo Cardigliano (<
> [email protected]>) wrote:
>
>> Hi David
>> it seems that you have packet loss on both zbalance and nprobe.
>> I recommend that you:
>> 1. set the core affinity for both zbalance_ipc and the nprobe instances,
>> trying to use a different core for each (at the very least, do not share
>> the zbalance_ipc physical core with the nprobe instances); see the sketch
>> after point 2
>> 2. try using the zc drivers for capturing traffic from the interfaces
>> (zc:p2p1,zc:p2p2), if you haven't already
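>>
>> A minimal sketch of point 1 (hypothetical core ids; zbalance_ipc's -g
>> option and nprobe's --cpu-affinity option are the usual ways to pin the
>> processes, but double-check both against your versions):
>>
>>   # pin zbalance_ipc to core 0
>>   zbalance_ipc -i zc:p2p1,zc:p2p2 -c 1 -n 16 -m 4 -g 0
>>   # pin each nprobe instance to its own core, away from core 0
>>   nprobe --interface=zc:1@0 --cpu-affinity 1
>>   nprobe --interface=zc:1@1 --cpu-affinity 2
>>   # and so on for the remaining queues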
>> Please also provide the top output (press 1 to see all cores) with the
>> current configuration; I suspect the kernel is using some of the available
>> CPUs in this configuration.
>>
>> Alfredo
>>
>> On 26 Jun 2018, at 16:31, David Notivol <[email protected]> wrote:
>>
>> Hi Alfredo,
>> Thanks for replying.
>> This is an excerpt of the zbalance and nprobe statistics:
>>
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:265] =========================
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:266] Absolute Stats: Recv
>> 1'285'430'239 pkts (1'116'181'903 drops) - Forwarded 1'266'272'285 pkts
>> (19'157'949 drops)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:305] p2p1,p2p2 RX
>> 1285430267 pkts Dropped 1116181981 pkts (46.5 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 0 RX 77050882
>> pkts Dropped 1127883 pkts (1.4 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 1 RX 70722562
>> pkts Dropped 756409 pkts (1.1 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 2 RX 76092418
>> pkts Dropped 1017335 pkts (1.3 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 3 RX 75088386
>> pkts Dropped 896678 pkts (1.2 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 4 RX 91991042
>> pkts Dropped 2114739 pkts (2.2 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 5 RX 81384450
>> pkts Dropped 1269385 pkts (1.5 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 6 RX 84310018
>> pkts Dropped 1801848 pkts (2.1 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 7 RX 84554242
>> pkts Dropped 1487329 pkts (1.7 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 8 RX 84090370
>> pkts Dropped 1482864 pkts (1.7 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 9 RX 73642498
>> pkts Dropped 732237 pkts (1.0 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 10 RX
>> 76481026 pkts Dropped 1000496 pkts (1.3 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 11 RX
>> 72496642 pkts Dropped 929049 pkts (1.3 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 12 RX
>> 79386626 pkts Dropped 1122169 pkts (1.4 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 13 RX
>> 79418370 pkts Dropped 1187172 pkts (1.5 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 14 RX
>> 80284162 pkts Dropped 1195559 pkts (1.5 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:319] Q 15 RX
>> 79143426 pkts Dropped 1036797 pkts (1.3 %)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:338] Actual Stats: Recv 369'127.51
>> pps (555'069.74 drops) - Forwarded 369'129.51 pps (0.00 drops)
>> 26/Jun/2018 17:29:58 [zbalance_ipc.c:348] =========================
>>
>>
>> # cat /proc/net/pf_ring/stats/*
>> ClusterId: 1
>> TotQueues: 16
>> Applications: 1
>> App0Queues: 16
>> Duration: 0:00:41:18:386
>> Packets: 1191477340
>> Forwarded: 1174033613
>> Processed: 1173893301
>> IFPackets: 1191477364
>> IFDropped: 1036448041
>>
>> Duration: 0:00:41:15:587
>> Bytes: 42626434538
>> Packets: 71510530
>> Dropped: 845465
>>
>> [removed to make the mail smaller]
>
>> El mar., 26 jun. 2018 a las 16:25, Alfredo Cardigliano (<
>> [email protected]>) escribió:
>>
>>> Hi David
>>> please also provide statistics from zbalance_ipc (output or log file)
>>> and nprobe (you can get live stats from /proc/net/pf_ring/stats/)
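>>>
>>> For example, a simple way to watch the live counters at the path above:
>>>
>>>   watch -n 1 'cat /proc/net/pf_ring/stats/*'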
>>>
>>> Thank you
>>> Alfredo
>>>
>>> On 26 Jun 2018, at 15:32, David Notivol <[email protected]> wrote:
>>>
>>> Hello list,
>>>
>>> We're using nProbe to export flow information to Kafka. We're capturing
>>> from two 10Gb interfaces that we merge with zbalance_ipc and split into
>>> 16 queues, one per nprobe instance.
>>>
>>> The problem is that we're seeing about 40% packet drops reported by
>>> zbalance_ipc, so it looks like nprobe is not able to read and process all
>>> the traffic. The CPU usage is very high, and the load average is around
>>> 25-30.
>>>
>>> Merging both interfaces, we're getting up to 5.5 Gbps and 1.2 million
>>> packets per second; we're using the i40e ZC driver.
>>>
>>> Do you have any advice for improving this performance?
>>> Is it expected to see packet drops with this amount of traffic, i.e. are
>>> we reaching the server's limits? Or is there any configuration we could
>>> tune to improve it?
>>>
>>> Thanks in advance.
>>>
>>>
>>>
>>> -- System:
>>>
>>> nProbe: nProbe v.8.5.180625 (r6185)
>>> System RAM: 64GB
>>> System CPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 12 logical
>>> cores (6 physical cores, 2 threads per core)
>>> System OS: CentOS Linux release 7.4.1708 (Core)
>>> Linux Kernel: 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58
>>> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> -- zbalance configuration:
>>>
>>> zbalance_ipc -i p2p1,p2p2 -c 1 -n 16 -m 4 -a -p -l /var/tmp/zbalance.log
>>> -v -w
>>>
>>> -- nProbe configuration:
>>>
>>> --interface=zc:1@0
>>> --pid-file=/var/run/nprobe-zc1-00.pid
>>> --dump-stats=/var/log/nprobe/zc1-00_flows_stats.txt
>>> --kafka "192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092;topic"
>>> --collector=none
>>> --idle-timeout=60
>>> --snaplen=128
>>>
>>> [removed to make the mail smaller]
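>>>
>>> A hypothetical wrapper for launching the 16 instances, assuming the
>>> per-queue naming shown above (the loop and zero-padding are illustrative
>>> only; add the shared --kafka/--collector/--idle-timeout/--snaplen options
>>> from the configuration above):
>>>
>>> for q in $(seq 0 15); do
>>>   id=$(printf "%02d" ${q})
>>>   nprobe --interface=zc:1@${q} \
>>>          --pid-file=/var/run/nprobe-zc1-${id}.pid \
>>>          --dump-stats=/var/log/nprobe/zc1-${id}_flows_stats.txt &
>>> done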
top - 13:48:00 up 7 days, 21:21, 4 users, load average: 28.10, 27.23, 26.96
Tasks: 216 total, 3 running, 213 sleeping, 0 stopped, 0 zombie
%Cpu0 : 59.7 us, 19.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 21.3 si, 0.0 st
%Cpu1 : 56.1 us, 16.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 27.2 si, 0.0 st
%Cpu2 : 57.5 us, 18.3 sy, 0.0 ni, 0.7 id, 0.0 wa, 0.0 hi, 23.6 si, 0.0 st
%Cpu3 : 61.9 us, 15.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 22.2 si, 0.0 st
%Cpu4 : 60.7 us, 17.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 22.3 si, 0.0 st
%Cpu5 : 55.5 us, 18.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 25.6 si, 0.0 st
%Cpu6 : 57.3 us, 18.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 23.8 si, 0.0 st
%Cpu7 : 56.6 us, 16.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 26.8 si, 0.0 st
%Cpu8 : 58.5 us, 18.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 22.6 si, 0.0 st
%Cpu9 : 61.5 us, 16.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 21.9 si, 0.0 st
%Cpu10 : 60.8 us, 18.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 20.9 si, 0.0 st
%Cpu11 : 58.1 us, 19.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 22.3 si, 0.0 st
KiB Mem : 65688384 total, 6823948 free, 37809664 used, 21054772 buff/cache
KiB Swap: 6142972 total, 5945460 free, 197512 used. 27170612 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
145120 nobody 20 0 5216912 2.125g 6448 S 98.7 3.4 836:35.62 nprobe
145216 nobody 20 0 5216912 2.126g 6428 S 96.4 3.4 839:41.63 nprobe
145566 nobody 20 0 5216912 2.125g 6376 S 90.4 3.4 840:54.27 nprobe
145516 nobody 20 0 5347984 2.232g 6376 S 88.1 3.6 838:05.06 nprobe
145088 nobody 20 0 5216912 2.134g 6460 S 84.8 3.4 836:08.20 nprobe
145313 nobody 20 0 5413520 2.325g 6368 S 80.1 3.7 837:53.78 nprobe
145454 nobody 20 0 5413520 2.310g 6504 S 75.5 3.7 839:07.93 nprobe
145613 nobody 20 0 5216912 2.124g 6556 S 73.8 3.4 837:07.96 nprobe
145755 nobody 20 0 5216912 2.135g 6384 S 72.5 3.4 837:33.70 nprobe
145660 nobody 20 0 5216912 2.153g 6456 S 70.5 3.4 838:37.55 nprobe
145696 nobody 20 0 5216912 2.139g 6492 S 69.2 3.4 839:15.34 nprobe
145360 nobody 20 0 5216912 2.148g 6508 S 66.6 3.4 838:06.04 nprobe
145056 nobody 20 0 5282448 2.205g 6372 S 65.6 3.5 837:29.82 nprobe
145407 nobody 20 0 5216912 2.115g 6488 S 63.9 3.4 835:45.05 nprobe
145269 nobody 20 0 5282448 2.200g 6376 S 55.0 3.5 836:00.80 nprobe
144957 root 20 0 2221716 40416 37044 S 34.1 0.1 662:27.45 zbalance_ipc
1534 root 20 0 6472 484 484 S 3.3 0.0 222:30.93 rngd
38 root 20 0 0 0 0 S 1.3 0.0 32:36.72 ksoftirqd/6
33 root 20 0 0 0 0 R 1.0 0.0 30:57.76 ksoftirqd/5
18 root 20 0 0 0 0 S 0.7 0.0 25:16.69 ksoftirqd/2
23 root 20 0 0 0 0 S 0.7 0.0 35:03.37 ksoftirqd/3
186085 root 20 0 165076 11196 2476 S 0.7 0.0 0:04.95 perl
3 root 20 0 0 0 0 S 0.3 0.0 33:01.20 ksoftirqd/0
9 root 20 0 0 0 0 S 0.3 0.0 25:34.02 rcu_sched
13 root 20 0 0 0 0 S 0.3 0.0 25:02.86 ksoftirqd/1
28 root 20 0 0 0 0 S 0.3 0.0 25:41.38 ksoftirqd/4
43 root 20 0 0 0 0 S 0.3 0.0 29:07.07 ksoftirqd/7
48 root 20 0 0 0 0 R 0.3 0.0 30:18.45 ksoftirqd/8
53 root 20 0 0 0 0 S 0.3 0.0 35:08.19 ksoftirqd/9
58 root 20 0 0 0 0 S 0.3 0.0 27:41.74 ksoftirqd/10
63 root 20 0 0 0 0 S 0.3 0.0 27:27.80 ksoftirqd/11
186696 root 20 0 0 0 0 S 0.3 0.0 0:00.01 kworker/4:2
186785 root 20 0 157848 2356 1544 R 0.3 0.0 0:00.03 top
1 root 20 0 52424 3544 2160 S 0.0 0.0 0:35.99 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:05.01 kthreadd
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root rt 0 0 0 0 S 0.0 0.0 0:13.38 migration/0
_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc