On 11.11.2018 at 09:03, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 23:19:50 +0100
Paweł Staszewski <pstaszew...@itcare.pl> wrote:
On 10.11.2018 at 23:06, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 20:56:02 +0100
Paweł Staszewski <pstaszew...@itcare.pl> wrote:
On 10.11.2018 at 20:49, Paweł Staszewski wrote:
On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote:
On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski
<pstaszew...@itcare.pl> wrote:
On 08.11.2018 at 20:12, Paweł Staszewski wrote:
[...]
Do notice, the per CPU squeeze is not too large.
Yes - but I'm searching for an invisible thing now :) Something invisible
is slowing down packet processing :)
So I'm trying to find any counter that has something to do with packet
processing.
NOTICE, I have given you the counters you need (below)
Yes noticed this :)
[...]
Remember, those tests are now on two separate ConnectX-5 NICs connected to
two separate PCIe x16 Gen 3.0 slots.
That is strange... I still suspect some HW NIC issue. Can you provide
ethtool stats info via this tool:
https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
$ ethtool_stats.pl --dev enp175s0 --dev enp216s0
The tool removes zero-stats counters and reports per-second stats. This
makes it easier to spot what is relevant for the given workload.
Yes, mlnx just has too many counters that are always 0 in my case :)
Will try this also.
But there are still a lot of non-zero counters:
Show adapter(s) (enp175s0 enp216s0) statistics (ONLY that changed!)
Ethtool(enp175s0) stat: 8891 ( 8,891) <= ch0_arm /sec
[...]
I have copied the stats over into another document so I can take a better
look at them... and I've found some interesting stats.
E.g. we can see that the NIC hardware is dropping packets.
RX-drops on enp175s0:
(enp175s0) stat: 4850734036 (4,850,734,036) <= rx_bytes /sec
(enp175s0) stat: 5069043007 (5,069,043,007) <= rx_bytes_phy /sec
                 -218308971 ( -218,308,971)    Dropped bytes /sec
(enp175s0) stat:     139602 (      139,602) <= rx_discards_phy /sec
(enp175s0) stat:    3717148 (    3,717,148) <= rx_packets /sec
(enp175s0) stat:    3862420 (    3,862,420) <= rx_packets_phy /sec
                    -145272 (     -145,272)    Dropped packets /sec
RX-drops on enp216s0 are lower:
(enp216s0) stat: 2592286809 (2,592,286,809) <= rx_bytes /sec
(enp216s0) stat: 2633575771 (2,633,575,771) <= rx_bytes_phy /sec
                  -41288962 (  -41,288,962)    Dropped bytes /sec
(enp216s0) stat:        464 (          464) <= rx_discards_phy /sec
(enp216s0) stat:    4971677 (    4,971,677) <= rx_packets /sec
(enp216s0) stat:    4975563 (    4,975,563) <= rx_packets_phy /sec
                      -3886 (       -3,886)    Dropped packets /sec
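The "Dropped" lines above are simply the difference between the software counters and the *_phy counters. A minimal Python sketch of that arithmetic, using the per-second values copied from the output above (the function name rx_loss and the loss percentage are additions for illustration, not something ethtool_stats.pl prints):

```python
# Per-second counter values copied from the ethtool_stats.pl output above.
stats = {
    "enp175s0": {"rx_packets": 3_717_148, "rx_packets_phy": 3_862_420,
                 "rx_bytes": 4_850_734_036, "rx_bytes_phy": 5_069_043_007},
    "enp216s0": {"rx_packets": 4_971_677, "rx_packets_phy": 4_975_563,
                 "rx_bytes": 2_592_286_809, "rx_bytes_phy": 2_633_575_771},
}


def rx_loss(counters):
    """Return (dropped packets/sec, dropped bytes/sec, packet loss in percent).

    "Dropped" here means: seen by the PHY counters but not by the
    software counters, the same subtraction as in the mail above.
    """
    dropped_pkts = counters["rx_packets_phy"] - counters["rx_packets"]
    dropped_bytes = counters["rx_bytes_phy"] - counters["rx_bytes"]
    loss_pct = 100.0 * dropped_pkts / counters["rx_packets_phy"]
    return dropped_pkts, dropped_bytes, loss_pct


for dev, counters in stats.items():
    pkts, byts, pct = rx_loss(counters)
    print(f"{dev}: {pkts:,} pkts/sec, {byts:,} bytes/sec, "
          f"{pct:.2f}% of PHY packets")
```

Expressing the gap as a share of rx_packets_phy makes the two ports easier to compare than the raw differences.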
I would recommend that you use ethtool stats and monitor rx_discards_phy.
The PHY counters come from the hardware, and they show that packets are
getting dropped at the HW level. This can happen because software is not
fast enough to empty the RX queue, but in this case, where the CPUs are
mostly idle, I don't think that is the cause.
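One way to follow that recommendation is to sample `ethtool -S` twice and turn rx_discards_phy into a per-second rate. The Python sketch below assumes the usual "name: value" layout of `ethtool -S` output. The helper names and the two sample texts are invented for illustration, and ethtool_stats.pl already does the same job more completely:

```python
import re
import subprocess
import time

# Matches lines such as "     rx_discards_phy: 12345".
COUNTER_LINE = re.compile(r"^\s*(\S+):\s+(-?\d+)\s*$")


def parse_ethtool_stats(text):
    """Parse `ethtool -S <dev>` output into {counter_name: int}."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def read_stats(dev):
    """Run `ethtool -S <dev>` and parse it (needs ethtool installed)."""
    out = subprocess.run(["ethtool", "-S", dev], capture_output=True,
                         text=True, check=True).stdout
    return parse_ethtool_stats(out)


def rate(before, after, interval_sec, name="rx_discards_phy"):
    """Per-second increase of one counter between two samples."""
    return (after[name] - before[name]) / interval_sec


def watch(dev, interval_sec=1.0):
    """Print the rx_discards_phy rate for one device until interrupted."""
    before = read_stats(dev)
    while True:
        time.sleep(interval_sec)
        after = read_stats(dev)
        print(f"{dev}: {rate(before, after, interval_sec):,.0f} "
              "rx_discards_phy /sec")
        before = after


# Hypothetical samples taken 2 seconds apart (numbers invented).
SAMPLE_T0 = ("NIC statistics:\n"
             "     rx_packets_phy: 1000000\n"
             "     rx_discards_phy: 5000\n")
SAMPLE_T1 = ("NIC statistics:\n"
             "     rx_packets_phy: 8724840\n"
             "     rx_discards_phy: 284204\n")
```

On the router itself this would be started with watch("enp175s0"). The parser and rate() can be checked offline against the two sample texts.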
That is why I was searching for some software counter that shows where
something is wrong.
Earlier ethtool reports also showed PHY drops, but the CPUs were saturated
then, so it seemed normal to me that the PHY drops packets when no CPU
cycles are left to pick them up from the HW.
But with CPUs that are 50% idle there should be no problem, which is why I
started modifying the ethtool parameters for the TX/RX rings and coalescing.
Currently waiting for more traffic with the new ethtool settings:
ethtool -g enp175s0
Ring parameters for enp175s0:
Pre-set maximums:
RX: 8192
RX Mini: 0
RX Jumbo: 0
TX: 8192
Current hardware settings:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 128
ethtool -c enp175s0
Coalesce parameters for enp175s0:
Adaptive RX: off TX: on
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
dmac: 32517
rx-usecs: 64
rx-frames: 128
rx-usecs-irq: 0
rx-frames-irq: 0
tx-usecs: 8
tx-frames: 128
tx-usecs-irq: 0
tx-frames-irq: 0
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
Both ports have the same settings.
Current traffic:
bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
input: /proc/net/dev type: rate
|         iface                   Rx                   Tx                Total
==============================================================================
       enp175s0:          37.85 Gb/s            7.77 Gb/s           45.62 Gb/s
       enp216s0:           7.80 Gb/s           37.90 Gb/s           45.70 Gb/s
------------------------------------------------------------------------------
          total:          45.61 Gb/s           45.63 Gb/s           91.24 Gb/s
and mpstat for the CPUs:
Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:  all    0.33    0.00    1.48    0.01    0.00   12.11    0.00    0.00    0.00   86.06
Average:    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:    1    0.00    0.00    0.90    0.00    0.00    0.00    0.00    0.00    0.00   99.10
Average:    2    0.10    0.00    0.20    0.80    0.00    0.00    0.00    0.00    0.00   98.90
Average:    3    0.10    0.00    0.30    0.00    0.00    0.00    0.00    0.00    0.00   99.60
Average:    4   14.10    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   84.90
Average:    5    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:    6    0.00    0.00    1.50    0.00    0.00    0.00    0.00    0.00    0.00   98.50
Average:    7    0.20    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00   97.80
Average:    8    0.10    0.00    0.40    0.00    0.00    0.00    0.00    0.00    0.00   99.50
Average:    9    0.00    0.00    0.60    0.00    0.00    0.00    0.00    0.00    0.00   99.40
Average:   10    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   11    0.00    0.00    5.60    0.00    0.00    0.00    0.00    0.00    0.00   94.40
Average:   12    0.00    0.00    4.10    0.00    0.00    0.00    0.00    0.00    0.00   95.90
Average:   13    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   14    0.00    0.00    1.90    0.00    0.00   27.30    0.00    0.00    0.00   70.80
Average:   15    0.00    0.00    2.10    0.00    0.00   26.00    0.00    0.00    0.00   71.90
Average:   16    0.00    0.00    2.10    0.00    0.00   25.40    0.00    0.00    0.00   72.50
Average:   17    0.20    0.00    1.80    0.00    0.00   23.10    0.00    0.00    0.00   74.90
Average:   18    0.00    0.00    2.00    0.00    0.00   25.50    0.00    0.00    0.00   72.50
Average:   19    0.00    0.00    1.90    0.00    0.00   20.20    0.00    0.00    0.00   77.90
Average:   20    0.10    0.00    1.00    0.00    0.00   26.90    0.00    0.00    0.00   72.00
Average:   21    0.10    0.00    2.80    0.00    0.00   24.70    0.00    0.00    0.00   72.40
Average:   22    0.80    0.00    3.30    0.00    0.00   24.30    0.00    0.00    0.00   71.60
Average:   23    0.10    0.00    1.80    0.00    0.00   26.60    0.00    0.00    0.00   71.50
Average:   24    0.10    0.00    1.20    0.00    0.00   23.60    0.00    0.00    0.00   75.10
Average:   25    0.00    0.00    1.80    0.00    0.00   26.60    0.00    0.00    0.00   71.60
Average:   26    0.00    0.00    1.50    0.00    0.00   26.70    0.00    0.00    0.00   71.80
Average:   27    0.10    0.00    0.70    0.00    0.00   26.70    0.00    0.00    0.00   72.50
Average:   28    0.70    0.00    0.30    0.00    0.00    0.00    0.00    0.00    0.00   99.00
Average:   29    0.20    0.00    1.50    0.00    0.00    0.00    0.00    0.00    0.00   98.30
Average:   30    0.10    0.00    0.60    0.00    0.00    0.00    0.00    0.00    0.00   99.30
Average:   31    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   32    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   33    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   34    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   35    0.10    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.90
Average:   36    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   37    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   38    0.00    0.00    2.80    0.00    0.00    0.00    0.00    0.00    0.00   97.20
Average:   39    0.00    0.00    7.40    0.00    0.00    0.00    0.00    0.00    0.00   92.60
Average:   40    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   41    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:   42    0.00    0.00    2.10    0.00    0.00   28.40    0.00    0.00    0.00   69.50
Average:   43    0.00    0.00    1.60    0.00    0.00   25.00    0.00    0.00    0.00   73.40
Average:   44    0.10    0.00    1.60    0.00    0.00   23.90    0.00    0.00    0.00   74.40
Average:   45    0.00    0.00    1.60    0.00    0.00   21.00    0.00    0.00    0.00   77.40
Average:   46    0.00    0.00    2.20    0.00    0.00   28.00    0.00    0.00    0.00   69.80
Average:   47    0.00    0.00    2.80    0.00    0.00   20.30    0.00    0.00    0.00   76.90
Average:   48    0.00    0.00    2.50    0.00    0.00   21.60    0.00    0.00    0.00   75.90
Average:   49    0.00    0.00    0.80    0.00    0.00   22.50    0.00    0.00    0.00   76.70
Average:   50    0.40    0.00    3.00    0.00    0.00   23.50    0.00    0.00    0.00   73.10
Average:   51    0.60    0.00    2.50    0.00    0.00   25.00    0.00    0.00    0.00   71.90
Average:   52    0.10    0.00    1.30    0.00    0.00   20.70    0.00    0.00    0.00   77.90
Average:   53    0.00    0.00    2.20    0.00    0.00   22.80    0.00    0.00    0.00   75.00
Average:   54    0.00    0.00    1.40    0.00    0.00   20.80    0.00    0.00    0.00   77.80
Average:   55    0.00    0.00    2.10    0.00    0.00   21.30    0.00    0.00    0.00   76.60
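
The "all" row averages in the CPUs that carry no network load, so it can help to look only at the CPUs that actually do softirq work. A rough Python sketch, assuming each mpstat "Average:" row sits on one line. The function names and the 1% threshold are arbitrary choices for illustration, and the sample holds just a few rows copied from the output above:

```python
def parse_mpstat_averages(text):
    """Map CPU label ('all', '0', ...) to {column: float} for 'Average:' rows."""
    columns, rows = None, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "Average:":
            continue
        if fields[1] == "CPU":
            columns = fields[2:]  # header row: %usr, %nice, ...
        elif columns and len(fields) == len(columns) + 2:
            rows[fields[1]] = dict(zip(columns, map(float, fields[2:])))
    return rows


def softirq_cpus(rows, min_soft=1.0):
    """Keep only the per-CPU rows whose %soft is at least min_soft."""
    return {cpu: row for cpu, row in rows.items()
            if cpu != "all" and row["%soft"] >= min_soft}


# A few rows copied from the mpstat output above.
SAMPLE = """\
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 0.33 0.00 1.48 0.01 0.00 12.11 0.00 0.00 0.00 86.06
Average: 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
Average: 14 0.00 0.00 1.90 0.00 0.00 27.30 0.00 0.00 0.00 70.80
Average: 42 0.00 0.00 2.10 0.00 0.00 28.40 0.00 0.00 0.00 69.50
"""

busy = softirq_cpus(parse_mpstat_averages(SAMPLE))
for cpu, row in sorted(busy.items(), key=lambda item: int(item[0])):
    print(f"CPU {cpu}: %soft={row['%soft']:.2f} %idle={row['%idle']:.2f}")
```

Feeding it the complete mpstat output instead of SAMPLE lists every CPU that handles softirq together with its idle share.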