On 10.11.2018 at 22:01, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 21:02:10 +0100
Paweł Staszewski <pstaszew...@itcare.pl> wrote:
On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote:
I want you to experiment with:
ethtool --set-priv-flags DEVICE rx_striding_rq off
I just checked: the ConnectX-4 previously had this disabled:
ethtool --show-priv-flags enp175s0f0
Private flags for enp175s0f0:
rx_cqe_moder : on
tx_cqe_moder : off
rx_cqe_compress : off
rx_striding_rq : off
rx_no_csum_complete: off
The CX4 hardware does not have this feature (p.s. the CX4-Lx does).
So now we are on the ConnectX-5 and we have it enabled. The ConnectX-5 definitely
changed the CPU load: I now see at most 50-60% CPU, whereas with the ConnectX-4
it was sometimes near 100% with the same configuration.
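For completeness, a minimal sketch of enabling and verifying that flag (the
interface name enp175s0f0 is taken from the ethtool output above; adjust it
for your setup):

# enable striding RQ on the ConnectX-5 and confirm the new state
ethtool --set-priv-flags enp175s0f0 rx_striding_rq on
ethtool --show-priv-flags enp175s0f0 | grep rx_striding_rq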
I (strongly) believe the CPU load was related to the page-allocator
lock congestion that Aaron fixed.
Yes, I think both: most of the CPU problems were due to the page-allocator
issues. But after changing from the ConnectX-4 to the ConnectX-5 there is also
a CPU load difference, about 10% in total; still, most of the improvement,
around 40%, comes from Aaron's patch :) - really good job :)
Now I'm experimenting with the ring configuration for the ConnectX-5 NICs.
After reading this paper:
https://netdevconf.org/2.1/slides/apr6/network-performance/04-amir-RX_and_TX_bulking_v2.pdf
I changed the rings from RX: 8192 / TX: 4096 to RX: 8192 / TX: 256.
After this I gained about 5 Gbit/s of RX and TX traffic and lower CPU load.
Before the change it was 59/59 Gbit/s.
After the change it is 64/64 Gbit/s:
bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
input: /proc/net/dev type: rate
  iface                     Rx                   Tx                Total
==============================================================================
  enp175s0:          44.45 Gb/s           19.69 Gb/s           64.14 Gb/s
  enp216s0:          19.69 Gb/s           44.49 Gb/s           64.19 Gb/s
------------------------------------------------------------------------------
  total:             64.14 Gb/s           64.18 Gb/s          128.33 Gb/s
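For reference, a minimal sketch of how the ring sizes above can be inspected
and changed with ethtool (interface name assumed from the earlier output;
adjust as needed):

# show current RX/TX ring sizes, then set RX to 8192 and TX to 256
ethtool -g enp175s0f0
ethtool -G enp175s0f0 rx 8192 tx 256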