On 11.11.2018 at 09:56, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 22:53:53 +0100 Paweł Staszewski <pstaszew...@itcare.pl> 
wrote:

Now I'm messing with the ring configuration for the ConnectX-5 NICs.
And after reading that paper:
https://netdevconf.org/2.1/slides/apr6/network-performance/04-amir-RX_and_TX_bulking_v2.pdf

Do notice that some of the ideas in that slide deck were never
implemented. But they are still on my todo list ;-).

Notice how it shows that TX bulking is very important, but based on
your ethtool_stats.pl output, I can see that not much TX bulking is
happening in your case.  This is indicated via the xmit_more counters.

  Ethtool(enp175s0) stat:    2630 (     2,630) <= tx_xmit_more /sec
  Ethtool(enp175s0) stat: 4956995 ( 4,956,995) <= tx_packets /sec

And the per-queue levels are also available:

  Ethtool(enp175s0) stat: 184845 ( 184,845) <= tx7_packets /sec
  Ethtool(enp175s0) stat:     78 (      78) <= tx7_xmit_more /sec

This means that you are issuing too many doorbells to the NIC hardware
at TX time, which I worry could be what causes the NIC and PCIe hardware
not to operate at optimal speeds.
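
A quick way to eyeball that ratio straight from ethtool -S, without
ethtool_stats.pl, could look roughly like this (just a sketch; it assumes
the mlx5 counter names shown above and samples each counter for one second):

  # Rough per-second rates for the two counters; the closer tx_xmit_more/sec
  # gets to tx_packets/sec, the fewer doorbells are rung per packet.
  for c in tx_packets tx_xmit_more; do
      a=$(ethtool -S enp175s0 | awk -v k="$c:" '$1 == k {print $2}')
      sleep 1
      b=$(ethtool -S enp175s0 | awk -v k="$c:" '$1 == k {print $2}')
      echo "$c/sec: $((b - a))"
  done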

After tuning the coalescing/ring parameters a little with ethtool.
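
The ring part is plain ethtool -g/-G; the sizes below are placeholders for
illustration, not necessarily the values actually used here:

  ethtool -g enp175s0                  # show current and maximum ring sizes
  ethtool -G enp175s0 rx 4096 tx 4096  # example sizes only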

Reached today:

 bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
  input: /proc/net/dev type: rate
  |         iface                   Rx                   Tx                Total
==============================================================================
         enp175s0:          50.68 Gb/s           21.53 Gb/s           72.20 Gb/s
         enp216s0:          21.62 Gb/s           50.81 Gb/s           72.42 Gb/s
------------------------------------------------------------------------------
            total:          72.30 Gb/s           72.33 Gb/s          144.63 Gb/s



And still no packet loss (ICMP side-to-side test every 100 ms).

Below is the perf top output:


   PerfTop:  104692 irqs/sec  kernel:99.5%  exact:  0.0% [4000Hz cycles],  (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     9.06%  [kernel]       [k] mlx5e_skb_from_cqe_mpwrq_linear
     6.43%  [kernel]       [k] tasklet_action_common.isra.21
     5.68%  [kernel]       [k] fib_table_lookup
     4.89%  [kernel]       [k] irq_entries_start
     4.53%  [kernel]       [k] mlx5_eq_int
     4.10%  [kernel]       [k] build_skb
     3.39%  [kernel]       [k] mlx5e_poll_tx_cq
     3.38%  [kernel]       [k] mlx5e_sq_xmit
     2.73%  [kernel]       [k] mlx5e_poll_rx_cq
     2.18%  [kernel]       [k] __dev_queue_xmit
     2.13%  [kernel]       [k] vlan_do_receive
     2.12%  [kernel]       [k] mlx5e_handle_rx_cqe_mpwrq
     2.00%  [kernel]       [k] ip_finish_output2
     1.87%  [kernel]       [k] mlx5e_post_rx_mpwqes
     1.86%  [kernel]       [k] memcpy_erms
     1.85%  [kernel]       [k] ipt_do_table
     1.70%  [kernel]       [k] dev_gro_receive
     1.39%  [kernel]       [k] __netif_receive_skb_core
     1.31%  [kernel]       [k] inet_gro_receive
     1.21%  [kernel]       [k] ip_route_input_rcu
     1.21%  [kernel]       [k] tcp_gro_receive
     1.13%  [kernel]       [k] _raw_spin_lock
     1.08%  [kernel]       [k] __build_skb
     1.06%  [kernel]       [k] kmem_cache_free_bulk
     1.05%  [kernel]       [k] __softirqentry_text_start
     1.03%  [kernel]       [k] vlan_dev_hard_start_xmit
     0.98%  [kernel]       [k] pfifo_fast_dequeue
     0.95%  [kernel]       [k] mlx5e_xmit
     0.95%  [kernel]       [k] page_frag_free
     0.88%  [kernel]       [k] ip_forward
     0.81%  [kernel]       [k] dev_hard_start_xmit
     0.78%  [kernel]       [k] rcu_irq_exit
     0.77%  [kernel]       [k] netif_skb_features
     0.72%  [kernel]       [k] napi_complete_done
     0.72%  [kernel]       [k] kmem_cache_alloc
     0.68%  [kernel]       [k] validate_xmit_skb.isra.142
     0.66%  [kernel]       [k] ip_rcv_core.isra.20.constprop.25
     0.58%  [kernel]       [k] swiotlb_map_page
     0.57%  [kernel]       [k] __qdisc_run
     0.56%  [kernel]       [k] tasklet_action
     0.54%  [kernel]       [k] __get_xps_queue_idx
     0.54%  [kernel]       [k] inet_lookup_ifaddr_rcu
     0.50%  [kernel]       [k] tcp4_gro_receive
     0.49%  [kernel]       [k] skb_release_data
     0.47%  [kernel]       [k] eth_type_trans
     0.40%  [kernel]       [k] sch_direct_xmit
     0.40%  [kernel]       [k] net_rx_action
     0.39%  [kernel]       [k] __local_bh_enable_ip


And perf record/report

https://ufile.io/zguq0





So now I know what was causing the CPU load from processes like these:

 2913 root      20   0       0      0      0 I  10.3  0.0  6:58.29 kworker/u112:1-
    7 root      20   0       0      0      0 I   8.6  0.0  6:17.18 kworker/u112:0-
10289 root      20   0       0      0      0 I   6.6  0.0  6:33.90 kworker/u112:4-
 2939 root      20   0       0      0      0 R   3.6  0.0  7:37.68 kworker/u112:2-



After disabling adaptive TX coalescing, all these processes are gone.

Load average drops from 40 to 1.

Current coalescing settings:

ethtool -c enp175s0
Coalesce parameters for enp175s0:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
dmac: 32548

rx-usecs: 24
rx-frames: 256
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 0
tx-frames: 64
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
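
For reference, the above (including the adaptive-tx off change mentioned
earlier) should be reproducible with a single ethtool -C call along these
lines; this is a sketch, and whether the driver accepts every combination,
e.g. tx-usecs 0, can depend on the mlx5/ethtool version:

  ethtool -C enp175s0 adaptive-rx off adaptive-tx off \
          rx-usecs 24 rx-frames 256 tx-usecs 0 tx-frames 64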

And currently, at these traffic levels, there is no packet loss (CPU usage averages 60% across all 28 cores).









