On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote:
On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski <pstaszew...@itcare.pl>
wrote:
On 08.11.2018 at 20:12, Paweł Staszewski wrote:
CPU load is lower than for the ConnectX-4 - but it looks like the bandwidth
limit is the same :)
But also after reaching 60Gbit/60Gbit:
bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
input: /proc/net/dev type: rate
- iface Rx Tx Total
==========================================================================
enp175s0: 45.09 Gb/s 15.09 Gb/s 60.18 Gb/s
enp216s0: 15.14 Gb/s 45.19 Gb/s 60.33 Gb/s
--------------------------------------------------------------------------
total: 60.45 Gb/s 60.48 Gb/s 120.93 Gb/s
Today it reached 65/65 Gbit/s.
But starting from 60Gbit/s RX / 60Gbit/s TX the NICs start to drop packets
(with 50% CPU on all 28 cores) - so there is still CPU power to use :).
This is weird!
How do you see / measure these drops?
A simple ICMP test like ping -i 0.1
I am testing by pinging a management IP address on a VLAN that is attached
to one NIC (the side that is more stressed with RX).
And another ICMP test is forwarded through this router - to a host behind it.
Both measurements show the same loss ratio, from 0.1 to 0.5%, after reaching
~45Gbit/s on the RX side - depending on how hard the RX side is pushed,
drops vary between 0.1 and 0.5 - even 0.6% :)
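A minimal sketch of that kind of ICMP check (10.0.0.1 is just a placeholder
for the management / behind-router address; an interval below 0.2s needs root):

  # quiet mode; the summary line reports the "% packet loss" quoted above
  $ ping -i 0.1 -c 6000 -q 10.0.0.1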
So I checked other stats.
softnet_stat shows an average of 1k squeezed per second:
Is the output below the raw counters, not per sec?
It would be valuable to see the per-sec stats instead...
I use this tool:
https://github.com/netoptimizer/network-testing/blob/master/bin/softnet_stat.pl
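If the Perl tool is not handy, a rough per-second delta of the squeezed
column (3rd hex field of /proc/net/softnet_stat) can be sketched like this,
assuming GNU awk for strtonum:

  while true; do
    # field 3 is time_squeeze; counters in this file are hexadecimal
    awk '{ printf "cpu%d %d\n", NR-1, strtonum("0x"$3) }' \
        /proc/net/softnet_stat > /tmp/softnet.now
    [ -f /tmp/softnet.prev ] && paste /tmp/softnet.prev /tmp/softnet.now | \
        awk '{ d = $4 - $2; if (d > 0) print $1, d "/s squeezed" }'
    mv /tmp/softnet.now /tmp/softnet.prev
    sleep 1
  done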
cpu total dropped squeezed collision rps flow_limit
0 18554 0 1 0 0 0
1 16728 0 1 0 0 0
2 18033 0 1 0 0 0
3 17757 0 1 0 0 0
4 18861 0 0 0 0 0
5 0 0 1 0 0 0
6 2 0 1 0 0 0
7 0 0 1 0 0 0
8 0 0 0 0 0 0
9 0 0 1 0 0 0
10 0 0 0 0 0 0
11 0 0 1 0 0 0
12 50 0 1 0 0 0
13 257 0 0 0 0 0
14 3629115363 0 3353259 0 0 0
15 255167835 0 3138271 0 0 0
16 4240101961 0 3036130 0 0 0
17 599810018 0 3072169 0 0 0
18 432796524 0 3034191 0 0 0
19 41803906 0 3037405 0 0 0
20 900382666 0 3112294 0 0 0
21 620926085 0 3086009 0 0 0
22 41861198 0 3023142 0 0 0
23 4090425574 0 2990412 0 0 0
24 4264870218 0 3010272 0 0 0
25 141401811 0 3027153 0 0 0
26 104155188 0 3051251 0 0 0
27 4261258691 0 3039765 0 0 0
28 4 0 1 0 0 0
29 4 0 0 0 0 0
30 0 0 1 0 0 0
31 0 0 0 0 0 0
32 3 0 1 0 0 0
33 1 0 1 0 0 0
34 0 0 1 0 0 0
35 0 0 0 0 0 0
36 0 0 1 0 0 0
37 0 0 1 0 0 0
38 0 0 1 0 0 0
39 0 0 1 0 0 0
40 0 0 0 0 0 0
41 0 0 1 0 0 0
42 299758202 0 3139693 0 0 0
43 4254727979 0 3103577 0 0 0
44 1959555543 0 2554885 0 0 0
45 1675702723 0 2513481 0 0 0
46 1908435503 0 2519698 0 0 0
47 1877799710 0 2537768 0 0 0
48 2384274076 0 2584673 0 0 0
49 2598104878 0 2593616 0 0 0
50 1897566829 0 2530857 0 0 0
51 1712741629 0 2489089 0 0 0
52 1704033648 0 2495892 0 0 0
53 1636781820 0 2499783 0 0 0
54 1861997734 0 2541060 0 0 0
55 2113521616 0 2555673 0 0 0
So I raised the netdev backlog and budget to really high values:
524288 for netdev_budget and the same for the backlog.
Does it affect the squeezed counters?
A little - but not much.
After changing the budget from 65536 to 524k, the number of squeezed counters
for all CPUs went from 1.5k per second down to 0.9-1k per second - but
increasing it above 524k changes nothing - still 0.9 to 1k/s squeezed.
Notice, this (crazy) huge netdev_budget limit will also be limited
by /proc/sys/net/core/netdev_budget_usecs.
Yes, I changed that as well, to 1000 / 2000 / 3000 / 4000 - not much
difference in squeezed - I can't even see the difference.
This raised softirqs from about 600k/sec to 800k/sec for NET_TX/NET_RX.
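For reference, the knobs being tuned above are the usual net.core sysctls;
a sketch with the values mentioned in this thread (not a recommendation):

  $ sysctl -w net.core.netdev_budget=524288
  $ sysctl -w net.core.netdev_max_backlog=524288
  $ sysctl -w net.core.netdev_budget_usecs=4000   # also tried 1000/2000/3000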
Hmmm, this could indicate that not enough NAPI bulking is occurring.
I have a BPF tool that can give you some insight into NAPI bulking and
softirq idle/kthread starting, called 'napi_monitor'. Could you try to
run it, so we can try to understand this? You can find the tool here:
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/napi_monitor_user.c
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/napi_monitor_kern.c
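A quicker sanity check, assuming bpftrace is installed and the kernel's
napi:napi_poll tracepoint exposes the work/budget fields, is a histogram of
how many packets each NAPI poll handles (low values suggest poor bulking):

  $ bpftrace -e 'tracepoint:napi:napi_poll { @work = hist(args->work); }'
  # Ctrl-C prints the histogram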
Yes, I will try it.
But after these changes I have fewer packet drops.
Below is perf top from the max traffic reached:
PerfTop: 72230 irqs/sec kernel:99.4% exact: 0.0% [4000Hz
cycles], (all, 56 CPUs)
------------------------------------------------------------------------------------------
12.62% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
8.44% [kernel] [k] mlx5e_sq_xmit
6.69% [kernel] [k] build_skb
5.21% [kernel] [k] fib_table_lookup
3.54% [kernel] [k] memcpy_erms
3.20% [kernel] [k] mlx5e_poll_rx_cq
2.25% [kernel] [k] vlan_do_receive
2.20% [kernel] [k] mlx5e_post_rx_mpwqes
2.02% [kernel] [k] mlx5e_handle_rx_cqe_mpwrq
1.95% [kernel] [k] __dev_queue_xmit
1.83% [kernel] [k] dev_gro_receive
1.79% [kernel] [k] tcp_gro_receive
1.73% [kernel] [k] ip_finish_output2
1.63% [kernel] [k] mlx5e_poll_tx_cq
1.49% [kernel] [k] ipt_do_table
1.38% [kernel] [k] inet_gro_receive
1.31% [kernel] [k] __netif_receive_skb_core
1.30% [kernel] [k] _raw_spin_lock
1.28% [kernel] [k] mlx5_eq_int
1.24% [kernel] [k] irq_entries_start
1.19% [kernel] [k] __build_skb
1.15% [kernel] [k] swiotlb_map_page
1.02% [kernel] [k] vlan_dev_hard_start_xmit
0.94% [kernel] [k] pfifo_fast_dequeue
0.92% [kernel] [k] ip_route_input_rcu
0.86% [kernel] [k] kmem_cache_alloc
0.80% [kernel] [k] mlx5e_xmit
0.79% [kernel] [k] dev_hard_start_xmit
0.78% [kernel] [k] _raw_spin_lock_irqsave
0.74% [kernel] [k] ip_forward
0.72% [kernel] [k] tasklet_action_common.isra.21
0.68% [kernel] [k] pfifo_fast_enqueue
0.67% [kernel] [k] netif_skb_features
0.66% [kernel] [k] skb_segment
0.60% [kernel] [k] skb_gro_receive
0.56% [kernel] [k] validate_xmit_skb.isra.142
0.53% [kernel] [k] skb_release_data
0.51% [kernel] [k] mlx5e_page_release
0.51% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.51% [kernel] [k] __qdisc_run
0.50% [kernel] [k] tcp4_gro_receive
0.49% [kernel] [k] page_frag_free
0.46% [kernel] [k] kmem_cache_free_bulk
0.43% [kernel] [k] kmem_cache_free
0.42% [kernel] [k] try_to_wake_up
0.39% [kernel] [k] _raw_spin_lock_irq
0.39% [kernel] [k] find_busiest_group
0.37% [kernel] [k] __memcpy
Remember, those tests are now on two separate ConnectX-5 cards connected to
two separate PCIe x16 gen 3.0 slots.
That is strange... I still suspect some HW NIC issue. Can you provide
ethtool stats info via this tool:
https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
$ ethtool_stats.pl --dev enp175s0 --dev enp216s0
The tool removes zero-stats counters and reports per-sec stats. That makes
it easier to spot what is relevant for the given workload.
Yes, Mellanox just has too many counters that are always 0 for my case :)
Will try this also.
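In the meantime, a crude way to hide the always-zero counters (without the
per-sec rates the Perl tool gives) is to filter plain ethtool -S output:

  # keep only non-zero "name: value" lines
  $ ethtool -S enp175s0 | awk -F': ' 'NR > 1 && $2 + 0 != 0'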
Can you give the output from:
$ ethtool --show-priv-flags DEVICE
I want you to experiment with:
ethtool --show-priv-flags enp175s0
Private flags for enp175s0:
rx_cqe_moder : on
tx_cqe_moder : off
rx_cqe_compress : off
rx_striding_rq : on
rx_no_csum_complete: off
ethtool --set-priv-flags DEVICE rx_striding_rq off
OK, I will first check on a test server whether this will reset my
interface and not produce a kernel panic :)
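A sketch of that test-box check - toggling the flag, verifying it took
effect, and watching the kernel log (expect the interface to reset):

  $ ethtool --set-priv-flags enp175s0 rx_striding_rq off
  $ ethtool --show-priv-flags enp175s0 | grep rx_striding_rq
  $ dmesg | tail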
I think you have already played with 'rx_cqe_compress', right?
Yes - and compression increases the number of IRQs but does not do much for
bandwidth - same limit of 60-64Gbit/s total RX+TX on one 100G port.
And what is weird - that limit is overall symmetric - because if, for
example, the 100G port is receiving 42G of traffic and transmitting 20G of
traffic, and I flood the RX side with pktgen or other traffic (for example
ICMP) at 1/2/3/4/5G - then the receiving side increases by 1/2/3/4/5Gbit of
traffic but the transmitting side goes down by the same amount.
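For completeness, a minimal pktgen sketch of that kind of RX-side flood
(device, destination IP and MAC are placeholders; run on the sender, not on
the router under test):

  modprobe pktgen
  # bind the device to one pktgen kernel thread
  echo "rem_device_all"      > /proc/net/pktgen/kpktgend_0
  echo "add_device enp175s0" > /proc/net/pktgen/kpktgend_0
  # placeholder destination and small packets; count 0 = run until stopped
  echo "count 0"                   > /proc/net/pktgen/enp175s0
  echo "pkt_size 64"               > /proc/net/pktgen/enp175s0
  echo "dst 10.0.0.2"              > /proc/net/pktgen/enp175s0
  echo "dst_mac aa:bb:cc:dd:ee:ff" > /proc/net/pktgen/enp175s0
  echo "start" > /proc/net/pktgen/pgctrl   # results in /proc/net/pktgen/enp175s0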