On Sun, 2018-01-28 at 19:26 +0100, Paweł Staszewski wrote:
>
> On 27.01.2018 at 23:23, Paweł Staszewski wrote:
> > Hi
> >
> > Today I made some real life traffic tests with kernel 4.15.0-rc9,
> > but when traffic reaches 50Gbit/s and about 6Mpps, cpu load rises fast
> > from 48% to 100% on all cpu cores.
> >
> > Here is a graph showing how cpu load rises as pps increases:
> >
> > https://ibb.co/mhD5ob
> >
> > Here is the perf record from that time:
> >
> > https://pastebin.com/3zqG1rvE
> >
> > There are 8x 10G ixgbe 82599 interfaces teamed with teamd.
> >
> > No traffic queueing - only pfifo_fast on all interfaces.
> >
> > No NAT or iptables rules other than INPUT (about 30 rules)
> >
> > All NICs have the same ethtool settings:
> >
> > ethtool -k eth0
> > Features for eth0:
> > Cannot get device udp-fragmentation-offload settings: Operation not supported
> > rx-checksumming: on
> > tx-checksumming: on
> > tx-checksum-ipv4: off [fixed]
> > tx-checksum-ip-generic: on
> > tx-checksum-ipv6: off [fixed]
> > tx-checksum-fcoe-crc: off [fixed]
> > tx-checksum-sctp: on
> > scatter-gather: on
> > tx-scatter-gather: on
> > tx-scatter-gather-fraglist: off [fixed]
> > tcp-segmentation-offload: on
> > tx-tcp-segmentation: on
> > tx-tcp-ecn-segmentation: off [fixed]
> > tx-tcp-mangleid-segmentation: off
> > tx-tcp6-segmentation: on
> > udp-fragmentation-offload: off
> > generic-segmentation-offload: on
> > generic-receive-offload: on
> > large-receive-offload: off
> > rx-vlan-offload: on
> > tx-vlan-offload: on
> > ntuple-filters: on
> > receive-hashing: on
> > highdma: on [fixed]
> > rx-vlan-filter: on
> > vlan-challenged: off [fixed]
> > tx-lockless: off [fixed]
> > netns-local: off [fixed]
> > tx-gso-robust: off [fixed]
> > tx-fcoe-segmentation: off [fixed]
> > tx-gre-segmentation: on
> > tx-gre-csum-segmentation: on
> > tx-ipxip4-segmentation: on
> > tx-ipxip6-segmentation: on
> > tx-udp_tnl-segmentation: on
> > tx-udp_tnl-csum-segmentation: on
> > tx-gso-partial: on
> > tx-sctp-segmentation: off [fixed]
> > tx-esp-segmentation: off [fixed]
> > fcoe-mtu: off [fixed]
> > tx-nocache-copy: off
> > loopback: off [fixed]
> > rx-fcs: off [fixed]
> > rx-all: off
> > tx-vlan-stag-hw-insert: off [fixed]
> > rx-vlan-stag-hw-parse: off [fixed]
> > rx-vlan-stag-filter: off [fixed]
> > l2-fwd-offload: off
> > hw-tc-offload: off
> > esp-hw-offload: off [fixed]
> > esp-tx-csum-hw-offload: off [fixed]
> > rx-udp_tunnel-port-offload: on
> >
> > ethtool -g eth0
> > Ring parameters for eth0:
> > Pre-set maximums:
> > RX: 4096
> > RX Mini: 0
> > RX Jumbo: 0
> > TX: 4096
> > Current hardware settings:
> > RX: 4096
> > RX Mini: 0
> > RX Jumbo: 0
> > TX: 2048
> >
> > ethtool -c eth0
> > Coalesce parameters for eth0:
> > Adaptive RX: off TX: off
> > stats-block-usecs: 0
> > sample-interval: 0
> > pkt-rate-low: 0
> > pkt-rate-high: 0
> >
> > rx-usecs: 512
> > rx-frames: 0
> > rx-usecs-irq: 0
> > rx-frames-irq: 0
> >
> > tx-usecs: 0
> > tx-frames: 0
> > tx-usecs-irq: 0
> > tx-frames-irq: 0
> >
> > rx-usecs-low: 0
> > rx-frame-low: 0
> > tx-usecs-low: 0
> > tx-frame-low: 0
> >
> > rx-usecs-high: 0
> > rx-frame-high: 0
> > tx-usecs-high: 0
> > tx-frame-high: 0
> >
>
> Perf top for kernel 4.15.0-rc9 below (all 40 cores 100% cpu load with
> 6.3Mpps)
>
>  20.96%  [kernel]                [k] queued_spin_lock_slowpath
>   5.51%  [kernel]                [k] ixgbe_poll
>   5.49%  [kernel]                [k] ixgbe_xmit_frame_ring
>   4.39%  [kernel]                [k] do_raw_spin_lock
>   4.29%  [kernel]                [k] sch_direct_xmit
>   4.11%  [kernel]                [k] fib_table_lookup
>   3.11%  [team_mode_roundrobin]  [k] rr_transmit
>   2.71%  [kernel]                [k] __dev_queue_xmit
>   2.62%  [kernel]                [k] __ptr_ring_peek
>   2.39%  [kernel]                [k] skb_release_data
>   2.18%  [kernel]                [k] dev_gro_receive
>   1.75%  [kernel]                [k] __qdisc_run
>   1.67%  [kernel]                [k] pfifo_fast_enqueue
>   1.57%  [kernel]                [k] netdev_pick_tx
>   1.56%  [kernel]                [k] page_frag_free
>   1.48%  [kernel]                [k] ip_finish_output2
>   1.38%  [kernel]                [k] __slab_free
>   1.36%  [kernel]                [k] skb_unref
>   1.34%  [kernel]                [k] ixgbe_maybe_stop_tx
>   1.30%  [kernel]                [k] vlan_do_receive
>   1.28%  [kernel]                [k] pfifo_fast_dequeue
>   1.23%  [kernel]                [k] virt_to_head_page
>
> Same configuration, kernel 4.15.0-rc3 (50% cpu load on all 40 cores with
> 6.3Mpps)
>
>   7.81%  [kernel]                [k] ixgbe_xmit_frame_ring
>   7.61%  [kernel]                [k] ixgbe_poll
>   7.09%  [kernel]                [k] do_raw_spin_lock
>   5.63%  [kernel]                [k] fib_table_lookup
>   5.19%  [kernel]                [k] __dev_queue_xmit
>   4.38%  [team_mode_roundrobin]  [k] rr_transmit
>   3.10%  [kernel]                [k] netdev_pick_tx
>   2.79%  [kernel]                [k] skb_release_data
>   2.34%  [kernel]                [k] dev_gro_receive
>   1.99%  [kernel]                [k] page_frag_free
>   1.96%  [kernel]                [k] skb_unref
>   1.92%  [kernel]                [k] virt_to_head_page
>   1.90%  [kernel]                [k] ixgbe_maybe_stop_tx
>   1.82%  [kernel]                [k] vlan_do_receive
>   1.74%  [kernel]                [k] ip_finish_output2
>   1.73%  [kernel]                [k] __build_skb
>   1.68%  [kernel]                [k] __slab_free
>   1.67%  [kernel]                [k] __netif_receive_skb_core
>   1.60%  [kernel]                [k] inet_gro_receive
>   1.49%  [kernel]                [k] netif_skb_features
>   1.35%  [kernel]                [k] ip_rcv
>   1.33%  [kernel]                [k] ipt_do_table
>   1.30%  [kernel]                [k] compound_head
>   1.26%  [kernel]                [k] dev_hard_start_xmit
>   1.18%  [kernel]                [k] put_page
>   1.13%  [kernel]                [k] tcp_gro_receive
>   1.13%  [kernel]                [k] ip_forward
>   0.99%  [kernel]                [k] validate_xmit_skb
>   0.97%  [kernel]                [k] ip_route_input_rcu
>   0.88%  [kernel]                [k] inet_lookup_ifaddr_rcu
>   0.81%  [kernel]                [k] pfifo_fast_dequeue
>   0.77%  [kernel]                [k] vlan_dev_hard_start_xmit
>   0.64%  [kernel]                [k] ___slab_alloc

Please report:

1) "ethtool -l" information for your ethernet adapters.
2) IRQ configuration (grep eth /proc/interrupts)
3) tc -s qdisc show
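
The three requests above could be gathered in one pass with something like the sketch below. It is only an illustration: the helper names run and collect_report and the DRY_RUN switch are made up here, and the interface names you pass in are whatever your teamed ports are actually called.

```shell
#!/bin/sh
# Sketch only: gathers the three reports requested above.
# run, collect_report and DRY_RUN are hypothetical names, not part of any tool.

# Print a header naming the command, then run it.
# With DRY_RUN set to a non-empty value, the command is printed but not executed.
run() {
    echo "### $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# Usage: collect_report IFACE [IFACE ...]
collect_report() {
    for dev in "$@"; do
        run ethtool -l "$dev"       # 1) channel (queue) counts per adapter
    done
    run grep eth /proc/interrupts   # 2) IRQ configuration
    run tc -s qdisc show            # 3) qdisc statistics
}
```

For example, after sourcing the file, "collect_report eth0 eth1 > report.txt" would write everything to a single file that can be pasted into a reply (eth1 is a placeholder; only eth0 appears in the original report).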