On Sun, 13 Aug 2017 18:58:58 +0200 Paweł Staszewski <pstaszew...@itcare.pl> wrote:
> To show some difference below comparison vlan/no-vlan traffic
>
> 10Mpps forwarded traffic with no-vlan vs 6.9Mpps with vlan

I'm trying to reproduce this in my testlab (with ixgbe). I do see a
performance reduction of about 10-19% when I forward out a VLAN
interface. This is larger than I expected, but still lower than the
30-40% slowdown you reported.

[...]

> >>> perf top:
> >>>
> >>>  PerfTop:   77835 irqs/sec  kernel:99.7%
> >>> ---------------------------------------------
> >>>
> >>>     16.32%  [kernel]  [k] skb_dst_force
> >>>     16.30%  [kernel]  [k] dst_release
> >>>     15.11%  [kernel]  [k] rt_cache_valid
> >>>     12.62%  [kernel]  [k] ipv4_mtu
> >> It seems a little strange that these 4 functions are on the top

I don't see these in my test.

> >>
> >>>      5.60%  [kernel]  [k] do_raw_spin_lock
> >> Who is calling/taking this lock? (Use perf call-graph recording).
> > can be hard to paste it here:)
> > attached file

The attachment was very big. Please don't attach such large files on
mailing lists. Next time please share them via e.g. pastebin. The
output was a capture from your terminal, which made it more difficult
to read. Hint: You can use perf report --stdio and redirect the output
to a file instead.

The output (extracted below) didn't show who called
'do_raw_spin_lock', BUT it showed another interesting thing. The
kernel code in __dev_queue_xmit() might create a route dst-cache
problem for itself(?), as it will first call skb_dst_force() and then
skb_dst_drop() when the packet is transmitted on a VLAN.

 static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 {
 [...]
	/* If device/qdisc don't need skb->dst, release it right now while
	 * its hot in this cpu cache.
	 */
	if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
		skb_dst_drop(skb);
	else
		skb_dst_force(skb);

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Extracted part of attached perf output:

 --5.37%--ip_rcv_finish
   |
   |--4.02%--ip_forward
   |   --3.92%--ip_forward_finish
   |     --3.91%--ip_output
   |       --3.90%--ip_finish_output
   |         --3.88%--ip_finish_output2
   |           --2.77%--neigh_connected_output
   |             --2.74%--dev_queue_xmit
   |               --2.73%--__dev_queue_xmit
   |                 |--1.66%--dev_hard_start_xmit
   |                 |   --1.64%--vlan_dev_hard_start_xmit
   |                 |     --1.63%--dev_queue_xmit
   |                 |       --1.62%--__dev_queue_xmit
   |                 |         |--0.99%--skb_dst_drop.isra.77
   |                 |         |   --0.99%--dst_release
   |                 |          --0.55%--sch_direct_xmit
   |                  --0.99%--skb_dst_force
    --1.29%--ip_route_input_noref
      --1.29%--ip_route_input_rcu
        --1.05%--rt_cache_valid
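
To make the suspected force-then-drop sequence concrete, here is a minimal user-space sketch in C. It is a toy model and not kernel code: every toy_* name is hypothetical, the refcount is a plain int rather than an atomic, and it assumes (as the call graph above suggests, but does not prove) that the VLAN device has IFF_XMIT_DST_RELEASE clear while the underlying physical device has it set. It only counts how many refcount writes hit the shared dst entry per forwarded packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a route cache entry (hypothetical, not struct dst_entry). */
struct toy_dst {
	int refcnt;        /* plain int here; an atomic counter in the kernel */
	unsigned long ops; /* number of writes to refcnt */
};

/* Toy stand-in for a packet carrying a dst pointer. */
struct toy_skb {
	struct toy_dst *dst;
	bool noref;        /* true: pointer is borrowed, no reference is held */
};

/* Models the idea of skb_dst_force(): turn a borrowed dst into a held one. */
static void toy_dst_force(struct toy_skb *skb)
{
	if (skb->dst && skb->noref) {
		skb->dst->refcnt++;
		skb->dst->ops++;
		skb->noref = false;
	}
}

/* Models the idea of skb_dst_drop(): release the reference if one is held. */
static void toy_dst_drop(struct toy_skb *skb)
{
	if (!skb->dst)
		return;
	if (!skb->noref) {
		skb->dst->refcnt--;
		skb->dst->ops++;
	}
	skb->dst = NULL;
	skb->noref = false;
}

/* Models only the branch quoted from __dev_queue_xmit(). */
static void toy_queue_xmit(struct toy_skb *skb, bool xmit_dst_release)
{
	if (xmit_dst_release)
		toy_dst_drop(skb);
	else
		toy_dst_force(skb);
}

/* Forward 'packets' packets and return the number of refcount writes. */
unsigned long toy_forward(unsigned long packets, bool via_vlan)
{
	struct toy_dst dst = { .refcnt = 1, .ops = 0 };
	unsigned long i;

	for (i = 0; i < packets; i++) {
		/* Forwarded packets start with a borrowed (noref) dst. */
		struct toy_skb skb = { .dst = &dst, .noref = true };

		if (via_vlan)
			toy_queue_xmit(&skb, false); /* assumed: VLAN dev, flag clear */
		toy_queue_xmit(&skb, true);          /* assumed: physical dev, flag set */
	}

	assert(dst.refcnt == 1); /* every held reference was released again */
	return dst.ops;
}
```

In this model toy_forward(n, false) performs no refcount writes at all, while toy_forward(n, true) performs two per packet (one increment, one decrement) on the same shared entry. In the real kernel those would be atomic operations on a cache line shared between CPUs, which would be one possible explanation for skb_dst_force and dst_release showing up so high in the profile.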