On Wed, 31 Jan 2018 14:53:32 +0100 Björn Töpel <bjorn.to...@gmail.com> wrote:
> Below are the results in Mpps of the I40E NIC benchmark runs for 64
> byte packets, generated by commercial packet generator HW that is
> generating packets at full 40 Gbit/s line rate.
>
> XDP baseline numbers without this RFC:
> xdp_rxq_info --action XDP_DROP 31.3 Mpps
> xdp_rxq_info --action XDP_TX   16.7 Mpps
>
> XDP performance with this RFC i.e. with the buffer allocator:
> XDP_DROP 21.0 Mpps
> XDP_TX   11.9 Mpps
>
> AF_PACKET V4 performance from previous RFC on 4.14-rc7:
> Benchmark   V2     V3     V4     V4+ZC
> rxdrop      0.67   0.73   0.74   33.7
> txpush      0.98   0.98   0.91   19.6
> l2fwd       0.66   0.71   0.67   15.5

My numbers from before:

             V4+ZC
 rxdrop      35.2 Mpps
 txpush      20.7 Mpps
 l2fwd       16.9 Mpps

> AF_XDP performance:
> Benchmark   XDP_SKB   XDP_DRV   XDP_DRV_ZC   (all in Mpps)
> rxdrop      3.3       11.6      16.9
> txpush      2.2       NA*       21.8
> l2fwd       1.7       NA*       10.4

The numbers on my system are better than on your system, and compared
to my own before-results, the txpush is almost the same, and the gap
for l2fwd is smaller for me.  The surprise is the drop in the 'rxdrop'
performance.

             XDP_DRV_ZC
 rxdrop      22.0 Mpps
 txpush      20.9 Mpps
 l2fwd       14.2 Mpps

BUT it also seems you have generally slowed down the XDP_DROP results
for i40e:

Before:
 sudo ./xdp_bench01_mem_access_cost --dev i40e1
 XDP_DROP    35878204    35,878,204          no_touch

After this patchset:
 $ sudo ./xdp_bench01_mem_access_cost --dev i40e1
 XDP_action  pps         pps-human-readable  mem
 XDP_DROP    28992009    28,992,009          no_touch

And if I read data:
 sudo ./xdp_bench01_mem_access_cost --dev i40e1 --read
 XDP_action  pps         pps-human-readable  mem
 XDP_DROP    25107793    25,107,793          read

BTW, see you soon in Brussels (FOSDEM18) ...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

$ sudo ./xdpsock --rxdrop --interface=i40e1 --queue=11
[...]
 i40e1:11 rxdrop             pps            pkts          60.01
 rx               22,040,099     1,322,572,352
 tx                        0                 0

$ sudo ./xdpsock --txonly --interface=i40e1 --queue=11
[...]
 i40e1:11 txonly             pps            pkts         239.03
 rx                        0                 0
 tx               20,937,885     5,004,790,500

$ sudo ./xdpsock --l2fwd --interface=i40e1 --queue=11
[...]
 i40e1:11 l2fwd              pps            pkts         152.02
 rx               14,244,719     2,165,460,044
 tx               14,244,718     2,165,459,915

My before results:

$ sudo ./bench_all.sh
You might want to change the parameters in ./bench_all.sh
i40e1 cpu5 duration 30s zc 16
i40e1 v2 rxdrop duration 29.27s rx: 62959986pkts @ 2150794.94pps
i40e1 v3 rxdrop duration 29.18s rx: 68470248pkts @ 2346658.86pps
i40e1 v4 rxdrop duration 29.45s rx: 68900864pkts @ 2339633.99pps
i40e1 v4 rxdrop zc duration 29.36s rx: 1033722048pkts @ 35206198.62pps
i40e1 v2 txonly duration 29.16s tx: 63272640pkts @ 2169632.53pps.
i40e1 v3 txonly duration 29.14s tx: 62531968pkts @ 2145714.21pps.
i40e1 v4 txonly duration 29.48s tx: 40587316pkts @ 1376761.87pps.
i40e1 v4 txonly zc duration 29.36s tx: 608794761pkts @ 20738953.62pps.
i40e1 v2 l2fwd duration 29.19s rx: 57532736pkts @ 1970885.56pps tx: 57532672pkts @ 1970883.37pps.
i40e1 v3 l2fwd duration 29.16s rx: 57675961pkts @ 1978149.64pps tx: 57675897pkts @ 1978147.44pps.
i40e1 v4 l2fwd duration 29.51s rx: 29732pkts @ 1007.58pps tx: 28708pkts @ 972.88pps.
i40e1 v4 l2fwd zc duration 29.32s rx: 497528256pkts @ 16969091.01pps tx: 497527296pkts @ 16969058.27pps.
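
For reference, the i40e XDP_DROP slowdown quoted above is easier to reason
about as a per-packet cost.  This is just a quick sketch of the arithmetic
(pps numbers are the no_touch runs reported earlier; ns/pkt = 10^9 / pps):

```python
# Per-packet cost of the XDP_DROP (no_touch) regression on i40e.
before_pps = 35_878_204   # XDP_DROP before this patchset
after_pps = 28_992_009    # XDP_DROP after this patchset

ns_before = 1e9 / before_pps
ns_after = 1e9 / after_pps

print(f"before:   {ns_before:.1f} ns/pkt")              # ~27.9 ns
print(f"after:    {ns_after:.1f} ns/pkt")               # ~34.5 ns
print(f"overhead: {ns_after - ns_before:.1f} ns/pkt")   # ~6.6 ns
```

At 64-byte line rate every nanosecond counts, so a ~6.6 ns/pkt increase is
a significant budget hit for the drop path.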