On Wed, Sep 07, 2016 at 03:42:25PM +0300, Saeed Mahameed wrote:
> For non-striding RQ configuration before this patch we had a ring
> with pre-allocated SKBs and mapped the SKB->data buffers for
> device.
>
> For robustness and better RX data buffers management, we allocate a
> page per packet and build_skb around it.
>
> This patch (which is a prerequisite for XDP) will actually reduce
> performance for normal stack usage, because we are now hitting a
> bottleneck in the page allocator. A later patch of page reuse
> mechanism will be needed to restore or even improve performance in
> comparison to the old RX scheme.
>
> Packet rate performance testing was done with pktgen 64B packets on
> xmit side and TC drop action on RX side.
>
> CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
>
> Comparison is done between:
>  1. Baseline, before 'net/mlx5e: Build RX SKB on demand'
>  2. Build SKB with RX page cache (this patch)
>
> Streams    Baseline    Build SKB+page-cache    Improvement
> -----------------------------------------------------------
> 1          4.33Mpps    5.51Mpps                27%
> 2          7.35Mpps    11.5Mpps                52%
> 4          14.0Mpps    16.3Mpps                16%
> 8          22.2Mpps    29.6Mpps                20%
> 16         24.8Mpps    34.0Mpps                17%
Impressive gains for build_skb. I think it should help IP forwarding too, and likely tcp_rr; tcp_stream shouldn't see any difference. If you can benchmark those along with pktgen+tc_drop, it would help us better understand the impact of the changes.
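For anyone wanting to reproduce the RX side of this kind of test, the "TC drop action on RX" setup can be sketched roughly as below. This is only an illustration, not the exact commands used for the numbers above; `eth0` is a placeholder for the actual mlx5 interface name, and it needs root privileges on a machine with the NIC present.

```shell
# Attach an ingress qdisc on the receiving interface
# (eth0 is a placeholder for the real device name).
tc qdisc add dev eth0 ingress

# Drop every IP packet at ingress with a catch-all u32 match,
# so we measure pure RX-path cost without further stack processing.
tc filter add dev eth0 parent ffff: protocol ip u32 \
    match u32 0 0 action drop

# Watch the drop counters while the peer blasts 64B pktgen traffic.
tc -s filter show dev eth0 parent ffff:
```

The TX side would then run pktgen with 64-byte packets aimed at this interface; the Mpps figure comes from the rate at which the drop counter increases.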