From: Edward Cree <ec...@solarflare.com>
Date: Tue, 6 Aug 2019 14:52:06 +0100

> This series listifies part of GRO processing, in a manner which allows those
>  packets which are not GROed (i.e. for which dev_gro_receive returns
>  GRO_NORMAL) to be passed on to the listified regular receive path.
> dev_gro_receive() itself is not listified, nor the per-protocol GRO
>  callback, since GRO's need to hold packets on lists under napi->gro_hash
>  makes keeping the packets on other lists awkward, and since the GRO control
>  block state of held skbs can refer only to one 'new' skb at a time.
> Instead, when napi_frags_finish() handles a GRO_NORMAL result, stash the skb
>  onto a list in the napi struct, which is received at the end of the napi
>  poll or when its length exceeds the (new) sysctl net.core.gro_normal_batch.
> 
> Performance figures with this series were collected on a back-to-back pair of
>  Solarflare sfn8522-r2 NICs with 120-second NetPerf tests.  In the stats,
>  sample size n for old and new code is 6 runs each; p is from a Welch t-test.
> Tests were run both with GRO enabled and disabled, the latter simulating
>  uncoalesceable packets (e.g. due to IP or TCP options).  The receive side
>  (which was the device under test) had the NetPerf process pinned to one CPU,
>  and the device interrupts pinned to a second CPU.  CPU utilisation figures
>  (used in cases of line-rate performance) are summed across all CPUs.
> net.core.gro_normal_batch was left at its default value of 8.
 ...
> The above results are fairly mixed, and in most cases not statistically
>  significant.  But I think we can roughly conclude that the series
>  marginally improves non-GROable throughput, without hurting latency
>  (except in the large-payload busy-polling case, which in any case yields
>  horrid performance even on net-next (almost triple the latency without
>  busy-poll)).  Also, drivers which, unlike sfc, pass UDP traffic to GRO
>  would expect to see a benefit from gaining access to batching.
> 
> Changed in v3:
>  * gro_normal_batch sysctl now uses SYSCTL_ONE instead of &one
>  * removed RFC tags (no comments after a week means no-one objects, right?)
> 
> Changed in v2:
>  * During busy poll, call gro_normal_list() to receive batched packets
>    after each cycle of the napi busy loop.  See comments in Patch #3 for
>    complications of doing the same in busy_poll_stop().
> 
> [1]: Cohen 1959, doi: 10.1080/00401706.1959.10489859

Series applied, thanks Edward.
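
For readers skimming the archive, the batching described in the cover letter (stash each GRO_NORMAL skb on a per-napi list, then deliver the list at the end of the poll or once it reaches net.core.gro_normal_batch) can be sketched in user space roughly as follows. This is only an illustration, not the code from the patches: every type and function name here (struct pkt, struct napi_sketch, sketch_*) is hypothetical, and the use of ">=" for the threshold check is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names are kernel API. */
struct pkt {
	struct pkt *next;
	int id;
};

struct napi_sketch {
	struct pkt *rx_head;    /* pending non-GROed packets, arrival order */
	struct pkt *rx_tail;
	unsigned int rx_count;  /* current length of the pending list */
	unsigned int batch;     /* plays the role of gro_normal_batch */
	unsigned int flushes;   /* number of list deliveries so far */
	unsigned int delivered; /* total packets handed on so far */
};

static void sketch_init(struct napi_sketch *n, unsigned int batch)
{
	assert(batch >= 1);
	n->rx_head = NULL;
	n->rx_tail = NULL;
	n->rx_count = 0;
	n->batch = batch;
	n->flushes = 0;
	n->delivered = 0;
}

/* Hand the whole pending list to an imaginary listified receive path. */
static void sketch_flush(struct napi_sketch *n)
{
	if (!n->rx_count)
		return;
	n->delivered += n->rx_count;
	n->flushes++;
	n->rx_head = NULL;
	n->rx_tail = NULL;
	n->rx_count = 0;
}

/* Stash one packet; flush once the batch limit is reached (assumed >=). */
static void sketch_normal_one(struct napi_sketch *n, struct pkt *p)
{
	p->next = NULL;
	if (n->rx_tail)
		n->rx_tail->next = p;
	else
		n->rx_head = p;
	n->rx_tail = p;
	if (++n->rx_count >= n->batch)
		sketch_flush(n);
}

/* End of a poll cycle: deliver whatever is still pending. */
static void sketch_poll_end(struct napi_sketch *n)
{
	sketch_flush(n);
}
```

With a batch limit of 8, feeding 20 packets through sketch_normal_one() gives two full-batch deliveries during the poll, and sketch_poll_end() then delivers the remaining 4.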
