This series listifies part of GRO processing, in a manner which allows those
packets which are not GROed (i.e. for which dev_gro_receive() returns
GRO_NORMAL) to be passed on to the listified regular receive path.
dev_gro_receive() itself is not listified, nor is the per-protocol GRO
callback, since GRO's need to hold packets on lists under napi->gro_hash
makes keeping the packets on other lists awkward, and since the GRO control
block state of held skbs can refer only to one 'new' skb at a time.
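
The batching described above can be sketched in userspace C. This is only a toy
model, not code from the series: every toy_* name is hypothetical, the GRO
decision is reduced to a single flag, and the "listified receive path" merely
counts the packets it is handed. It shows the shape of the idea: GRO still
runs once per packet, while the GRO_NORMAL packets are collected on a list and
handed on in one call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum toy_gro_result { TOY_GRO_MERGED, TOY_GRO_NORMAL };

struct toy_skb {
	int id;
	int coalescable;	/* 1 if the toy GRO step would absorb this packet */
	struct toy_skb *next;	/* link used only for the GRO_NORMAL batch */
};

/* Per-packet GRO step: stays unlistified, one packet per call. */
static enum toy_gro_result toy_dev_gro_receive(const struct toy_skb *skb)
{
	return skb->coalescable ? TOY_GRO_MERGED : TOY_GRO_NORMAL;
}

/* Toy listified receive path: takes a whole batch, returns its length. */
static int toy_receive_list(const struct toy_skb *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/*
 * Toy list entry point: run GRO on each packet in turn, append the ones
 * that come back GRO_NORMAL to a local list, then pass that list on in a
 * single call. Returns the number of packets sent down the regular path
 * and stores the number absorbed by GRO in *n_gro.
 */
static int toy_gro_receive_list(struct toy_skb *pkts, size_t count, int *n_gro)
{
	struct toy_skb *head = NULL;
	struct toy_skb **tail = &head;
	size_t i;

	assert(n_gro != NULL);
	*n_gro = 0;

	for (i = 0; i < count; i++) {
		pkts[i].next = NULL;
		if (toy_dev_gro_receive(&pkts[i]) == TOY_GRO_NORMAL) {
			*tail = &pkts[i];	/* append, preserving arrival order */
			tail = &pkts[i].next;
		} else {
			(*n_gro)++;
		}
	}
	return toy_receive_list(head);
}
```

Calling toy_gro_receive_list() on an array of four packets of which two are
marked coalescable passes the other two, in arrival order, to
toy_receive_list() in one call.
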

Performance figures with this series were collected on a back-to-back pair of
Solarflare sfn8522-r2 NICs with 120-second NetPerf tests. In the stats,
sample size n for old and new code is 6 runs each; p is from a Welch t-test.
Tests were run both with GRO enabled and disabled, the latter simulating
uncoalesceable packets (e.g. due to IP or TCP options). Payload_size in all
tests was 8000 bytes. BW tests use 4 streams, RR tests use 100.

TCP Stream, GRO on:
net-next: 9.415 Gb/s (line rate); 190% total rxcpu
after #4: 9.415 Gb/s; 192% total rxcpu
p_bw = 0.155; p_cpu = 0.382

TCP Stream, GRO off:
net-next: 5.625 Gb/s
after #4: 6.551 Gb/s
16.5% faster; p < 0.001

TCP RR, GRO on:
net-next: 837.6 us
after #4: 840.0 us
0.3% slower; p = 0.229

TCP RR, GRO off:
net-next: 867.6 us
after #4: 860.1 us
0.9% faster; p = 0.064

UDP Stream (GRO off):
net-next: 7.808 Gb/s
after #4: 7.848 Gb/s
0.5% slower; p = 0.144

Conclusion:
* TCP b/w is 16.5% faster for traffic which cannot be coalesced by GRO.
* TCP latency might be slightly improved in the same case, but it's not
  quite statistically significant.
* Both see no statistically significant change in performance with GRO
  active.
* UDP throughput might be slightly slowed (probably by patch #3) but it's
  not statistically significant. Note that drivers which (unlike sfc) pass
  UDP traffic to GRO will probably see gains here as this gives them access
  to bundling.

Change history:
v3: Rebased on latest net-next. Re-ran performance tests and added TCP_RR
    tests at suggestion of Eric Dumazet. Expanded changelog of patch #3.
v2: Rebased on latest net-next. Removed RFC tags. Otherwise unchanged
    owing to lack of comments on v1.

Edward Cree (4):
  net: introduce list entry point for GRO
  sfc: use batched receive for GRO
  net: make listified RX functions return number of good packets
  net/core: handle GRO_NORMAL skbs as a list in napi_gro_receive_list

 drivers/net/ethernet/sfc/efx.c        |  11 +++-
 drivers/net/ethernet/sfc/net_driver.h |   1 +
 drivers/net/ethernet/sfc/rx.c         |  16 +++++-
 include/linux/netdevice.h             |   6 +-
 include/net/ip.h                      |   4 +-
 include/net/ipv6.h                    |   4 +-
 net/core/dev.c                        | 104 ++++++++++++++++++++++++++--------
 net/ipv4/ip_input.c                   |  39 ++++++++-----
 net/ipv6/ip6_input.c                  |  37 +++++++-----
 9 files changed, 157 insertions(+), 65 deletions(-)