On Fri, 2019-01-25 at 08:58 +0100, Steffen Klassert wrote:
> > Finally this will cause GRO/GSO for local UDP packets delivery to non
> > GSO_SEGMENT sockets. That could be possibly a win or a regression: we
> > save on netfilter/IP stack traversal, but we add additional work, some
> > performances figures would probably help.
>
> I did some tests for the local receive path with netperf and iperf, but
> in this case the sender that generates the packets is the bottleneck.
> So the benchmarks are not that meaningful for the receive path.

I think we can use GSO on the sender if we add some additional code on
the rx side - for testing purposes only - limiting the GRO aggregation
to a user-controlled (via sysfs) value.

Beyond that, other options would be using multiple sender threads and a
single rx queue and/or asymmetric CPUs.

> Do you have some performance tests for UDP GRO receive?

I have a bunch of ansible(!!!) scripts I can share, if you dare. They
have a lot of hard-coded settings, so I'm not sure how much can be
reused outside my testbed.

I also hope/wish/think I can allocate some time for benchmarking this
on my own in the next week[s], so I'll try to post some results for the
next iteration.

Cheers,

Paolo