On Tue, Oct 23, 2018 at 02:22:12PM +0200, Paolo Abeni wrote:
> On Tue, 2018-10-23 at 14:10 +0200, Steffen Klassert wrote:
> >
> > Some quick benchmark numbers with UDP packet forwarding
> > (1460 byte packets) through two gateways:
> >
> > net-next: 16.4 Gbps
> >
> > net-next + UDP GRO: 20.3 Gbps
>
> uhmmm... what do you think about this speed-up ?

skb_segment() burns a lot of cycles. If I do the same test with TCP
and disable HW TSO, throughput also drops to similar values.

With software segmentation, the skb chain approach is likely faster
because the packets are not mangled: no need to allocate skbs, no new
checksum calculations, less memcpy, etc.

If we have an early route lookup in GRO, we could make a good guess
at the offload capabilities of the outgoing device. So if software
segmentation is likely, use the skb chaining method. If HW
segmentation is likely, merge the IP packets.

The chaining method might also be faster on sockets that do not have
UDP GRO enabled.

I'll try to implement the skb chaining method on top of this to see
what we get from that.
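
A minimal sketch of the decision described above, for illustration only.
It assumes the early route lookup has already produced an outgoing
device, and it reduces that device's offload capabilities to a single
boolean. All names here (struct out_dev, hw_udp_segmentation,
pick_gro_strategy, enum gro_strategy) are hypothetical and are not
existing kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: the two ways GRO could aggregate UDP packets. */
enum gro_strategy {
	GRO_MERGE,	/* merge IP packets into one large skb */
	GRO_CHAIN,	/* keep the packets unmodified and chain the skbs */
};

/*
 * Hypothetical stand-in for the outgoing device found by an early
 * route lookup. A real device exposes a feature bitmask; this sketch
 * collapses it to one flag.
 */
struct out_dev {
	bool hw_udp_segmentation;	/* device can segment UDP in hardware */
};

/*
 * Guess the cheaper aggregation method for a flow leaving via @dev.
 * If the device is likely to segment in hardware, merging is fine.
 * Otherwise the merged packet would have to go through software
 * segmentation again, so chaining is preferred.
 */
static enum gro_strategy pick_gro_strategy(const struct out_dev *dev)
{
	assert(dev != NULL);	/* caller supplies the route lookup result */

	return dev->hw_udp_segmentation ? GRO_MERGE : GRO_CHAIN;
}
```

What to do when the route lookup gives no usable guess is left open
here; the sketch simply requires a device to be passed in.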