On Tue, 2018-04-17 at 16:00 -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <will...@google.com>
>
> Segmentation offload reduces cycles/byte for large packets by
> amortizing the cost of protocol stack traversal.
>
> This patchset implements GSO for UDP. A process can concatenate and
> submit multiple datagrams to the same destination in one send call
> by setting socket option SOL_UDP/UDP_SEGMENT with the segment size,
> or passing an analogous cmsg at send time.
>
> The stack will send the entire large (up to network layer max size)
> datagram through the protocol layer. At the GSO layer, it is broken
> up in individual segments. All receive the same network layer header
> and UDP src and dst port. All but the last segment have the same UDP
> header, but the last may differ in length and checksum.
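
To make the segmentation described above concrete, here is a minimal sketch of how one large send would be split, plus how the socket option might be set. The helper names (gso_segment_count, gso_last_segment_len, enable_udp_gso) are hypothetical and only for illustration. The fallback value for UDP_SEGMENT and the int-sized option argument are assumptions taken from later upstream kernel headers, not from this patchset.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>

/* Assumption: fallback values as found in later upstream headers. */
#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103
#endif
#ifndef SOL_UDP
#define SOL_UDP 17
#endif

/* Number of datagrams produced when payload_len bytes are split
 * into segments of gso_size bytes (the last one may be shorter). */
size_t gso_segment_count(size_t payload_len, size_t gso_size)
{
    if (gso_size == 0)
        return 0;
    return (payload_len + gso_size - 1) / gso_size;
}

/* Payload length of the final segment; every other segment carries
 * exactly gso_size bytes, so only this one differs in UDP length. */
size_t gso_last_segment_len(size_t payload_len, size_t gso_size)
{
    size_t rem;

    if (gso_size == 0 || payload_len == 0)
        return 0;
    rem = payload_len % gso_size;
    return rem ? rem : gso_size;
}

/* Hypothetical helper: ask the stack to segment every later send on
 * this socket at gso_size bytes. Returns setsockopt()'s result. */
int enable_udp_gso(int fd, int gso_size)
{
    return setsockopt(fd, SOL_UDP, UDP_SEGMENT,
                      &gso_size, sizeof(gso_size));
}
```

With a 1472-byte segment size, a single 4000-byte send would leave as three datagrams: two of 1472 bytes and a final one of 1056 bytes.
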

This is interesting, thanks for sharing!

I have some local patches somewhere implementing UDP GRO, but I never
tried to upstream them, since I lacked the associated GSO and I thought
that the use-case was not too relevant.

Given that your use-case is a connected socket - no per packet route
lookup - how does GSO perform compared to plain sendmmsg()? Have you
considered using and/or improving the latter?

When testing with Spectre/Meltdown mitigations in place, I expect that
the most relevant part of the gain is due to the single syscall per
burst.

Cheers,

Paolo