On Tue, Apr 17, 2018 at 4:48 PM, Sowmini Varadhan <sowmini.varad...@oracle.com> wrote:
> On (04/17/18 16:23), Willem de Bruijn wrote:
>>
>> Assuming IPv4 with an MTU of 1500 and the maximum segment
>> size of 1472, the receiver will see three datagrams with MSS of
>> 1472B, 528B and 512B.
>
> so the recvmsg will also pass up 1472, 526, 512, right?

That's right.

> If yes, how will the recvmsg differentiate between the case
> (2000 byte message followed by 512 byte message) and
> (1472 byte message, 526 byte message, then 512 byte message),
> in other words, how are UDP message boundary semantics preserved?

They aren't. This is purely an optimization to amortize the cost of
repeated tx stack traversal. That is unlike UFO, which would preserve
the boundaries of the original larger-than-MTU datagram.

A prime use case is bulk transfer of data. Think video streaming with
QUIC. It must send MTU-sized or smaller packets, but has no
application-layer requirement to reconstruct large packets on the peer.

That said, for negotiated flows an inverse GRO feature could
conceivably be implemented to reduce rx stack traversal, too. Though
due to interleaving of packets on the wire, aggregation would be
best effort, similar to TCP TSO and GRO using the PSH bit as
packetization signal.
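
To make the boundary point concrete, here is a rough sketch that only models the datagram sizes discussed above. It is not kernel code; the helper name gso_segments is made up for illustration, and it assumes each send is simply cut into segment-size pieces with a shorter tail.

```python
def gso_segments(payload_len, gso_size):
    """Return the datagram sizes produced from one send of payload_len bytes.

    Hypothetical helper: models only the size arithmetic, not the kernel path.
    """
    if payload_len <= 0 or gso_size <= 0:
        raise ValueError("lengths must be positive")
    full, rest = divmod(payload_len, gso_size)
    # 'full' segments of gso_size bytes, plus one shorter tail if any remains.
    return [gso_size] * full + ([rest] if rest else [])


# Case A: one 2000-byte send with segmentation, followed by one 512-byte send.
case_a = gso_segments(2000, 1472) + gso_segments(512, 1472)

# Case B: three separate sends of 1472, 528 and 512 bytes.
case_b = [1472, 528, 512]

print(case_a)            # datagram sizes the receiver sees in case A
print(case_a == case_b)  # True: the receiver cannot tell the two cases apart
```

Both cases put the same three datagrams on the wire, which is why the original send boundaries are not recoverable on the receive side.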