On Mon, Nov 14, 2016 at 09:29:48AM +0000, Dr. David Alan Gilbert wrote:
> * Russell King - ARM Linux (li...@armlinux.org.uk) wrote:
> > On Fri, Nov 11, 2016 at 09:23:43PM +0000, David Woodhouse wrote:
> > > It's also *fairly* unlikely that the kernel in the guest has developed
> > > a bug and isn't setting gso_size sanely. I'm more inclined to suspect
> > > that qemu isn't properly emulating those bits. But at first glance at
> > > the code, it looks like *that's* been there for the last decade too...
> >
> > I take issue with that, having looked at the qemu rtl8139 code:
> >
> >     if ((txdw0 & CP_TX_LGSEN) && ip_protocol == IP_PROTO_TCP)
> >     {
> >         int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
> >
> >         DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
> >             "frame data %d specified MSS=%d\n", ETH_MTU,
> >             ip_data_len, saved_size - ETH_HLEN, large_send_mss);
> >
> > That's the only reference to "large_send_mss" there, other than that,
> > the MSS value that gets stuck into the field by 8139cp.c is completely
> > unused. Instead, qemu does this:
> >
> >     eth_payload_data = saved_buffer + ETH_HLEN;
> >     eth_payload_len = saved_size - ETH_HLEN;
> >
> >     ip = (ip_header*)eth_payload_data;
> >
> >     hlen = IP_HEADER_LENGTH(ip);
> >     ip_data_len = be16_to_cpu(ip->ip_len) - hlen;
> >
> >     tcp_header *p_tcp_hdr = (tcp_header*)(eth_payload_data + hlen);
> >     int tcp_hlen = TCP_HEADER_DATA_OFFSET(p_tcp_hdr);
> >
> >     /* ETH_MTU = ip header len + tcp header len + payload */
> >     int tcp_data_len = ip_data_len - tcp_hlen;
> >     int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
> >
> >     for (tcp_send_offset = 0; tcp_send_offset < tcp_data_len; tcp_send_offset += tcp_chunk_size)
> >     {
> >
> > It uses a fixed value of ETH_MTU to calculate the size of the TCP
> > data chunks, and this is not surprisingly the well known:
> >
> >     #define ETH_MTU 1500
> >
> > Qemu seems to be buggy - it ignores the MSS value, and always tries to
> > send 1500 byte frames.
> >
> cc'ing in Stefan who last touched that code and Jason and Vlad who
> know the net code.

CCing Igor Kovalenko who implemented "fixed for TCP segmentation
offloading - removed dependency on slirp.h" in 2006. I don't actually
expect him to remember this from 10 years ago though :).

Looking at the history the large_send_mss variable was never used for
anything beyond the debug printf.

The datasheet for this NIC is here: http://realtek.info/pdf/rtl8139cp.pdf.
See 9.2.1 Transmit.

Does this untested patch work for you?

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index f05e59c..a3f1af5 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -2167,9 +2167,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
                     goto skip_offload;
                 }
 
-                /* ETH_MTU = ip header len + tcp header len + payload */
+                /* MSS too small */
+                if (tcp_hlen + hlen >= large_send_mss) {
+                    goto skip_offload;
+                }
+
                 int tcp_data_len = ip_data_len - tcp_hlen;
-                int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
+                int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
 
                 DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
                     "data len %d TCP chunk size %d\n", ip_data_len,