I worked on the same issue a few months back. I rebased my proof-of-concept code to the current net-next and posted an RFC patch a moment ago.
I have no experience with QEMU feature negotiation or with extending the virtio_net spec. Since the virtio_net handling is now all done in shared code, this should work for macvtap as well, although I am not sure whether macvtap needs some control plane changes.

Yesterday I posted a separate patch that makes af_packet use the shared infra for virtio_net handling as well. My RFC patch assumes that af_packet need not be touched, i.e., it assumes the af_packet patch is applied, even though the patches apply to net-next in either order.

  Jarno

> On Nov 16, 2016, at 11:27 PM, Jason Wang <jasow...@redhat.com> wrote:
>
> On 2016-11-17 09:31, Zhangming (James, Euler) wrote:
>> On 2016-11-15 11:28, Jason Wang wrote:
>>> On 2016-11-10 14:19, Zhangming (James, Euler) wrote:
>>>> On 2016-11-09 15:14, Jason Wang wrote:
>>>>> On 2016-11-08 19:58, Zhangming (James, Euler) wrote:
>>>>>> On 2016-11-08 19:17, Jason Wang wrote:
>>>>>>
>>>>>>> On 2016-11-08 19:13, Jason Wang wrote:
>>>>>>>> Cc Michael
>>>>>>>>
>>>>>>>> On 2016-11-08 16:34, Zhangming (James, Euler) wrote:
>>>>>>>>> In the container scenario, OVS is installed in the virtual machine,
>>>>>>>>> and all the containers connected to the OVS communicate
>>>>>>>>> through VXLAN encapsulation.
>>>>>>>>>
>>>>>>>>> At present, virtio_net does not support TSO offload for VXLAN
>>>>>>>>> encapsulated TSO packets. In this condition, the performance is
>>>>>>>>> not good; the sender is the bottleneck.
>>>>>>>>>
>>>>>>>>> I googled this scenario, but I didn’t find any information. Will
>>>>>>>>> virtio_net support TSO offload for VXLAN encapsulated packets later?
>>>>>>>>>
>>>>>>>> Yes and for both sender and receiver.
>>>>>>>>
>>>>>>>>> My idea is that virtio_net enables encapsulated TSO offload and
>>>>>>>>> transports the encapsulation info to TUN; TUN will parse the info and
>>>>>>>>> build the skb with the encapsulation info.
>>>>>>>>>
>>>>>>>>> OVS or the kernel on the host should be modified to support this.
>>>>>>>>> Using this method, the TCP performance is more than 2x what it was before.
>>>>>>>>>
>>>>>>>>> Any advice and suggestions for this idea or new idea will be
>>>>>>>>> greatly appreciated!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> James zhang
>>>>>>>>>
>>>>>>>> Sounds very good. And we may also need feature bits
>>>>>>>> (VIRTIO_NET_F_GUEST|HOST_GSO_X) for this.
>>>>>>>>
>>>>>>>> This is in fact one of the items on the networking todo list. (See
>>>>>>>> http://www.linux-kvm.org/page/NetworkingTodo). While at it, we'd
>>>>>>>> better support not only VXLAN but also other tunnels.
>>>>>>> Cc Vlad who is working on extending virtio-net headers.
>>>>>>>
>>>>>>>> We can start with the spec work, or if you've already had some
>>>>>>>> bits you can post them as RFC for early review.
>>>>>>>>
>>>>>>>> Thanks
>>>>>> Below is my demo code
>>>>>> Virtio_net.c
>>>>>> static int virtnet_probe(struct virtio_device *vdev), add the code below:
>>>>>>     if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
>>>>>>         // avoid gso segment, it should be negotiated later, because in the demo I reuse num_buffers.
>>>>>>         virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
>>>>>>         dev->hw_enc_features |= NETIF_F_TSO;
>>>>>>         dev->hw_enc_features |= NETIF_F_ALL_CSUM;
>>>>>>         dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
>>>>>>         dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
>>>>>>         dev->hw_enc_features |= NETIF_F_GSO_TUNNEL_REMCSUM;
>>>>>>
>>>>>>         dev->features |= NETIF_F_GSO_UDP_TUNNEL;
>>>>>>         dev->features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
>>>>>>         dev->features |= NETIF_F_GSO_TUNNEL_REMCSUM;
>>>>>>     }
>>>>>>
>>>>>> static int xmit_skb(struct send_queue *sq, struct sk_buff *skb), add
>>>>>> the two pieces of code below
>>>>>>
>>>>>>     if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL)
>>>>>>         hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL;
>>>>>>     if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)
>>>>>>         hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_CSUM;
>>>>>>     if (skb_shinfo(skb)->gso_type & SKB_GSO_TUNNEL_REMCSUM)
>>>>>>         hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM;
>>>>>>
>>>>>>     if (skb->encapsulation && skb_is_gso(skb)) {
>>>>>>         inner_mac_len = skb_inner_network_header(skb) - skb_inner_mac_header(skb);
>>>>>>         tnl_len = skb_inner_mac_header(skb) - skb_mac_header(skb);
>>>>>>         if (!(inner_mac_len >> DATA_LEN_SHIFT) && !(tnl_len >> DATA_LEN_SHIFT)) {
>>>>>>             hdr->hdr.flags |= VIRTIO_NET_HDR_F_ENCAPSULATION;
>>>>>>             // we reuse num_buffers for simplicity; we should add an extended member later.
>>>>>>             hdr->num_buffers = (__virtio16)((inner_mac_len << DATA_LEN_SHIFT) | tnl_len);
>>>>>>         } else
>>>>>>             hdr->num_buffers = 0;
>>>>>>     }
>>>>>>
>>>>>> Tun.c
>>>>>>     // read header with the negotiated length
>>>>>>     if (memcpy_fromiovecend((void *)&hdr, iv, offset, tun->vnet_hdr_sz))
>>>>>>         return -EFAULT;
>>>>>>
>>>>>>     // set tunnel gso info
>>>>>>     if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL)
>>>>>>         skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL;
>>>>>>     if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_CSUM)
>>>>>>         skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
>>>>>>     if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM)
>>>>>>         skb_shinfo(skb)->gso_type |= SKB_GSO_TUNNEL_REMCSUM;
>>>>>>
>>>>>>     if (hdr.flags & VIRTIO_NET_HDR_F_ENCAPSULATION) {
>>>>>>         // read tunnel info from the header and set it on the built skb.
>>>>>>         tnl_len = tun16_to_cpu(tun, hdr.num_buffers) & TUN_TNL_LEN_MASK;
>>>>>>         payload_mac_len = tun16_to_cpu(tun, hdr.num_buffers) >> TUN_DATA_LEN_SHIFT;
>>>>>>         mac_len = skb_network_header(skb) - skb_mac_header(skb);
>>>>>>         skb_set_inner_mac_header(skb, tnl_len - mac_len);
>>>>>>         skb_set_inner_network_header(skb, tnl_len + payload_mac_len - mac_len);
>>>>>>         skb->encapsulation = 1;
>>>>>>     }
>>>>>>
>>>>>>
>>>>> Something like this, and you probably need to do something more:
>>>>>
>>>>> - use net-next.git to generate the patch (for the latest code)
>>>>> - add feature negotiation
>>>>> - tun/macvtap/qemu patches for this, you can start with tun/macvtap
>>>>>   patches
>>>>> - support for all other SKB_GSO_* types which are not yet supported
>>>>> - use a new field instead of num_buffers
>>>>> - a virtio spec patch to describe the support for encapsulation
>>>>>   offload
>>>>>
>>>>> Thanks
>>>> Thank you for your advice, I will start it right now.
>>>>
>>>> Thanks
>>> Cool, one more question: while at it, I think you may want to add support
>>> for dpdk too?
>>>
>>> Thanks
>> Do you mean that the patch should be compatible with virtio pmd, or give
>> virtio pmd patch?
>>
>> Thanks
>
> I mean it's better to prepare patches for both virtio pmd and dpdk.
>
> Thanks
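
For reference, the way the quoted demo packs the two header lengths into the 16-bit num_buffers field can be sketched in user space as below. This is only an illustration and not code from any of the patches: the names ENCAP_LEN_SHIFT, ENCAP_LEN_MASK, encap_pack() and encap_unpack() are hypothetical, the thread never gives the value of DATA_LEN_SHIFT (8 is assumed here so that both lengths fit in 16 bits), and the virtio byte-order conversion is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout (not stated in the thread): low 8 bits hold the tunnel
 * header length, high 8 bits hold the inner MAC header length. */
#define ENCAP_LEN_SHIFT 8
#define ENCAP_LEN_MASK  ((1u << ENCAP_LEN_SHIFT) - 1)

/* Pack both lengths into one 16-bit value. Returns false when either
 * length does not fit in 8 bits, mirroring the range check in the demo
 * (the demo then falls back to num_buffers = 0). */
static bool encap_pack(unsigned int inner_mac_len, unsigned int tnl_len,
                       uint16_t *out)
{
    if ((inner_mac_len >> ENCAP_LEN_SHIFT) || (tnl_len >> ENCAP_LEN_SHIFT))
        return false;
    *out = (uint16_t)((inner_mac_len << ENCAP_LEN_SHIFT) | tnl_len);
    return true;
}

/* Reverse of encap_pack(), as the tun side of the demo does with
 * TUN_TNL_LEN_MASK and TUN_DATA_LEN_SHIFT. */
static void encap_unpack(uint16_t field, unsigned int *inner_mac_len,
                         unsigned int *tnl_len)
{
    *tnl_len = field & ENCAP_LEN_MASK;
    *inner_mac_len = field >> ENCAP_LEN_SHIFT;
}
```

For a VXLAN-over-IPv4 frame without VLAN tags or IP options, the length from the outer MAC header to the inner MAC header is 14 + 20 + 8 + 8 = 50 bytes and the inner MAC header is 14 bytes, so encap_pack(14, 50, &f) stores 0x0E32 in f. A length of 256 or more is rejected, which is one reason the review asks for a dedicated field instead of num_buffers.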