Re: Virtio_net support vxlan encapsulation package TSO offload discuss

Jason Wang Wed, 16 Nov 2016 23:28:49 -0800


On 2016-11-17 09:31, Zhangming (James, Euler) wrote:

On 2016-11-15 11:28, Jason Wang wrote:

On 2016-11-10 14:19, Zhangming (James, Euler) wrote:

On 2016-11-09 15:14, Jason Wang wrote:

On 2016-11-08 19:58, Zhangming (James, Euler) wrote:

On 2016-11-08 19:17, Jason Wang wrote:

On 2016-11-08 19:13, Jason Wang wrote:

Cc Michael

On 2016-11-08 16:34, Zhangming (James, Euler) wrote:

In the container scenario, OVS is installed in the virtual machine,
and all the containers connected to OVS communicate through VXLAN
encapsulation.

Currently, virtio_net does not support TSO offload for
VXLAN-encapsulated packets. As a result, performance is poor and the
sender is the bottleneck.

I googled this scenario but didn't find any information. Will
virtio_net support TSO offload for VXLAN-encapsulated packets later?

Yes and for both sender and receiver.

My idea is for virtio_net to enable encapsulated TSO offload and
pass the encapsulation info to TUN; TUN then parses the info and
builds the skb with it.

OVS or the kernel on the host should be modified to support this.
With this method, TCP performance is more than 2x what it was before.

Any advice or suggestions on this idea, or new ideas, would be
greatly appreciated!

Best regards,

      James zhang

Sounds very good. We may also need feature bits
(VIRTIO_NET_F_GUEST|HOST_GSO_X) for this.

This is in fact one of the items on the networking todo list (see
http://www.linux-kvm.org/page/NetworkingTodo). While at it, we'd
better support not only VXLAN but also other tunnels.

Cc Vlad who is working on extending virtio-net headers.

We can start with the spec work, or if you already have some
bits, you can post them as an RFC for early review.

Thanks

Below is my demo code.

virtio_net.c

In static int virtnet_probe(struct virtio_device *vdev), add the code below:
           /*
            * Avoid GSO segmentation. This should be negotiated properly
            * later; the check is here because the demo reuses num_buffers.
            */
           if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
               virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
                   dev->hw_enc_features |= NETIF_F_TSO;
                   dev->hw_enc_features |= NETIF_F_ALL_CSUM;
                   dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
                   dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
                   dev->hw_enc_features |= NETIF_F_GSO_TUNNEL_REMCSUM;

                   dev->features |= NETIF_F_GSO_UDP_TUNNEL;
                   dev->features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
                   dev->features |= NETIF_F_GSO_TUNNEL_REMCSUM;
           }

In static int xmit_skb(struct send_queue *sq, struct sk_buff *skb), add
the two pieces of code below:

                   if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL)
                           hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL;
                   if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)
                           hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_CSUM;
                   if (skb_shinfo(skb)->gso_type & SKB_GSO_TUNNEL_REMCSUM)
                           hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM;

           if (skb->encapsulation && skb_is_gso(skb)) {
                   inner_mac_len = skb_inner_network_header(skb) -
                                   skb_inner_mac_header(skb);
                   tnl_len = skb_inner_mac_header(skb) - skb_mac_header(skb);
                   if (!(inner_mac_len >> DATA_LEN_SHIFT) &&
                       !(tnl_len >> DATA_LEN_SHIFT)) {
                           hdr->hdr.flags |= VIRTIO_NET_HDR_F_ENCAPSULATION;
                           /*
                            * num_buffers is reused for simplicity; a
                            * dedicated field should be added later.
                            */
                           hdr->num_buffers = (__virtio16)((inner_mac_len <<
                                              DATA_LEN_SHIFT) | tnl_len);
                   } else {
                           hdr->num_buffers = 0;
                   }
           }
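
The packing step in xmit_skb can be tried out in user space. The following is only a minimal sketch: the demo never shows the value of DATA_LEN_SHIFT, so 8 is assumed here (splitting the 16-bit field into two halves), and pack_encap_lens is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the demo does not show DATA_LEN_SHIFT. */
#define DATA_LEN_SHIFT 8

/*
 * Pack the inner MAC header length (high bits) and the tunnel header
 * length (low bits) into one 16-bit value, as the demo does with
 * num_buffers. Returns 0 when either length does not fit in
 * DATA_LEN_SHIFT bits, mirroring the demo's fallback branch.
 */
uint16_t pack_encap_lens(unsigned int inner_mac_len, unsigned int tnl_len)
{
	uint16_t packed;

	if ((inner_mac_len >> DATA_LEN_SHIFT) || (tnl_len >> DATA_LEN_SHIFT))
		return 0;

	packed = (uint16_t)((inner_mac_len << DATA_LEN_SHIFT) | tnl_len);
	/* Sanity check: the high bits must read back as inner_mac_len. */
	assert((unsigned int)(packed >> DATA_LEN_SHIFT) == inner_mac_len);
	return packed;
}
```

For VXLAN over IPv4 without IP options, tnl_len would be 14 + 20 + 8 + 8 = 50 bytes (outer Ethernet, IPv4, UDP, VXLAN) and inner_mac_len 14, so pack_encap_lens(14, 50) gives 0x0E32 under the assumed shift.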

tun.c

                   /* Read the header using the negotiated length. */
                   if (memcpy_fromiovecend((void *)&hdr, iv, offset,
                                           tun->vnet_hdr_sz))
                           return -EFAULT;

                   /* Set the tunnel GSO info. */
                   if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL)
                           skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL;
                   if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_CSUM)
                           skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
                   if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM)
                           skb_shinfo(skb)->gso_type |= SKB_GSO_TUNNEL_REMCSUM;

           /* Read the tunnel info from the header and set it on the built skb. */
           if (hdr.flags & VIRTIO_NET_HDR_F_ENCAPSULATION) {
                   tnl_len = tun16_to_cpu(tun, hdr.num_buffers) &
                             TUN_TNL_LEN_MASK;
                   payload_mac_len = tun16_to_cpu(tun, hdr.num_buffers) >>
                                     TUN_DATA_LEN_SHIFT;
                   mac_len = skb_network_header(skb) - skb_mac_header(skb);
                   skb_set_inner_mac_header(skb, tnl_len - mac_len);
                   skb_set_inner_network_header(skb, tnl_len + payload_mac_len
                                                - mac_len);
                   skb->encapsulation = 1;
           }
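
The offset arithmetic on the TUN side can be checked in isolation. This is a hedged user-space sketch, not the tun.c code: TUN_DATA_LEN_SHIFT is assumed to be 8 and TUN_TNL_LEN_MASK to be the matching low-bit mask (the demo shows neither value), and struct encap_offsets and unpack_encap_offsets are hypothetical names. It also assumes, as the demo's subtraction of mac_len suggests, that the offsets are taken relative to the outer network header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; they must match the shift used by the sender. */
#define TUN_DATA_LEN_SHIFT 8
#define TUN_TNL_LEN_MASK   ((1u << TUN_DATA_LEN_SHIFT) - 1)

/* Offsets relative to the outer network header (hypothetical type). */
struct encap_offsets {
	int inner_mac;	/* start of the inner Ethernet header */
	int inner_net;	/* start of the inner network header */
};

/*
 * Split the packed 16-bit value into tunnel length (low bits) and
 * inner MAC length (high bits), then rebase both from the outer MAC
 * header to the outer network header by subtracting mac_len.
 */
struct encap_offsets unpack_encap_offsets(uint16_t packed, unsigned int mac_len)
{
	unsigned int tnl_len = packed & TUN_TNL_LEN_MASK;
	unsigned int payload_mac_len = packed >> TUN_DATA_LEN_SHIFT;
	struct encap_offsets off;

	/* The tunnel headers include the outer MAC header. */
	assert(tnl_len >= mac_len);

	off.inner_mac = (int)(tnl_len - mac_len);
	off.inner_net = (int)(tnl_len + payload_mac_len - mac_len);
	return off;
}
```

With the VXLAN-over-IPv4 value 0x0E32 (tnl_len 50, inner MAC length 14) and an outer Ethernet header of 14 bytes, the inner MAC header lands at offset 36 and the inner network header at offset 50.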

Something like this, and you probably need to do a few more things:

- use net-next.git to generate the patch (for the latest code)
- add feature negotiation
- tun/macvtap/qemu patches for this; you can start with the tun/macvtap
patches
- support for all other SKB_GSO_* types that are not yet supported
- use a new field instead of num_buffers
- a virtio spec patch to describe the support for encapsulation
offload
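
The feature negotiation item above could look roughly like the sketch below. It is illustrative only: the bit names and positions are hypothetical placeholders for the VIRTIO_NET_F_GUEST|HOST_GSO_X bits mentioned earlier in the thread, and the helper names are not part of any virtio API. The only point shown is that the driver may use tunnel GSO only when both sides advertise it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define HYP_F_HOST_GSO_TUNNEL	(1ull << 60)	/* device accepts tunnel GSO */
#define HYP_F_GUEST_GSO_TUNNEL	(1ull << 61)	/* driver accepts tunnel GSO */

/* A feature is negotiated only if both sides advertise it. */
uint64_t negotiate_features(uint64_t device_offered, uint64_t driver_supported)
{
	uint64_t negotiated = device_offered & driver_supported;

	/* The result can never contain a bit the device did not offer. */
	assert((negotiated & ~device_offered) == 0);
	return negotiated;
}

/* The driver should set the tunnel GSO flags only after this check. */
bool may_send_tunnel_gso(uint64_t negotiated)
{
	return (negotiated & HYP_F_HOST_GSO_TUNNEL) != 0;
}
```

In the demo, the unconditional dev->features updates in virtnet_probe would then be guarded by a check of this kind instead of by VIRTIO_NET_F_MRG_RXBUF or VIRTIO_F_VERSION_1.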

Thanks

Thank you for your advice, I will start it right now.

Thanks

Cool, one more question: while at it, I think you may want to add support for 
dpdk too?

Thanks

Do you mean that the patch should be compatible with the virtio pmd, or
that I should provide a virtio pmd patch?

Thanks


I mean it's better to prepare patches for both virtio pmd and dpdk.

Thanks
