On Fri, May 19, 2017 at 11:10:33AM +0800, Jason Wang wrote:
>
>
> On 2017年05月18日 11:03, Wei Wang wrote:
> > On 05/17/2017 02:22 PM, Jason Wang wrote:
> > >
> > >
> > > On 2017年05月17日 14:16, Jason Wang wrote:
> > > >
> > > >
> > > > On 2017年05月16日 15:12, Wei Wang wrote:
> > > > > >
> > > > > >
> > > > > > Hi:
> > > > > >
> > > > > > Care to post the driver code too?
> > > > > >
> > > > > OK. It may take some time to clean up the driver code before
> > > > > posting it out. You can first have a look at the draft in the
> > > > > repo here:
> > > > > https://github.com/wei-w-wang/vhost-pci-driver
> > > > >
> > > > > Best,
> > > > > Wei
> > > >
> > > > Interesting, looks like there's one copy on the tx side. We used
> > > > to have zerocopy support for tun for VM2VM traffic. Could you
> > > > please try to compare it with your vhost-pci-net by:
>
> > We can analyze the whole data path - from VM1's network stack sending
> > packets -> VM2's network stack receiving packets. The number of copies
> > is actually the same for both.
>
> That's why I'm asking you to compare the performance. The only reason
> for vhost-pci is performance. You should prove it.
>
> >
> > vhost-pci: the one copy happens in VM1's driver xmit(), which copies
> > packets from its network stack to VM2's RX ring buffer. (We call it
> > "zerocopy" because there is no intermediate copy between the VMs.)
> > zerocopy-enabled vhost-net: the one copy happens in tun's recvmsg,
> > which copies packets from VM1's TX ring buffer to VM2's RX ring
> > buffer.
>
> Actually, there's a major difference here. You do the copy in the
> guest, which consumes time slices of the vcpu thread on the host.
> Vhost_net does this in its own thread, so I feel vhost_net is even
> faster here; maybe I was wrong.
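As an aside, a minimal sketch may make the copy-path comparison above more
concrete. The names below (vpnet_priv, vpnet_start_xmit, peer_rx_buf) are
hypothetical and are not taken from the actual driver at
https://github.com/wei-w-wang/vhost-pci-driver; this only illustrates where
the single copy sits on the vhost-pci side, and it omits descriptor posting
and peer notification entirely:

/* Illustration only -- hypothetical names, not the real vhost-pci driver. */
#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct vpnet_priv {
	void  *peer_rx_buf;	/* VM2's RX area, mapped into VM1 via vhost-pci */
	size_t peer_rx_len;
};

static netdev_tx_t vpnet_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct vpnet_priv *priv = netdev_priv(dev);

	/* The single VM1 -> VM2 data copy of the whole path happens here,
	 * on VM1's vcpu thread (guest time), as discussed above.  With
	 * zerocopy vhost-net the equivalent copy is done in tun's recvmsg
	 * on a host vhost thread instead.
	 */
	if (skb->len > priv->peer_rx_len ||
	    skb_copy_bits(skb, 0, priv->peer_rx_buf, skb->len))
		dev->stats.tx_dropped++;
	else
		dev->stats.tx_packets++;

	dev_kfree_skb_any(skb);
	return NETDEV_TX_OK;
}

A real device would of course copy into a descriptor taken from VM2's RX
ring rather than into a flat buffer, and would then notify the peer.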
Yes, but only if you have enough CPUs. The point of vhost-pci is to put the
switch in a VM and scale better with the # of VMs.

> >
> > That being said, we compared to vhost-user, instead of vhost_net,
> > because vhost-user is the one that is used in NFV, which we think is
> > a major use case for vhost-pci.
>
> If this is true, why not draft a pmd driver instead of a kernel one? And
> do you use the virtio-net kernel driver to compare the performance? If
> yes, has OVS-DPDK been optimized for the kernel driver (I think not)?
>
> What's more important, if vhost-pci is faster, I think its kernel driver
> should also be faster than virtio-net, no?

If you have a vhost CPU per VCPU and can give a host CPU to each, then
using that will be faster. But not everyone has so many host CPUs.

> > > >
> > > > - make sure zerocopy is enabled for vhost_net
> > > > - comment skb_orphan_frags() in tun_net_xmit()
> > > >
> > > > Thanks
> > > >
> > >
> > > You can even enable tx batching for tun by ethtool -C tap0 rx-frames
> > > N. This will greatly improve the performance according to my test.
> >
> > Thanks, but would this hurt latency?
> >
> > Best,
> > Wei
>
> I don't see this in my test.
>
> Thanks
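A note for anyone reproducing Jason's suggested comparison: vhost_net's
zerocopy is controlled by its experimental_zcopytx module parameter (e.g.
modprobe vhost_net experimental_zcopytx=1), and the second bullet refers to
the skb_orphan_frags() call in tun_net_xmit() in drivers/net/tun.c. The hunk
below is a paraphrase from memory, not a verbatim kernel excerpt, and
disabling the call is a benchmark-only hack:

/* Sketch of the relevant spot in tun_net_xmit() (paraphrased).  tun may
 * hold on to the skb indefinitely, so it normally "orphans" userspace-pinned
 * zerocopy fragments here, i.e. copies them into kernel memory.  That copy
 * is what defeats zerocopy for VM2VM traffic.
 */
#if 0	/* benchmark-only: keep zerocopy pages pinned through tun */
	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
		goto drop;
#endif

Tx batching via ethtool -C tap0 rx-frames N is a separate, orthogonal knob;
whether it hurts latency is exactly Wei's open question above.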