Btw, some good news: if I run a simple dequeue workload (running rxonly in vhost-pmd and running txonly in guest testpmd), it yields a ~50% performance boost for packet size 1518B, though this case is without a NIC. For the similar vhost<-->virtio loopback case, we can see ~10% performance gains at 1518B, also without a NIC.
Some bad news: with the patch applied, I noticed a 3%-7% performance drop at zero-copy=0 compared with current DPDK (e.g. 16.07), for both vhost/virtio loopback and vhost RX only + virtio TX only. It seems the patch impacts the zero-copy=0 performance a little.

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Xu, Qian Q
Sent: Monday, August 29, 2016 4:33 PM
To: Yuanhan Liu <yuanhan.liu at linux.intel.com>; dev at dpdk.org
Cc: Maxime Coquelin <maxime.coquelin at redhat.com>
Subject: Re: [dpdk-dev] [PATCH 0/6] vhost: add Tx zero copy support

I just ran a PVP test: the NIC receives packets, then forwards them to the vhost PMD and a virtio user interface. I didn't see any performance gains in this scenario. No packet size from 64B to 1518B benefited from this patchset; in fact, performance dropped a lot below 1280B and was similar at 1518B. The TX/RX desc setting was "txd=64, rxd=128" for the TX-zero-copy enabled case. For the TX-zero-copy disabled case, I just ran default testpmd (txd=512, rxd=128) without the patch. Could you help check the NIC2VM case?

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yuanhan Liu
Sent: Tuesday, August 23, 2016 4:11 PM
To: dev at dpdk.org
Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; Yuanhan Liu <yuanhan.liu at linux.intel.com>
Subject: [dpdk-dev] [PATCH 0/6] vhost: add Tx zero copy support

This patch set enables vhost Tx zero copy. The majority of the work goes into patch 4: "vhost: add Tx zero copy". The basic idea of Tx zero copy is that, instead of copying data from the desc buf, we let the mbuf reference the desc buf addr directly. The major issue behind that is how and when to update the used ring; you could check the commit log of patch 4 for more details. Patch 5 introduces a new flag, RTE_VHOST_USER_TX_ZERO_COPY, to enable Tx zero copy, which is disabled by default.
A few more TODOs are left, including handling a desc buf that spans two physical pages, updating the release note, etc. Those will be fixed in a later version. For now, here is a simple one that hopefully shows the idea clearly.

I did some quick tests; the performance gain is quite impressive. For a simple dequeue workload (running rxonly in vhost-pmd and running txonly in guest testpmd), it yields a 40+% performance boost for packet size 1400B. For the VM2VM iperf test case, it's even better: about a 70% boost.

---
Yuanhan Liu (6):
  vhost: simplify memory regions handling
  vhost: get guest/host physical address mappings
  vhost: introduce last avail idx for Tx
  vhost: add Tx zero copy
  vhost: add a flag to enable Tx zero copy
  examples/vhost: add an option to enable Tx zero copy

 doc/guides/prog_guide/vhost_lib.rst |   7 +-
 examples/vhost/main.c               |  19 ++-
 lib/librte_vhost/rte_virtio_net.h   |   1 +
 lib/librte_vhost/socket.c           |   5 +
 lib/librte_vhost/vhost.c            |  12 ++
 lib/librte_vhost/vhost.h            | 103 +++++++++----
 lib/librte_vhost/vhost_user.c       | 297 +++++++++++++++++++++++-------------
 lib/librte_vhost/virtio_net.c       | 188 +++++++++++++++++++----
 8 files changed, 472 insertions(+), 160 deletions(-)

--
1.9.0
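Before an mbuf can reference a desc buf directly, the guest-physical address in the descriptor has to be translated through the memory-region table (patch 2, "vhost: get guest/host physical address mappings"). A minimal sketch of that kind of lookup is below; the struct layout and names are illustrative, not the actual librte_vhost definitions:

```c
#include <stdint.h>
#include <stddef.h>

/* Toy guest-physical -> host-virtual mapping table. */
struct guest_page {
    uint64_t gpa;     /* guest physical start address */
    uint64_t size;    /* region size in bytes         */
    uint64_t hva;     /* host virtual start address   */
};

/* Translate a guest physical address to a host virtual address,
 * returning 0 when the address falls outside every known region. */
static uint64_t gpa_to_hva(const struct guest_page *pages, size_t n,
                           uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= pages[i].gpa && gpa < pages[i].gpa + pages[i].size)
            return pages[i].hva + (gpa - pages[i].gpa);
    }
    return 0;  /* not mapped: caller must fall back to the copy path */
}
```

This lookup also illustrates the TODO mentioned above: a desc buf that spans two guest physical pages may map to two discontiguous host ranges, so a single pointer cannot cover it and the case needs separate handling.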