On 09/23/2016 06:13 AM, Yuanhan Liu wrote:
> The basic idea of dequeue zero copy is, instead of copying data from
> the desc buf, here we let the mbuf reference the desc buf addr directly.
>
> Doing so, however, has one major issue: we can't update the used ring
> at the end of rte_vhost_dequeue_burst. Because we don't do the copy
> here, an update of the used ring would let the driver reclaim the
> desc buf. As a result, DPDK might reference a stale memory region.
>
> To update the used ring properly, this patch does several tricks:
>
> - When an mbuf references a desc buf, its refcnt is incremented by 1.
>
>   This is to pin the mbuf, so that an mbuf free from DPDK won't
>   actually free it; instead, refcnt is decremented by 1.
>
> - We chain all those mbufs together (by tailq).
>
>   And we check it every time on the rte_vhost_dequeue_burst entrance,
>   to see if the mbuf is freed (when refcnt equals 1). If that
>   happens, it means we are the last user of this mbuf and we are
>   safe to update the used ring.
>
> - "struct zcopy_mbuf" is introduced, to associate an mbuf with the
>   right desc idx.
>
> Dequeue zero copy is introduced for performance reasons, and some rough
> tests show about a 50% performance boost for packet size 1500B. For small
> packets (e.g. 64B), it actually slows things down a bit (by up to
> 15%). That is expected because this patch introduces some extra work,
> and it outweighs the benefit of saving a few bytes of copy.
>
> Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>
> ---
>
> v2: - use unlikely/likely for dequeue_zero_copy check, as it's not enabled
>       by default, and it has some limitations in the vm2nic case.
>
>     - handle the case that a desc buf might span 2 host phys pages
>
>     - reset nr_zmbuf to 0 at set_vring_num
>
>     - set the zmbuf_size to vq->size, rather than double that.
> ---
>  lib/librte_vhost/vhost.c      |   2 +
>  lib/librte_vhost/vhost.h      |  22 +++++-
>  lib/librte_vhost/vhost_user.c |  42 +++++++++-
>  lib/librte_vhost/virtio_net.c | 173 +++++++++++++++++++++++++++++++++++++-----
>  4 files changed, 219 insertions(+), 20 deletions(-)
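
For readers following the thread, the pin-and-reclaim scheme described in the commit message can be sketched roughly as below. This is only a toy model, not the code from the patch: every toy_* name, the ring size, and the use of malloc/free in place of mempool operations are assumptions made for illustration, and it relies on the TAILQ macros from <sys/queue.h> as found on Linux and the BSDs. The real patch works on struct rte_mbuf and the vhost virtqueue.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

#define TOY_RING_SIZE 8 /* hypothetical ring size for the sketch */

/* Hypothetical stand-in for struct rte_mbuf: only the refcnt matters here. */
struct toy_mbuf {
	uint16_t refcnt;
};

/* Plays the role of "struct zcopy_mbuf": ties an mbuf to its desc idx. */
struct toy_zmbuf {
	TAILQ_ENTRY(toy_zmbuf) next;
	struct toy_mbuf *mbuf;
	uint16_t desc_idx;
};

TAILQ_HEAD(toy_zmbuf_list, toy_zmbuf);

struct toy_vq {
	struct toy_zmbuf_list zmbufs;  /* all pinned mbufs, chained by tailq */
	uint16_t used[TOY_RING_SIZE];  /* toy used ring: returned desc indexes */
	uint16_t used_cnt;
};

static void toy_vq_init(struct toy_vq *vq)
{
	TAILQ_INIT(&vq->zmbufs);
	vq->used_cnt = 0;
}

/* Dequeue path: the mbuf would point at the desc buf, so pin it. */
static struct toy_mbuf *toy_dequeue(struct toy_vq *vq, uint16_t desc_idx)
{
	struct toy_mbuf *m = malloc(sizeof(*m));
	struct toy_zmbuf *z = malloc(sizeof(*z));

	assert(m != NULL && z != NULL);
	m->refcnt = 1;  /* the application's reference */
	m->refcnt++;    /* the pin: an extra reference held by vhost */
	z->mbuf = m;
	z->desc_idx = desc_idx;
	TAILQ_INSERT_TAIL(&vq->zmbufs, z, next);
	return m;
}

/* Application-side free: because of the pin, this only drops a reference. */
static void toy_app_free(struct toy_mbuf *m)
{
	if (--m->refcnt == 0)
		free(m);
}

/* Run on entry to the next dequeue burst: return descs nobody else uses. */
static uint16_t toy_reclaim(struct toy_vq *vq)
{
	struct toy_zmbuf *z = TAILQ_FIRST(&vq->zmbufs);
	uint16_t reclaimed = 0;

	while (z != NULL) {
		struct toy_zmbuf *following = TAILQ_NEXT(z, next);

		if (z->mbuf->refcnt == 1) { /* we are the last user */
			assert(vq->used_cnt < TOY_RING_SIZE);
			vq->used[vq->used_cnt++] = z->desc_idx;
			TAILQ_REMOVE(&vq->zmbufs, z, next);
			free(z->mbuf);
			free(z);
			reclaimed++;
		}
		z = following;
	}
	return reclaimed;
}
```

In this model, calling toy_reclaim() right after toy_dequeue() returns nothing to the used ring, since the application still holds its reference. Only after toy_app_free() does the next toy_reclaim() hand the desc idx back, which mirrors the deferred used ring update the commit message describes.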

Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>

Thanks,
Maxime