Here is a small patch set that does some micro-optimizations, which bring about a 10% performance boost in my 64B packet testing, with the following topology:

    pkt generator <----> NIC <-----> Virtio NIC

Patch 1 pre-updates the used ring entries and then publishes them in a batch. From my understanding this should be safe: the guest driver will not start processing the entries as long as we haven't updated "used->idx" yet. I could be missing something, though.

Patch 2 saves one check for small packets (those that can be held in one desc buf and one mbuf).

Patch 3 moves several frequently used fields into one cache line, for better cache sharing.

Note that this patch set is based on my latest vhost ABI refactoring patchset.

---
Yuanhan Liu (3):
  vhost: pre update used ring for Tx and Rx
  vhost: optimize dequeue for small packets
  vhost: arrange virtio_net fields for better cache sharing

 lib/librte_vhost/vhost-net.h  |   8 +--
 lib/librte_vhost/vhost_rxtx.c | 110 ++++++++++++++++++++++++------------------
 2 files changed, 68 insertions(+), 50 deletions(-)

-- 
1.9.0
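
For illustration only, the batching idea behind patch 1 can be sketched roughly as below. This is not the code from the patch: the structs are simplified stand-ins for the virtio used ring, and `RING_SIZE`, `used_elem`, `used_ring` and `publish_used_batch` are hypothetical names chosen for this sketch. It assumes a single writer on the host side and uses a C11 release fence where the real vhost code would use its own write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 256u  /* hypothetical ring size; must be a power of two */

/* Simplified stand-in for one used ring entry. */
struct used_elem {
    uint32_t id;   /* head index of the completed descriptor chain */
    uint32_t len;  /* number of bytes written into that chain */
};

/* Simplified stand-in for the used ring shared with the guest. */
struct used_ring {
    uint16_t flags;
    uint16_t idx;  /* the guest only consumes entries below this index */
    struct used_elem ring[RING_SIZE];
};

/*
 * Fill `count` used entries starting at the current used->idx, then
 * publish all of them with a single update of used->idx.
 */
static void
publish_used_batch(struct used_ring *used, const uint32_t *desc_ids,
                   const uint32_t *lens, uint16_t count)
{
    uint16_t start = used->idx;
    uint16_t i;

    assert(count <= RING_SIZE);

    /* Step 1: pre-update the entries. The guest cannot see them yet,
     * because used->idx still points at `start`. */
    for (i = 0; i < count; i++) {
        uint16_t slot = (uint16_t)(start + i) & (RING_SIZE - 1);

        used->ring[slot].id = desc_ids[i];
        used->ring[slot].len = lens[i];
    }

    /* Step 2: make sure the entry writes are ordered before the index
     * write (assumption: a release fence is sufficient here). */
    atomic_thread_fence(memory_order_release);

    /* Step 3: one index update publishes the whole batch. The index is
     * free-running and wraps naturally at 16 bits. */
    used->idx = (uint16_t)(start + count);
}
```

The point of the design is that the shared index, which the guest reads, is written once per burst instead of once per packet, while the entries themselves stay invisible to the guest until that final write.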