I see a significant performance improvement with these patches, around 5% at 64 bytes.
The one patch that didn't give any performance boost for me was
"vhost: arrange virtio_net fields for better cache sharing".

Tested-by: Rich Lane <rich.lane at bigswitch.com>

On Mon, May 2, 2016 at 5:46 PM, Yuanhan Liu <yuanhan.liu at linux.intel.com> wrote:

> Here is a small patch set does the micro optimization, which brings about
> 10% performance boost in my 64B packet testing, with the following topo:
>
>     pkt generator <----> NIC <-----> Virtio NIC
>
> Patch 1 pre updates the used ring and update them in batch. It should be
> feasible from my understanding: there will be no issue, guest driver will
> not start processing them as far as we haven't updated the "used->idx"
> yet. I could miss something though.
>
> Patch 2 saves one check for small packets (that can be hold in one desc
> buf and mbuf).
>
> Patch 3 moves several frequently used fields into one cache line, for
> better cache sharing.
>
> Note that this patch set is based on my latest vhost ABI refactoring
> patchset.
>
>
> ---
> Yuanhan Liu (3):
>   vhost: pre update used ring for Tx and Rx
>   vhost: optimize dequeue for small packets
>   vhost: arrange virtio_net fields for better cache sharing
>
>  lib/librte_vhost/vhost-net.h  |   8 +--
>  lib/librte_vhost/vhost_rxtx.c | 110 ++++++++++++++++++++++++------------------
>  2 files changed, 68 insertions(+), 50 deletions(-)
>
> --
> 1.9.0
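
For anyone following along, the idea described for patch 1 can be sketched roughly as below. This is not the code from the patch: the struct layout is a simplified stand-in for the virtio used ring, and RING_SIZE and publish_used_batch() are hypothetical names made up for illustration. The point it tries to show is that all used entries are written first, and "used->idx" is bumped once at the end, so the guest should not see a partially filled batch.

```c
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical ring size; must be a power of two */

/* Simplified stand-ins for the virtio used ring structures. */
struct used_elem {
	uint32_t id;  /* head index of the completed descriptor chain */
	uint32_t len; /* number of bytes written for that chain */
};

struct used_ring {
	uint16_t flags;
	volatile uint16_t idx; /* the guest polls this index */
	struct used_elem ring[RING_SIZE];
};

/*
 * Hypothetical helper: fill 'count' used entries first, then publish
 * them all with a single update of used->idx.
 */
static uint16_t
publish_used_batch(struct used_ring *used, const uint32_t *desc_ids,
		   const uint32_t *lens, uint16_t count)
{
	uint16_t start = used->idx;
	uint16_t i;

	/* Step 1: pre-update the entries; idx is untouched so far. */
	for (i = 0; i < count; i++) {
		uint16_t slot = (uint16_t)(start + i) & (RING_SIZE - 1);

		used->ring[slot].id = desc_ids[i];
		used->ring[slot].len = lens[i];
	}

	/* Step 2: make the entries visible before the index moves. */
	__atomic_thread_fence(__ATOMIC_RELEASE);

	/* Step 3: one index update publishes the whole batch. */
	used->idx = (uint16_t)(start + count);

	return used->idx;
}
```

With an empty ring, calling publish_used_batch() for three descriptors leaves idx at 3 with slots 0..2 filled. The fence here is a GCC/Clang builtin used as a placeholder for whatever write barrier the real code uses.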