This is a new set of patches to optimize the mergeable Rx code path. No refactoring (rewrite) was done this time. It just applies some findings from Zhihong (kudos to him!) that improve the mergeable Rx path on the old code.

The two major factors that could improve the performance greatly are:

- copy the virtio header together with the packet data. This removes
  the bubbles between the two copies and optimizes the cache access.

  This is implemented in patch 2 "vhost: optimize cache access"

- shadow the used ring updates and apply them at once

  The basic idea is to update the used ring in a local buffer and flush
  the entries to the virtio used ring at once in the end. Again, this is
  for optimizing the cache access.

  This is implemented in patch 5 "vhost: shadow used ring update"

The two optimizations could yield a 40+% performance gain in micro
testing and 20+% in PVP case testing with 64B packet size.

Besides that, there are some tiny optimizations, such as prefetching the
avail ring (patch 6) and retrieving the avail head once (patch 7).

Note: the shadow used ring technique could also be applied to the non-mrg
Rx path (and even the dequeue path). I didn't do that for two reasons:

- we already update the used ring in batch in both paths; it's not
  shadowed first, though.

- it's a bit too late to make many changes at this stage: RC1 is out.

Please help with testing.

Thanks.

	--yliu

Cc: Jianbo Liu <jianbo.liu at linaro.org>
---
Yuanhan Liu (4):
  vhost: simplify mergeable Rx vring reservation
  vhost: use last avail idx for avail ring reservation
  vhost: prefetch avail ring
  vhost: retrieve avail head once

Zhihong Wang (3):
  vhost: remove useless volatile
  vhost: optimize cache access
  vhost: shadow used ring update

 lib/librte_vhost/vhost.c      |  13 ++-
 lib/librte_vhost/vhost.h      |   5 +-
 lib/librte_vhost/vhost_user.c |  23 +++--
 lib/librte_vhost/virtio_net.c | 193 +++++++++++++++++++++++++-----------------
 4 files changed, 149 insertions(+), 85 deletions(-)

-- 
1.9.0
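
For readers who want a picture of the shadow used ring idea described above, here is a minimal standalone sketch. It is not the code from patch 5: the struct layout, RING_SIZE, and the names shadow_add() and shadow_flush() are all hypothetical, and a real implementation would use the platform's write barrier and a ring shared with the guest. It only shows the pattern of collecting used entries in a local buffer and publishing them with at most two copies plus a single index update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u /* hypothetical ring size; must be a power of two */

struct used_elem {
	uint32_t id;  /* head descriptor index of the consumed chain */
	uint32_t len; /* number of bytes written into that chain */
};

/* Simplified stand-in for the guest-visible used ring. */
struct used_ring {
	uint16_t idx;
	struct used_elem ring[RING_SIZE];
};

/* Hypothetical per-queue state: the shared ring plus a local shadow. */
struct shadow_vq {
	struct used_ring *used;
	uint16_t last_used_idx;
	uint16_t shadow_count;
	struct used_elem shadow[RING_SIZE];
};

/* Record one used entry locally; the shared ring is not touched. */
static void shadow_add(struct shadow_vq *vq, uint32_t id, uint32_t len)
{
	assert(vq->shadow_count < RING_SIZE);
	vq->shadow[vq->shadow_count].id = id;
	vq->shadow[vq->shadow_count].len = len;
	vq->shadow_count++;
}

/* Copy all shadowed entries to the shared ring, then publish the index. */
static void shadow_flush(struct shadow_vq *vq)
{
	uint16_t start = vq->last_used_idx & (RING_SIZE - 1);
	uint16_t n = vq->shadow_count;
	uint16_t room = RING_SIZE - start; /* slots before the ring wraps */

	if (n <= room) {
		memcpy(&vq->used->ring[start], vq->shadow,
		       n * sizeof(vq->shadow[0]));
	} else {
		/* The batch wraps around: copy the tail part, then the rest. */
		memcpy(&vq->used->ring[start], vq->shadow,
		       room * sizeof(vq->shadow[0]));
		memcpy(&vq->used->ring[0], &vq->shadow[room],
		       (n - room) * sizeof(vq->shadow[0]));
	}

	/* Stand-in for a write barrier: entries must be visible before idx. */
	atomic_thread_fence(memory_order_release);

	vq->last_used_idx += n; /* free-running index, wraps at 65536 */
	vq->used->idx = vq->last_used_idx;
	vq->shadow_count = 0;
}
```

In this sketch the enqueue loop would call shadow_add() once per reserved buffer chain and shadow_flush() once per burst, so the shared used ring and its index are written once per burst rather than once per packet.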