On 05/29/2018 11:45 AM, Maxime Coquelin wrote:
Hi,

This second version fixes the feature bit check in
rxvq_is_mergeable(), and removes "mergeable" from the Rx function
names. No difference is seen in the benchmarks.

This series is preliminary work to ease the integration of
packed ring layout support. But even without packed ring
layout, the result is positive.

The first patch unifies both paths, and the second one is a small
optimization to avoid copying the batch_copy_nb_elems VQ field
to/from the stack.

With the series applied, I get a modest performance gain for
both mergeable and non-mergeable cases, and the reduction of
about 300 LoC is non-negligible maintenance-wise.

Rx-mrg=off benchmarks:

+------------+-------+-------------+-------------+----------+
|    Run     |  PVP  | Guest->Host | Host->Guest | Loopback |
+------------+-------+-------------+-------------+----------+
| v18.05-rc5 | 14.47 |    16.64    |    17.57    |  13.15   |
| + series   | 14.87 |    16.86    |    17.70    |  13.30   |
+------------+-------+-------------+-------------+----------+

Rx-mrg=on benchmarks:

+------------+------+-------------+-------------+----------+
|    Run     | PVP  | Guest->Host | Host->Guest | Loopback |
+------------+------+-------------+-------------+----------+
| v18.05-rc5 | 9.38 |    13.78    |    16.70    |  12.79   |
| + series   | 9.38 |    13.80    |    17.49    |  13.36   |
+------------+------+-------------+-------------+----------+

Note: Even without my series, the guest->host benchmark with
mergeable buffers enabled looks suspicious, as it should in
theory be almost identical to the result with Rx mergeable
buffers disabled. To be investigated...

Maxime Coquelin (2):
  vhost: unify Rx mergeable and non-mergeable paths
  vhost: improve batched copies performance

 lib/librte_vhost/virtio_net.c | 376 +++++-------------------------------------
 1 file changed, 37 insertions(+), 339 deletions(-)
Applied to dpdk-next-virtio.

Maxime
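
For readers who have not looked at the code, the feature bit check and the
unified Rx path mentioned in the cover letter can be illustrated with a
minimal sketch. This is not the code from the series: struct sketch_dev,
sketch_rxvq_is_mergeable() and sketch_max_buffers() are hypothetical names
made up for illustration, and the buffer-limit logic is only an assumption
about how one path could serve both cases. The only value taken from
outside is VIRTIO_NET_F_MRG_RXBUF, which the virtio specification defines
as feature bit 15.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number from the virtio specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15

/* Hypothetical, stripped-down device: only the negotiated feature set. */
struct sketch_dev {
	uint64_t features;
};

/*
 * The feature is a bit *number*, so it has to be turned into a mask
 * before being tested against the negotiated features.
 */
static inline bool
sketch_rxvq_is_mergeable(const struct sketch_dev *dev)
{
	return (dev->features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)) != 0;
}

/*
 * Assumed shape of a unified path: the same code handles both cases,
 * and only the number of buffers a single packet may span differs.
 * Without mergeable buffers a packet must fit in one buffer; with them
 * it may span up to the ring size.
 */
static inline uint16_t
sketch_max_buffers(const struct sketch_dev *dev, uint16_t ring_size)
{
	return sketch_rxvq_is_mergeable(dev) ? ring_size : 1;
}
```

With this shape, the non-mergeable case is simply the mergeable case with
a limit of one buffer per packet, which is the kind of simplification that
lets two separate code paths collapse into one.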