The virtio_net structure is used in both the enqueue and dequeue
datapaths. broadcast_rarp is checked with cmpset in the dequeue datapath
regardless of whether descriptors are available.

When dequeue and enqueue are performed by different cores and no packets
are available on the dequeue datapath (i.e. uni-directional traffic),
the frequent checking of broadcast_rarp in dequeue has been observed to
degrade the performance of the enqueue datapath.

In OVS the issue can cause a uni-directional performance drop of up to 15%.

Fix this by moving broadcast_rarp to a different cache line in the
virtio_net struct.

Fixes: a66bcad32240 ("vhost: arrange struct fields for better cache sharing")
Cc: sta...@dpdk.org

Signed-off-by: Kevin Traynor <ktray...@redhat.com>
---
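Note: the cache-line reasoning above can be illustrated with a minimal
sketch. It is not the real struct virtio_net: the layout_before and
layout_after structs, the filler array, the same_cache_line() helper and
the 64-byte line size are all assumptions made for illustration, and the
check only holds if the struct itself starts on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on most current x86 parts. */
#define DEMO_CACHE_LINE_SIZE 64

/* Hypothetical layout with broadcast_rarp next to fields that the
 * other datapath reads (mirrors the ordering removed by the diff). */
struct layout_before {
	uint32_t flags;          /* offset 0 */
	uint16_t vhost_hlen;     /* offset 4 */
	uint16_t broadcast_rarp; /* offset 6, same line as flags */
	uint8_t  filler[120];    /* stand-in for the remaining fields */
};

/* Hypothetical layout with broadcast_rarp moved further down. */
struct layout_after {
	uint32_t flags;          /* offset 0 */
	uint16_t vhost_hlen;     /* offset 4 */
	uint8_t  filler[120];    /* offsets 6..125 */
	uint16_t broadcast_rarp; /* offset 126, second cache line */
};

/* Return 1 if two field offsets fall in the same cache line, assuming
 * the enclosing struct is aligned to DEMO_CACHE_LINE_SIZE. */
static int same_cache_line(size_t off_a, size_t off_b)
{
	return off_a / DEMO_CACHE_LINE_SIZE == off_b / DEMO_CACHE_LINE_SIZE;
}
```

Comparing offsetof(struct layout_before, flags) with
offsetof(struct layout_before, broadcast_rarp) reports a shared line,
while the same comparison on struct layout_after does not. A similar
offsetof() check (or pahole output) on the real struct is one way to
confirm where a field lands after a reordering like this one.
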
 lib/librte_vhost/vhost.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 22564f1..a254328 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -156,6 +156,4 @@ struct virtio_net {
        uint32_t                flags;
        uint16_t                vhost_hlen;
-       /* to tell if we need broadcast rarp packet */
-       rte_atomic16_t          broadcast_rarp;
        uint32_t                virt_qp_nb;
        int                     dequeue_zero_copy;
@@ -167,4 +165,6 @@ struct virtio_net {
        uint64_t                log_addr;
        struct ether_addr       mac;
+       /* to tell if we need broadcast rarp packet */
+       rte_atomic16_t          broadcast_rarp;
 
        uint32_t                nr_guest_pages;
-- 
1.8.3.1
