The broadcast_rarp field in the virtio_net struct is checked in the
dequeue datapath regardless of whether descriptors are available or not.

As it is checked with cmpset leading to a write, false sharing on the
virtio_net struct can happen between enqueue and dequeue datapaths
regardless of whether a RARP is requested. In OVS, the issue can cause
a uni-directional performance drop of up to 15%.

Fix that by only performing the cmpset if a read of broadcast_rarp
indicates that the cmpset is likely to succeed.

Fixes: a66bcad32240 ("vhost: arrange struct fields for better cache sharing")
Cc: sta...@dpdk.org

Signed-off-by: Kevin Traynor <ktray...@redhat.com>
---

V2:
Change from fixing by moving broadcast_rarp location in virtio_net struct
to performing a read before cmpset.

 lib/librte_vhost/virtio_net.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 337470d..d0a3b11 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1057,7 +1057,19 @@ static inline bool __attribute__((always_inline))
         *
         * Check user_send_rarp() for more information.
+        *
+        * broadcast_rarp shares a cacheline in the virtio_net structure
+        * with some fields that are accessed during enqueue and
+        * rte_atomic16_cmpset() causes a write if using cmpxchg. This could
+        * result in false sharing between enqueue and dequeue.
+        *
+        * Prevent unnecessary false sharing by reading broadcast_rarp first
+        * and only performing cmpset if the read indicates it is likely to
+        * be set.
         */
-       if (unlikely(rte_atomic16_cmpset((volatile uint16_t *)
-                                        &dev->broadcast_rarp.cnt, 1, 0))) {
+
+       if (unlikely(rte_atomic16_read(&dev->broadcast_rarp) &&
+                       rte_atomic16_cmpset((volatile uint16_t *)
+                               &dev->broadcast_rarp.cnt, 1, 0))) {
+
                rarp_mbuf = rte_pktmbuf_alloc(mbuf_pool);
                if (rarp_mbuf == NULL) {
-- 
1.8.3.1

Reply via email to