Jakub,

While browsing virtual interfaces in DPDK, I noticed a possible performance 
issue in the memif driver:

In the current layout, "head" shares a cache line with "cookie" and "flags", 
and "tail" sits in the immediately following cache line. If "head" and "tail" 
are accessed by different lcores, that is not far enough apart to prevent 
false sharing-like effects on systems with a next-N-lines hardware prefetcher: 
fetching the line holding "head" (or "flags") also prefetches the line holding 
"tail".

I suggest updating the structure somewhat like this:

-#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
-       alignas(RTE_CACHE_LINE_SIZE) RTE_MARKER mark;
-
-typedef struct {
-       MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+typedef struct __rte_cache_aligned {
        uint32_t cookie;                        /**< MEMIF_COOKIE */
        uint16_t flags;                         /**< flags */
 #define MEMIF_RING_FLAG_MASK_INT 1             /**< disable interrupt mode */
+       RTE_CACHE_GUARD; /* isolate head from flags */
        RTE_ATOMIC(uint16_t) head;              /**< pointer to ring buffer head */
-       MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+       RTE_CACHE_GUARD; /* isolate tail from head */
        RTE_ATOMIC(uint16_t) tail;              /**< pointer to ring buffer tail */
-       MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+       RTE_CACHE_GUARD; /* isolate descriptors from tail */
-       memif_desc_t desc[0];                   /**< buffer descriptors */
+       memif_desc_t desc[];                    /**< buffer descriptors */
} memif_ring_t;
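
To make the resulting layout concrete, here is a standalone sketch that does 
not use the DPDK headers. It assumes a 64-byte cache line and a single guard 
line; GUARD() is a hypothetical stand-in for RTE_CACHE_GUARD, the atomics are 
reduced to plain uint16_t, and uint64_t is only a placeholder for 
memif_desc_t. The asserted offsets hold under those assumptions only:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE  64   /* assumed cache line size in bytes */
#define SKETCH_GUARD_LINES 1    /* assumed number of guard cache lines */

/* Hypothetical stand-in for RTE_CACHE_GUARD: cache-line-aligned padding
 * that spans SKETCH_GUARD_LINES full cache lines. */
#define GUARD(name) \
	alignas(SKETCH_CACHE_LINE) char name[SKETCH_CACHE_LINE * SKETCH_GUARD_LINES]

typedef struct {
	alignas(SKETCH_CACHE_LINE) uint32_t cookie;
	uint16_t flags;
	GUARD(guard0);          /* isolate head from flags */
	uint16_t head;          /* plain integer here, atomic in the real struct */
	GUARD(guard1);          /* isolate tail from head */
	uint16_t tail;
	GUARD(guard2);          /* isolate descriptors from tail */
	uint64_t desc[];        /* placeholder for memif_desc_t */
} ring_sketch_t;

/* Each hot field starts its own cache line, with one unused line between
 * it and the next hot field. */
static_assert(offsetof(ring_sketch_t, flags) == 4, "flags follows cookie");
static_assert(offsetof(ring_sketch_t, head) == 128, "head on line 2");
static_assert(offsetof(ring_sketch_t, tail) == 256, "tail on line 4");
static_assert(offsetof(ring_sketch_t, desc) == 384, "desc on line 6");
```

With one guard line, a next-line prefetch triggered by touching "flags" or 
"head" lands in padding instead of in a line another lcore is writing. A 
prefetcher that reaches further ahead would need more guard lines.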


Med venlig hilsen / Kind regards,
-Morten Brørup
