Jakub,

While browsing virtual interfaces in DPDK, I noticed a possible performance issue in the memif driver:
If "head" and "tail" are accessed by different lcores, they are not sufficiently far away from each other (and other hot fields) to prevent false sharing-like effects on systems with a next-N-lines hardware prefetcher, which will prefetch "tail" when fetching "head", and prefetch "head" when fetching "flags".

I suggest updating the structure somewhat like this:

-#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
-	alignas(RTE_CACHE_LINE_SIZE) RTE_MARKER mark;
-
-typedef struct {
-	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+typedef struct __rte_cache_aligned {
 	uint32_t cookie;			/**< MEMIF_COOKIE */
 	uint16_t flags;				/**< flags */
 #define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	RTE_CACHE_GUARD; /* isolate head from flags */
 	RTE_ATOMIC(uint16_t) head;		/**< pointer to ring buffer head */
-	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	RTE_CACHE_GUARD; /* isolate tail from head */
 	RTE_ATOMIC(uint16_t) tail;		/**< pointer to ring buffer tail */
-	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	RTE_CACHE_GUARD; /* isolate descriptors from tail */
-	memif_desc_t desc[0];			/**< buffer descriptors */
+	memif_desc_t desc[];			/**< buffer descriptors */
 } memif_ring_t;

Med venlig hilsen / Kind regards,
-Morten Brørup