> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> 
> On 2024-08-28 23:04, Morten Brørup wrote:
> > Jakub,
> >
> > While browsing virtual interfaces in DPDK, I noticed a possible performance
> > issue in the memif driver:
> >
> > If "head" and "tail" are accessed by different lcores, they are not
> > sufficiently far away from each other (and other hot fields) to prevent false
> > sharing-like effects on systems with a next-N-lines hardware prefetcher, which
> > will prefetch "tail" when fetching "head", and prefetch "head" when fetching
> > "flags".
> >
> > I suggest updating the structure somewhat like this:
> >
> > -#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
> > -   alignas(RTE_CACHE_LINE_SIZE) RTE_MARKER mark;
> > -
> > -typedef struct {
> > -   MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
> > +typedef struct __rte_cache_aligned {
> >     uint32_t cookie;                        /**< MEMIF_COOKIE */
> >     uint16_t flags;                         /**< flags */
> > #define MEMIF_RING_FLAG_MASK_INT 1          /**< disable interrupt mode */
> > +   RTE_CACHE_GUARD; /* isolate head from flags */
> 
> Wouldn't it be better to cache-align 'head' (or cache-align 'head'
> *and* add an RTE_CACHE_GUARD)? In other words, isn't the purpose of
> RTE_CACHE_GUARD to provide zero or more cache lines of extra padding,
> rather than a mechanism to avoid same-cache line false sharing?

IMO the general purpose of RTE_CACHE_GUARD is to prevent false cache line 
sharing; both sharing of the same cache line (on systems with or without 
speculative prefetching) and sharing of the next cache lines (on systems with 
speculative prefetching).

RTE_CACHE_GUARD provides two things:
1. Zero or more bytes of padding up to cache alignment, which prevents 
same-cache line sharing. This effectively cache aligns the field that follows 
the RTE_CACHE_GUARD, here the "head".
2. Zero or more cache lines of extra padding (configured by 
RTE_CACHE_GUARD_LINES in rte_config.h), which prevents sharing of the next 
cache lines on systems with speculative prefetching.
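
The two effects can be sketched in plain C11 without DPDK headers. This is only an illustration, not DPDK's actual macro: SKETCH_GUARD, SKETCH_CACHE_LINE, SKETCH_GUARD_LINES and struct sketch_pair are hypothetical names, and the 64-byte line size and the single guard line are assumptions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64   /* assumed cache line size in bytes */
#define SKETCH_GUARD_LINES 1   /* assumed; stands in for RTE_CACHE_GUARD_LINES */

/*
 * Hypothetical guard member (not DPDK's definition):
 * - the alignment pads out the rest of the current cache line (effect 1);
 * - the array size adds SKETCH_GUARD_LINES whole lines of padding (effect 2).
 */
#define SKETCH_GUARD(name) \
    alignas(SKETCH_CACHE_LINE) char name[SKETCH_CACHE_LINE * SKETCH_GUARD_LINES]

struct sketch_pair {
    alignas(SKETCH_CACHE_LINE) uint16_t a;  /* written by one lcore */
    SKETCH_GUARD(guard0);                   /* isolate b from a */
    uint16_t b;                             /* written by another lcore */
};

/* Effect 1: the guard starts on the next cache line boundary after "a". */
static_assert(offsetof(struct sketch_pair, guard0) == SKETCH_CACHE_LINE,
              "guard must start on a cache line boundary");

/* Effect 2: "b" is cache aligned and one full guard line away from "a". */
static_assert(offsetof(struct sketch_pair, b) ==
              SKETCH_CACHE_LINE * (1 + SKETCH_GUARD_LINES),
              "b must follow the guard lines");
```

With these assumed values, "a" sits at offset 0, the guard covers bytes 64..127, and "b" lands at offset 128, so the line after "a" holds no hot data.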

My description failed to mention the reason for the RTE_CACHE_GUARD between 
"flags" and "head":

The lcore updating "tail" also reads "flags", and if reading "flags" causes 
that lcore to prefetch the next cache line, it will thereby read the cache line 
holding "head", causing false cache line sharing with the other lcore updating 
"head".

> 
> >     RTE_ATOMIC(uint16_t) head;                      /**< pointer to ring buffer head */
> > -   MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
> > +   RTE_CACHE_GUARD; /* isolate tail from head */
> >     RTE_ATOMIC(uint16_t) tail;                      /**< pointer to ring buffer tail */
> > -   MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
> > +   RTE_CACHE_GUARD; /* isolate descriptors from tail */
> > -   memif_desc_t desc[0];                   /**< buffer descriptors */
> > +   memif_desc_t desc[];                    /**< buffer descriptors */
> > } memif_ring_t;
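
To check where the fields end up, the proposed layout can be modelled in plain C11. This is a rough sketch under stated assumptions, not the memif driver code: sketch_ring_t, sketch_desc_t and the SKETCH_* macros are hypothetical stand-ins, the cache line is assumed to be 64 bytes with one guard line, and plain uint16_t replaces RTE_ATOMIC(uint16_t) because only the offsets matter here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64   /* assumed cache line size in bytes */
#define SKETCH_GUARD_LINES 1   /* assumed; stands in for RTE_CACHE_GUARD_LINES */

/* Hypothetical stand-in for RTE_CACHE_GUARD (not DPDK's definition). */
#define SKETCH_GUARD(name) \
    alignas(SKETCH_CACHE_LINE) char name[SKETCH_CACHE_LINE * SKETCH_GUARD_LINES]

/* Hypothetical placeholder for memif_desc_t; the real layout is not modelled. */
typedef struct { uint8_t bytes[16]; } sketch_desc_t;

typedef struct {
    alignas(SKETCH_CACHE_LINE) uint32_t cookie;
    uint16_t flags;          /* read by both lcores */
    SKETCH_GUARD(guard0);    /* isolate head from flags */
    uint16_t head;           /* updated by one lcore */
    SKETCH_GUARD(guard1);    /* isolate tail from head */
    uint16_t tail;           /* updated by the other lcore */
    SKETCH_GUARD(guard2);    /* isolate descriptors from tail */
    sketch_desc_t desc[];    /* buffer descriptors */
} sketch_ring_t;

/* Index of the cache line (relative to the struct start) holding a member. */
#define SKETCH_LINE_OF(member) \
    (offsetof(sketch_ring_t, member) / SKETCH_CACHE_LINE)

/* A next-line prefetch triggered by reading "flags" (line 0) pulls in
 * line 1, which is guard padding; "head" lives on line 2. */
static_assert(SKETCH_LINE_OF(flags) == 0, "flags shares line 0 with cookie");
static_assert(SKETCH_LINE_OF(head) == 2, "head is one guard line past flags");
static_assert(SKETCH_LINE_OF(tail) == 4, "tail is one guard line past head");
static_assert(SKETCH_LINE_OF(desc) == 6, "desc is one guard line past tail");
```

Under these assumptions the header grows to 384 bytes before the first descriptor, which is the price of keeping "flags", "head" and "tail" at least one unused line apart.
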
> >
> >
> > Med venlig hilsen / Kind regards,
> > -Morten Brørup
> >
