> From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> Sent: Wednesday, 31 January 2024 22.09
>
> On Wed, Jan 31, 2024 at 10:18:37AM +0100, Morten Brørup wrote:
> > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > Sent: Wednesday, 31 January 2024 00.26
> > >
[...]
> > > 	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
> > >
> > > 	/* second cache line - fields only used in slow path or on TX */
> > > -	RTE_MARKER cacheline1 __rte_cache_min_aligned;
> > > +	union {
> > > +		void *cacheline1;
> >
> > The __rte_cache_min_aligned cannot be removed. It provides cache line
> > alignment for 32-bit platforms, where pointers in the first cache line
> > only use 4 bytes.
>
> oh no, i forgot i needed to figure this out before submission. now that
> it's here though, i could use some help / suggestions.
>
> the existing __rte_cache_min_aligned (and indeed standard alignas)
> facilities are not of great utility when applied to anonymous unions,
> and further complicating things, it also has to work with C++.
>
> i'll take this away and work on it some more, but does anyone here have
> a suggestion on how to align this anonymous union data member to the
> desired alignment *without* the union being padded to min cache line
> size and, as a consequence, causing the rte_mbuf struct to be 3 instead
> of 2 cache lines? (that's essentially the problem i need help solving)

I would suggest simply removing __rte_cache_min_aligned (and the implicit
padding that comes with it), and instead conditionally (#ifdef RTE_ARCH_32)
adding an explicit uintptr_t padding field after each pointer field in the
rte_mbuf struct's first cache line.

But that would break the 32-bit ABI, so instead insert the explicit padding
in the rte_mbuf struct at the end of its first cache line (where the
implicit padding from __rte_cache_min_aligned is currently added),
something like this:

	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */

+#ifdef RTE_ARCH_32
+	/* Padding to ensure correct alignment of cacheline1. */
+	uintptr_t pad_buf_addr;
+#if !RTE_IOVA_IN_MBUF
+	uintptr_t pad_next;
+#endif
+#endif /* RTE_ARCH_32 */

	/* second cache line - fields only used in slow path or on TX */

To be on the safe side, add a static_assert to verify that
offsetof(struct rte_mbuf, cacheline1) == RTE_CACHE_LINE_MIN_SIZE.
This should hold on all architectures, both 64-bit and 32-bit.