> -----Original Message-----
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Monday, April 1, 2019 12:41 PM
> To: Eads, Gage <gage.e...@intel.com>; dev@dpdk.org
> Cc: olivier.m...@6wind.com; arybche...@solarflare.com; Richardson, Bruce
> <bruce.richard...@intel.com>; Ananyev, Konstantin
> <konstantin.anan...@intel.com>; Gavin Hu (Arm Technology China)
> <gavin...@arm.com>; nd <n...@arm.com>; tho...@monjalon.net; nd
> <n...@arm.com>
> Subject: RE: [PATCH v3 1/8] stack: introduce rte stack library
>
>
> > > > +static ssize_t
> > > > +rte_stack_get_memsize(unsigned int count) {
> > > > +	ssize_t sz = sizeof(struct rte_stack);
> > > > +
> > > > +	/* Add padding to avoid false sharing conflicts */
> > > > +	sz += RTE_CACHE_LINE_ROUNDUP(count * sizeof(void *)) +
> > > > +			2 * RTE_CACHE_LINE_SIZE;
> > > I did not understand how the false sharing is caused and how this
> > > padding is solving the issue. Verbose comments would help.
> >
> > The additional padding (beyond the CACHE_LINE_ROUNDUP) is to prevent
> > false sharing caused by adjacent/next-line hardware prefetchers. I'll
> > address this.
> >
> Is it not a generic problem? Or is it specific to this library?
This is not limited to this library, but it only affects systems with (enabled) next-line prefetchers, for example Intel products with an L2 adjacent-cache-line prefetcher[1]. On those systems, the additional padding can potentially improve performance. As I understand it, this was the reason behind the 128B alignment added to rte_ring a couple of years ago[2].

[1] https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
[2] http://mails.dpdk.org/archives/dev/2017-February/058613.html