> -----Original Message-----
> From: David Marchand <david.march...@redhat.com>
> Sent: Monday, January 13, 2020 3:17 PM
> To: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> Cc: dev <dev@dpdk.org>; Thomas Monjalon <tho...@monjalon.net>; Olivier
> Matz <olivier.m...@6wind.com>; Andrew Rybchenko
> <arybche...@solarflare.com>; Bruce Richardson
> <bruce.richard...@intel.com>; Ananyev, Konstantin
> <konstantin.anan...@intel.com>; Hemant Agrawal
> <hemant.agra...@nxp.com>; Shahaf Shuler <shah...@mellanox.com>;
> Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Gavin Hu
> <gavin...@arm.com>; vikto...@rehivetech.com; David Christensen
> <d...@linux.vnet.ibm.com>; Burakov, Anatoly <anatoly.bura...@intel.com>;
> dpdk stable <sta...@dpdk.org>; Kevin Traynor <ktray...@redhat.com>; Luca
> Boccassi <bl...@debian.org>
> Subject: [EXT] Re: [dpdk-stable] [dpdk-dev] [PATCH v3] mempool: fix mempool
> obj alignment for non x86
>
> External Email
>
> ----------------------------------------------------------------------
> On Mon, Jan 13, 2020 at 7:49 AM <jer...@marvell.com> wrote:
> >
> > From: Jerin Jacob <jer...@marvell.com>
> >
> > The existing optimize_object_size() function addresses the memory object
> > alignment constraint on x86 for better performance.
> >
> > Different (micro) architectures may have different memory alignment
> > constraints for better performance, and it is not the same as in the
> > existing optimize_object_size().
> >
> > Some use an XOR (kind of CRC) scheme to enable DRAM channel distribution
> > based on the address, and some may have a different formula.
> >
> > Introducing arch_mem_object_align() function to abstract the
> > difference between different (micro) architectures, to avoid wasting
> > memory for mempool object alignment on architectures that are
> > not required to do so.
> >
> > Details on the amount of memory saving:
> >
> > Currently, arm64 based architectures use the default (nchan=4, nrank=1).
> > The worst case is for an object whose size (including mempool
> > header) is 2 cache lines, where it is optimized to 3 cache lines (+50%).
> >
> > Examples for cache lines size = 64:
> >   orig     optimized
> >   64    -> 64           +0%
> >   128   -> 192          +50%
> >   192   -> 192          +0%
> >   256   -> 320          +25%
> >   320   -> 320          +0%
> >   384   -> 448          +16%
> >   ...
> >   2304  -> 2368         +2.7%   (~mbuf size)
> >
> > Additional details:
> > https://www.mail-archive.com/dev@dpdk.org/msg149157.html
> >
> > Fixes: af75078fece3 ("first public release")
>
> Weird to flag this as a problem in this sha1.
> x86 was the only architecture supported at the time.
> Either we mark the introduction of new architectures as the point of backport,
> or we remove this tag and just let Cc: sta...@dpdk.org
While committing, the maintainer can take either decision. No issues or opinion on this from my side.

>
> > Cc: sta...@dpdk.org
>
> It seems more an optimisation than a fix to me, but in any case, the stable
> maintainers will be the judges.

OK. No issues.

>
> >
> > Signed-off-by: Jerin Jacob <jer...@marvell.com>
> > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > ---
> > v3:
> > - Change comment for MEMPOOL_F_NO_SPREAD flag as " Spreading among
> > memory channels not required." (Stephen Hemminger)
> >
> > v2:
> > - Changed the return type of arch_mem_object_align() to "unsigned int" from
> > "unsigned" to fix the checkpatch issues (Olivier Matz)
> > - Updated the comments for MEMPOOL_F_NO_SPREAD (Olivier Matz)
> > - Update the git comments to share the memory saving details.
> >
> >  doc/guides/prog_guide/mempool_lib.rst |  6 +++---
> >  lib/librte_mempool/rte_mempool.c      | 17 +++++++++++++----
> >  lib/librte_mempool/rte_mempool.h      |  3 ++-
> >  3 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst
> > index 3bb84b0a6..eea7a2906 100644
> > --- a/doc/guides/prog_guide/mempool_lib.rst
> > +++ b/doc/guides/prog_guide/mempool_lib.rst
> > @@ -27,10 +27,10 @@ In debug mode (CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is enabled),
> >  statistics about get from/put in the pool are stored in the mempool structure.
> >  Statistics are per-lcore to avoid concurrent access to statistics counters.
> >
> > -Memory Alignment Constraints
> > -----------------------------
> > +Memory Alignment Constraints on X86 architecture
> > +------------------------------------------------
>
> Nit: afaics in the docs, x86 is preferred to X86.
>
> >
> > -Depending on hardware memory configuration, performance can be greatly
> > improved by adding a specific padding between objects.
> > +Depending on hardware memory configuration on X86 architecture,
> > performance can be greatly improved by adding a specific padding between
> > objects.
> >  The objective is to ensure that the beginning of each object starts on a
> > different channel and rank in memory so that all channels are equally loaded.
> >
> >  This is particularly true for packet buffers when doing L3 forwarding or
> > flow classification.
>
>
> --
> David Marchand
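
For reference, the size table quoted in the commit message can be reproduced with a small standalone sketch. It assumes the padding rule is "round the object up to whole cache lines, then add cache lines until the count is coprime with nchan * nrank", which matches every row of the quoted table for nchan=4, nrank=1. The names below (SKETCH_CACHE_LINE, sketch_gcd, sketch_spread_size, sketch_plain_size, sketch_check_examples) are hypothetical and are not the functions touched by the patch.

```c
#include <assert.h>

/* Assumption: 64-byte cache lines, as in the quoted examples. */
#define SKETCH_CACHE_LINE 64u

/* Greatest common divisor (Euclid's algorithm). */
static unsigned int
sketch_gcd(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Spread variant (hypothetical helper): round the size up to whole cache
 * lines, then add lines until the line count is coprime with
 * nchan * nrank. Assumes nchan and nrank are both non-zero.
 */
static unsigned int
sketch_spread_size(unsigned int obj_size, unsigned int nchan,
		   unsigned int nrank)
{
	unsigned int lines =
		(obj_size + SKETCH_CACHE_LINE - 1) / SKETCH_CACHE_LINE;

	while (sketch_gcd(lines, nchan * nrank) != 1)
		lines++;
	return lines * SKETCH_CACHE_LINE;
}

/* No-spread variant (hypothetical helper): only round up to a cache line. */
static unsigned int
sketch_plain_size(unsigned int obj_size)
{
	return (obj_size + SKETCH_CACHE_LINE - 1) / SKETCH_CACHE_LINE *
		SKETCH_CACHE_LINE;
}

/* Check the rows quoted in the commit message (nchan=4, nrank=1). */
static void
sketch_check_examples(void)
{
	assert(sketch_spread_size(64, 4, 1) == 64);
	assert(sketch_spread_size(128, 4, 1) == 192);
	assert(sketch_spread_size(192, 4, 1) == 192);
	assert(sketch_spread_size(256, 4, 1) == 320);
	assert(sketch_spread_size(320, 4, 1) == 320);
	assert(sketch_spread_size(384, 4, 1) == 448);
	assert(sketch_spread_size(2304, 4, 1) == 2368);
}
```

With plain cache-line rounding, sketch_plain_size(128) stays at 128 instead of growing to 192, which is the saving the commit message describes for architectures that do not need the spreading.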