> -----Original Message-----
> From: David Marchand <david.march...@redhat.com>
> Sent: Monday, January 13, 2020 3:17 PM
> To: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> Cc: dev <dev@dpdk.org>; Thomas Monjalon <tho...@monjalon.net>; Olivier
> Matz <olivier.m...@6wind.com>; Andrew Rybchenko
> <arybche...@solarflare.com>; Bruce Richardson
> <bruce.richard...@intel.com>; Ananyev, Konstantin
> <konstantin.anan...@intel.com>; Hemant Agrawal
> <hemant.agra...@nxp.com>; Shahaf Shuler <shah...@mellanox.com>;
> Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Gavin Hu
> <gavin...@arm.com>; vikto...@rehivetech.com; David Christensen
> <d...@linux.vnet.ibm.com>; Burakov, Anatoly <anatoly.bura...@intel.com>;
> dpdk stable <sta...@dpdk.org>; Kevin Traynor <ktray...@redhat.com>; Luca
> Boccassi <bl...@debian.org>
> Subject: [EXT] Re: [dpdk-stable] [dpdk-dev] [PATCH v3] mempool: fix mempool
> obj alignment for non x86
> 
> On Mon, Jan 13, 2020 at 7:49 AM <jer...@marvell.com> wrote:
> >
> > From: Jerin Jacob <jer...@marvell.com>
> >
> > The existing optimize_object_size() function addresses the memory object
> > alignment constraint on x86 for better performance.
> >
> > Different (micro) architectures may have different memory alignment
> > constraints for better performance, and these are not the same as the
> > one implemented by the existing optimize_object_size().
> >
> > Some use an XOR (kind of CRC) scheme to enable DRAM channel distribution
> > based on the address, and some may have a different formula.
> >
> > Introduce the arch_mem_object_align() function to abstract the
> > differences between (micro) architectures and to avoid wasting
> > memory on mempool object alignment for architectures that do
> > not require it.
> >
> > Details on the amount of memory saving:
> >
> > Currently, arm64 based architectures use the default (nchan=4,
> > nrank=1). The worst case is for an object whose size (including
> > mempool header) is 2 cache lines, where it is optimized to 3 cache
> > lines (+50%).
> >
> > Examples for cache lines size = 64:
> >   orig     optimized
> >   64    -> 64           +0%
> >   128   -> 192          +50%
> >   192   -> 192          +0%
> >   256   -> 320          +25%
> >   320   -> 320          +0%
> >   384   -> 448          +16%
> >   ...
> >   2304  -> 2368         +2.7%  (~mbuf size)
> >
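For readers following the numbers above, here is a minimal sketch that reproduces the table. It assumes the padding rule is "round the object up to whole cache lines, then grow it until the line count is coprime with nchan * nrank". The names spread_object_size(), gcd_u() and CACHE_LINE are hypothetical and only illustrative; they are not DPDK API.

```c
#include <assert.h>

#define CACHE_LINE 64u /* assumed cache line size, as in the table */

/* Greatest common divisor (Euclid's algorithm). */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Hypothetical helper: round obj_size up to whole cache lines, then
 * add lines until the count shares no factor with nchan * nrank, so
 * that consecutive objects start on different channels/ranks.
 */
static unsigned int spread_object_size(unsigned int obj_size,
				       unsigned int nchan,
				       unsigned int nrank)
{
	unsigned int lines = (obj_size + CACHE_LINE - 1) / CACHE_LINE;

	/* nchan and nrank must both be non-zero. */
	assert(nchan * nrank != 0);

	while (gcd_u(lines, nchan * nrank) != 1)
		lines++;
	return lines * CACHE_LINE;
}
```

With the defaults quoted above (nchan=4, nrank=1), spread_object_size(128, 4, 1) gives 192 and spread_object_size(2304, 4, 1) gives 2368, matching the rows in the table. An architecture that does not need the spreading would simply keep obj_size unchanged.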
> > Additional details:
> > https://www.mail-archive.com/dev@dpdk.org/msg149157.html
> >
> > Fixes: af75078fece3 ("first public release")
> 
> Weird to flag this as a problem in this sha1.
> x86 was the only architecture supported at the time.
> Either we mark the introduction of new architectures as the point of backport,
> or we remove this tag and just let Cc: sta...@dpdk.org

While committing, the maintainer can take either of these decisions. I have
no issue with, or opinion on, either one.

> 
> > Cc: sta...@dpdk.org
> 
> It seems more an optimisation than a fix to me, but in any case, the stable
> maintainers will be the judges.


OK. No issues.

> 
> 
> >
> > Signed-off-by: Jerin Jacob <jer...@marvell.com>
> > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > ---
> > v3:
> > - Change comment for MEMPOOL_F_NO_SPREAD flag as " Spreading among
> > memory channels not required." (Stephen Hemminger)
> >
> > v2:
> > - Changed the return type of arch_mem_object_align() to "unsigned int" from
> >   "unsigned" to fix the checkpatch issues (Olivier Matz)
> > - Updated the comments for MEMPOOL_F_NO_SPREAD (Olivier Matz)
> > - Update the git comments to share the memory saving details.
> >
> >  doc/guides/prog_guide/mempool_lib.rst |  6 +++---
> >  lib/librte_mempool/rte_mempool.c      | 17 +++++++++++++----
> >  lib/librte_mempool/rte_mempool.h      |  3 ++-
> >  3 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst
> > index 3bb84b0a6..eea7a2906 100644
> > --- a/doc/guides/prog_guide/mempool_lib.rst
> > +++ b/doc/guides/prog_guide/mempool_lib.rst
> > @@ -27,10 +27,10 @@ In debug mode (CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is enabled),
> >  statistics about get from/put in the pool are stored in the mempool structure.
> >  Statistics are per-lcore to avoid concurrent access to statistics counters.
> >
> > -Memory Alignment Constraints
> > -----------------------------
> > +Memory Alignment Constraints on X86 architecture
> > +------------------------------------------------
> 
> Nit: afaics in the docs, x86 is preferred to X86.
> 
> 
> >
> > -Depending on hardware memory configuration, performance can be greatly improved by adding a specific padding between objects.
> > +Depending on hardware memory configuration on X86 architecture, performance can be greatly improved by adding a specific padding between objects.
> >  The objective is to ensure that the beginning of each object starts on a different channel and rank in memory so that all channels are equally loaded.
> >
> >  This is particularly true for packet buffers when doing L3 forwarding or flow classification.
> 
> 
> --
> David Marchand
