Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation

Jerin Jacob Fri, 09 Jun 2017 10:29:54 -0700

-----Original Message-----
> Date: Fri, 9 Jun 2017 10:16:25 -0700
> From: Stephen Hemminger <[email protected]>
> To: Yerden Zhumabekov <[email protected]>
> Cc: "Ananyev, Konstantin" <[email protected]>, "Richardson,
>  Bruce" <[email protected]>, "Verkamp, Daniel"
>  <[email protected]>, "[email protected]" <[email protected]>
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Fri, 9 Jun 2017 18:47:43 +0600
> Yerden Zhumabekov <[email protected]> wrote:
> 
> > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > >  
> > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic 
> > >>>> in rte_ring.h?  
> > >>> Might be, would be good to hear the opinion of the author of that change.  
> > >> It gives improved performance for core-2-core transfer.  
> > > You mean empty cache-line(s) after prod/cons, correct?
> > > That's ok, but why can't we keep them and the whole rte_ring aligned on 
> > > cache-line boundaries?
> > > Something like this:
> > > struct rte_ring {
> > >     ...
> > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > };
> > >
> > > Konstantin
> > >  
> > 
> > I'm curious, can anyone explain how it actually affects 
> > performance? Maybe we can utilize it in application code?
> 
> I think it is because on Intel CPUs the CPU will speculatively fetch 
> adjacent cache lines.
> If these cache lines change, then it will create false sharing.


I see. In such cases, I think it is better to abstract this as conditional
compilation. The above logic has the worst-case cache memory
requirement when the CPU has 128B cache lines and no speculative prefetch.
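
As a rough illustration of what such conditional compilation could look
like, here is a minimal sketch. All SKETCH_* names and the sketch_* structs
are hypothetical and are not existing DPDK config options or the real
rte_ring layout. The assumption is that the build knows the cache line size
and whether the CPU prefetches the adjacent line, and doubles the prod/cons
alignment only in the latter case:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time knobs (assumptions, not real DPDK options). */
#ifndef SKETCH_CACHE_LINE_SIZE
#define SKETCH_CACHE_LINE_SIZE 64
#endif
#ifndef SKETCH_ADJACENT_LINE_PREFETCH
#define SKETCH_ADJACENT_LINE_PREFETCH 1
#endif

/*
 * Keep prod and cons two cache lines apart only when the CPU may
 * speculatively fetch the adjacent line; otherwise one line is enough.
 */
#if SKETCH_ADJACENT_LINE_PREFETCH
#define SKETCH_PROD_CONS_ALIGN (2 * SKETCH_CACHE_LINE_SIZE)
#else
#define SKETCH_PROD_CONS_ALIGN SKETCH_CACHE_LINE_SIZE
#endif

/* Simplified stand-in for struct rte_ring_headtail. */
struct sketch_headtail {
    volatile uint32_t head;
    volatile uint32_t tail;
};

/* Simplified stand-in for struct rte_ring (GCC/clang attribute syntax). */
struct sketch_ring {
    uint32_t size;
    uint32_t mask;
    struct sketch_headtail prod
        __attribute__((aligned(SKETCH_PROD_CONS_ALIGN)));
    struct sketch_headtail cons
        __attribute__((aligned(SKETCH_PROD_CONS_ALIGN)));
};

/* Distance in bytes between the producer and consumer indexes. */
static inline size_t sketch_prod_cons_gap(void)
{
    return offsetof(struct sketch_ring, cons) -
           offsetof(struct sketch_ring, prod);
}

/* The two index pairs must never share a cache line. */
static_assert(offsetof(struct sketch_ring, cons) -
              offsetof(struct sketch_ring, prod) >= SKETCH_CACHE_LINE_SIZE,
              "prod and cons share a cache line");
```

With these defaults (64B cache lines, adjacent-line prefetch) prod and cons
end up 128 bytes apart. Building with -DSKETCH_CACHE_LINE_SIZE=128
-DSKETCH_ADJACENT_LINE_PREFETCH=0 keeps them 128 bytes apart as well,
instead of 256.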
