Thanks for your comment. The raw number of cache misses is simply higher in
almost every function. While hwloc is indeed useful, the core assignment is
exactly the same in all cases. All we change with "-l" is how many additional,
*unused* cores are declared. The cores that are actually used run in the same
place, with the same resources and the same configuration.

The only thing that may change is what DPDK does with those unused cores, e.g.
allocating more (unused) per-core caches. That could shift the allocation of
the other caches, buffers, etc. for the used cores, leading to a more "unlucky"
alignment and more contention.

I'm trying to reproduce this with the smallest possible modification of testpmd
so that other people can observe it too.

Thanks,

Tom
________________________________________
From: David Christensen <d...@linux.vnet.ibm.com>
Sent: Friday, 25 October 2019 19:35
To: Tom Barbette; dev@dpdk.org
Subject: Re: [dpdk-dev] Performance impact of "declaring" more CPU cores

> The only useful observation we made is that when we are in a "bad case",
> the LLC has more cache misses.

Have you looked closely at the CPU topology on your platform? Can you
provide some examples here of what you're seeing?  The hwloc package is
very useful in visualizing how your logical cores map to CPU cache.
There may be benefit in more strategically selecting the lcores you use
to reduce LLC cache misses.

Dave
