The only useful observation we made is that when we are in a "bad case", the LLC has more cache misses.

Have you looked closely at the CPU topology on your platform? Can you provide some examples of what you're seeing? The hwloc package is very useful for visualizing how your logical cores map to CPU caches. There may be benefit in selecting the lcores you use more strategically to reduce LLC cache misses.
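
If hwloc is not installed on the target, a rough equivalent can be pulled from sysfs. The following is only a minimal sketch, assuming a Linux system that exposes cache information under /sys/devices/system/cpu; the helper names parse_cpu_list and llc_groups are made up for illustration and are not part of hwloc or any other library. It groups logical CPUs by the highest-level cache they share, which should give a first idea of which lcores sit behind the same LLC.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8-11' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def llc_groups(sysfs_root="/sys/devices/system/cpu"):
    """Map each highest-level cache's shared_cpu_list to the CPUs sharing it.

    Assumption: the deepest cache level reported for a CPU is its LLC.
    """
    groups = {}
    for cpu_dir in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        best_level, best_shared = -1, None
        for idx in glob.glob(os.path.join(cpu_dir, "cache", "index[0-9]*")):
            try:
                with open(os.path.join(idx, "level")) as f:
                    level = int(f.read())
                with open(os.path.join(idx, "shared_cpu_list")) as f:
                    shared = f.read().strip()
            except (OSError, ValueError):
                continue  # cache info missing or unreadable for this index
            if level > best_level:
                best_level, best_shared = level, shared
        if best_shared is not None:
            groups[best_shared] = parse_cpu_list(best_shared)
    return groups


if __name__ == "__main__":
    for shared, cpus in sorted(llc_groups().items(), key=lambda kv: kv[1]):
        print(f"LLC shared by CPUs {shared}: {cpus}")
```

Comparing the groups printed here with the lcore list you run on in a good case and in a bad case may show whether the bad runs spread work across different LLC domains. The lstopo tool from hwloc gives the same picture graphically.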

Dave
