On 8/8/2018 9:47 AM, Peter Zijlstra wrote:
> On Wed, Aug 08, 2018 at 03:55:54PM +0000, Luck, Tony wrote:
>>> So _why_ doesn't this work? As said by Tony, that first call should
>>> prime the caches, so the second and third calls should not generate any
>>> misses.
>>
>> How much code/data is involved? If there is a lot, then you may be unlucky
>> with cache coloring and the later parts of the "prime the caches" code path
>> may evict some lines loaded in the early parts.
> 
> Well, Reinette used perf_event_read_local() which is unfortunately quite
> a bit. But the inline I proposed is a single load and depending on
> rdpmcl() or native_read_pmc() a call to or just a single inline asm
> rdpmc instruction.
> 
> That should certainly work I think.

I am in the process of testing this variation.

Reinette


Reply via email to