On Fri, 07 Nov, at 10:08:04AM, Peter Zijlstra wrote:
>
> How is that supposed to work? You call __intel_cqm_event_count() on the
> one cpu per socket, but then you use a local_add, not an atomic_add,
> even though these adds can happen concurrently as per IPI broadcast.

Ouch, right. That's broken.
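
To illustrate the point about concurrent adds, here is a minimal userspace
sketch. It is an analogy only, not the kernel code under discussion: it
uses C11 atomics and pthreads in place of atomic_add and IPI handlers, and
the names total, worker(), run_workers(), NR_WORKERS and ADDS_PER_WORKER
are all hypothetical. Several threads add into one shared counter at the
same time; with an atomic read-modify-write no update is lost, whereas a
plain non-atomic add would give no such guarantee.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NR_WORKERS	4	/* stands in for "one cpu per socket" (assumption) */
#define ADDS_PER_WORKER	100000

/* Shared accumulator that every worker updates concurrently. */
static atomic_long total;

/* Each worker adds its delta repeatedly with an atomic read-modify-write. */
static void *worker(void *arg)
{
	long delta = *(long *)arg;
	int i;

	for (i = 0; i < ADDS_PER_WORKER; i++)
		atomic_fetch_add(&total, delta);

	return NULL;
}

/* Start all workers, wait for them to finish, and return the final sum. */
long run_workers(void)
{
	pthread_t tid[NR_WORKERS];
	long delta = 1;
	int i;

	atomic_store(&total, 0);

	for (i = 0; i < NR_WORKERS; i++)
		pthread_create(&tid[i], NULL, worker, &delta);

	for (i = 0; i < NR_WORKERS; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&total);
}
```

Calling run_workers() returns NR_WORKERS * ADDS_PER_WORKER. If total were a
plain long updated with "total += delta", the concurrent updates would be a
data race and the result could come up short.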

> Also, I think smp_call_function_many() ignores the current cpu, if this
> cpu happens to be the cpu for this socket, you're up some creek without
> no paddle, right?

OK, I didn't realise that. Yeah that sounds very problematic. I think my
eyes skipped over the word "other" in the smp_call_function_many() docs,

 * smp_call_function_many(): Run a function on a set of other CPUs.

So, the correct way to do this is to iterate over cqm_cpumask and invoke
smp_call_function_single(), right?

> Thirdly, there is no serialization around calling perf_event_count() [or
> your pmu::count method] so you cannot temporarily put it to 0.

Urgh, thanks. Good spot. I'm gonna have to think of a suitable
serialisation mechanism because all the current ones are pretty
heavy-handed. And of course, there's the added fun that it needs to be
held across the IPIs. Perhaps a per-cache-group mutex?

-- 
Matt Fleming, Intel Open Source Technology Center