On Fri, 10 Mar 2017, David Carrillo-Cisneros wrote:
> > Fine. So we need this for ONE particular use case. And if that is not well
> > documented including the underlying mechanics to analyze the data then this
> > will be a nice source of confusion for Joe User.
> >
> > I still think that this can be done differently while keeping the overhead
> > small.
> >
> > You look at this from the existing perf mechanics which require high
> > overhead context switching machinery. But that's just wrong because that's
> > not how the cache and bandwidth monitoring works.
> >
> > Contrary to the other perf counters, CQM and MBM are based on a context
> > selectable set of counters which do not require readout and reconfiguration
> > when the switch happens.
> >
> > Especially with CAT in play, the context switch overhead is there already
> > when CAT partitions need to be switched. So switching the RMID at the same
> > time is basically free, if we are smart enough to do an equivalent to the
> > CLOSID context switch mechanics and ideally combine both into a single MSR
> > write.
> >
> > With that the low overhead periodic sampling can read N counters which are
> > related to the monitored set and provide N separate results. For bandwidth
> > the aggregation is a simple ADD and for cache residency it's pointless.
> >
> > Just because perf was designed with the regular performance counters in
> > mind (way before that CQM/MBM stuff came around) does not mean that we
> > cannot change/extend that if it makes sense.
> >
> > And looking at the way Cache/Bandwidth allocation and monitoring works, it
> > makes a lot of sense. Definitely more than shoving it into the current mode
> > of operandi with duct tape just because we can.
> >
>
> You made a point. The use case I described can be better served with
> the low overhead monitoring groups that Fenghua is working on. Then
> that info can be merged with the per-CPU profile collected for non-RDT
> events.
>
> I am ok removing the perf-like CPU filtering from the requirements.
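
For illustration only, the combined CLOSID/RMID switch and the "simple ADD"
aggregation from the quoted text could look roughly like the sketch below.
The IA32_PQR_ASSOC layout (RMID in bits 9:0, CLOSID in bits 63:32) is taken
from the Intel SDM. Everything else is an assumption made for this sketch:
struct pqr_state, fake_wrmsr(), pqr_pack(), pqr_sched_in() and mbm_total()
are hypothetical names, not existing kernel code, and the MSR write is
simulated so that it builds in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IA32_PQR_ASSOC per the Intel SDM: RMID in bits 9:0, CLOSID in bits 63:32. */
#define MSR_IA32_PQR_ASSOC 0x0c8fu
#define PQR_RMID_MASK      0x3ffu

/* Hypothetical per-CPU cache of the last value written to the MSR. */
struct pqr_state {
	uint64_t cur;        /* last value "written" (0 == CLOSID 0, RMID 0) */
	unsigned int writes; /* number of simulated MSR writes */
};

/* Stand-in for wrmsr(); a kernel would issue the privileged instruction. */
static void fake_wrmsr(struct pqr_state *s, uint32_t msr, uint64_t val)
{
	(void)msr;
	s->cur = val;
	s->writes++;
}

/* Pack CLOSID and RMID into one 64-bit PQR_ASSOC value. */
static uint64_t pqr_pack(uint32_t closid, uint32_t rmid)
{
	assert(rmid <= PQR_RMID_MASK);
	return ((uint64_t)closid << 32) | rmid;
}

/*
 * Context-switch hook: one MSR write updates both the allocation class
 * and the monitoring ID, and it is skipped if nothing changed.
 */
static void pqr_sched_in(struct pqr_state *s, uint32_t closid, uint32_t rmid)
{
	uint64_t val = pqr_pack(closid, rmid);

	if (val != s->cur)
		fake_wrmsr(s, MSR_IA32_PQR_ASSOC, val);
}

/* Bandwidth aggregation over N per-RMID byte counts is a plain sum. */
static uint64_t mbm_total(const uint64_t *per_rmid_bytes, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += per_rmid_bytes[i];
	return sum;
}
```

With a zero-initialised struct pqr_state, calling pqr_sched_in() twice with
the same CLOSID/RMID pair results in a single simulated write, which is the
"basically free" part: the RMID rides along with the write that CAT needs
anyway.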

So if I'm not missing something then ALL remaining requirements can be solved
with the RDT integrated monitoring mechanics, right?

Thanks,

	tglx