On Wed, Jan 18, 2017 at 12:53 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> On Tue, 17 Jan 2017, Shivappa Vikas wrote:
>> On Tue, 17 Jan 2017, Thomas Gleixner wrote:
>> > On Fri, 6 Jan 2017, Vikas Shivappa wrote:
>> > > - Issue(1): Inaccurate data for per package data, systemwide. Just prints
>> > > zeros or arbitrary numbers.
>> > >
>> > > Fix: Patches fix this by just throwing an error if the mode is not
>> > > supported.
>> > > The modes supported is task monitoring and cgroup monitoring.
>> > > Also the per package
>> > > data for say socket x is returned with the -C <cpu on socketx> -G cgrpy
>> > > option.
>> > > The systemwide data can be looked up by monitoring root cgroup.
>> >
>> > Fine. That just lacks any comment in the implementation. Otherwise I would
>> > not have asked the question about cpu monitoring. Though I fundamentaly
>> > hate the idea of requiring cgroups for this to work.
>> >
>> > If I just want to look at CPU X why on earth do I have to set up all that
>> > cgroup muck? Just because your main focus is cgroups?
>>
>> The upstream per cpu data is broken because its not overriding the other task
>> event RMIDs on that cpu with the cpu event RMID.
>>
>> Can be fixed by adding a percpu struct to hold the RMID thats affinitized
>> to the cpu, however then we miss all the task llc_occupancy in that - still
>> evaluating it.
>
> The point here is that CQM is closely connected to the cache allocation
> technology. After a lengthy discussion we ended up having
>
>  - per cpu CLOSID
>  - per task CLOSID
>
> where all tasks which do not have a CLOSID assigned use the CLOSID which is
> assigned to the CPU they are running on.
>
> So if I configure a system by simply partitioning the cache per cpu, which
> is the proper way to do it for HPC and RT usecases where workloads are
> partitioned on CPUs as well, then I really want to have an equaly simple
> way to monitor the occupancy for that reservation.
>
> And looking at that from the CAT point of view, which is the proper way to
> do it, makes it obvious that CQM should be modeled to match CAT.
>
> So lets assume the following:
>
>    CPU 0-3   default   CLOSID 0
>    CPU 4               CLOSID 1
>    CPU 5               CLOSID 2
>    CPU 6               CLOSID 3
>    CPU 7               CLOSID 3
>
>    T1                  CLOSID 4
>    T2                  CLOSID 5
>    T3                  CLOSID 6
>    T4                  CLOSID 6
>
>    All other tasks use the per cpu defaults, i.e. the CLOSID of the CPU
>    they run on.
>
> then the obvious basic monitoring requirement is to have a RMID for each
> CLOSID.
>
> So when I monitor CPU4, i.e. CLOSID 1 and T1 runs on CPU4, then I do not
> care at all about the occupancy of T1 simply because that is running on a
> seperate reservation. Trying to make that an aggregated value in the first
> place is completely wrong. If you want an aggregate, which is pretty much
> useless, then user space tools can generate it easily.
>
> The whole approach you and David have taken is to whack some desired cgroup
> functionality and whatever into CQM without rethinking the overall
> design. And that's fundamentaly broken because it does not take cache (and
> memory bandwidth) allocation into account.
>
> I seriously doubt, that the existing CQM/MBM code can be refactored in any
> useful way. As Peter Zijlstra said before: Remove the existing cruft
> completely and start with completely new design from scratch.
>
> And this new design should start from the allocation angle and then add the
> whole other muck on top so far its possible. Allocation related monitoring
> must be the primary focus, everything else is just tinkering.
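To check that I am following the model you describe: at sched_in time the
resolution would look roughly like the sketch below, with one RMID per
CLOSID. All of the names here (cpu_closid, closid_to_rmid, __rdt_sched_in,
MAX_CLOSID) are made up for illustration, this is not the current code:

    #include <linux/percpu.h>
    #include <linux/sched.h>
    #include <asm/msr.h>

    #define MAX_CLOSID 16                   /* illustrative, not the real limit */

    static DEFINE_PER_CPU(u32, cpu_closid); /* CLOSID assigned to the CPU itself */
    static u32 closid_to_rmid[MAX_CLOSID];  /* 1:1 mapping: one RMID per CLOSID */

    static void __rdt_sched_in(struct task_struct *next)
    {
            u32 closid, rmid;

            /* A task without its own CLOSID inherits the CLOSID of the CPU
             * it is running on. */
            closid = next->closid ? next->closid : this_cpu_read(cpu_closid);
            rmid = closid_to_rmid[closid];

            /* IA32_PQR_ASSOC takes the RMID in the low half and the CLOSID
             * in the high half, so allocation and monitoring switch together. */
            wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid);
    }

With that, monitoring CPU 4 in your example is just reading the RMID mapped
to CLOSID 1, with no aggregation over the tasks that happen to run there.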
If in this email you meant "Resource group" where you wrote "CLOSID", then
please disregard my previous email. It seems like a good idea to me to have a
1:1 mapping between RMIDs and "Resource groups". The distinction matters
because changing the schemata in the resource group would likely trigger a
change of CLOSID, which is useful. (A rough sketch of what I mean is in the
P.S. below.)

Thanks,
David
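P.S. Purely illustrative sketch of the distinction I am drawing; the struct
and function names are made up:

    /* The RMID stays with the resource group for its lifetime; the CLOSID
     * may not, e.g. if rewriting the schemata causes a different CLOSID to
     * be allocated for the group. */
    struct rdt_group {
            u32 closid;     /* backing allocation, may change with the schemata */
            u32 rmid;       /* monitoring id, fixed for the life of the group */
    };

    static void rdtgroup_set_schemata(struct rdt_group *rg, u32 new_closid)
    {
            rg->closid = new_closid;        /* rg->rmid deliberately untouched */
    }

That way the occupancy data follows the group across a schemata change
instead of being tied to whichever CLOSID happens to back it.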