On Wed, 18 Jan 2017, Stephane Eranian wrote:
> On Wed, Jan 18, 2017 at 12:53 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> >
> Your use case is specific to HPC and not Web workloads we run. Jobs run
> in cgroups which may span all the CPUs of the machine. CAT may be used
> to partition the cache. Cgroups would run inside a partition. There may
> be multiple cgroups running in the same partition. I can understand the
> value of tracking occupancy per CLOSID, however that granularity is not
> enough for our use case. Inside a partition, we want to know the
> occupancy of each cgroup to be able to assign blame to the top
> consumer. Thus, there needs to be a way to monitor occupancy per
> cgroup. I'd like to understand how your proposal would cover this use
> case.

The point I'm making, as I explained to David, is that we need to start
from the allocation angle. Of course you can monitor different tasks or
task groups inside an allocation.

> Another important aspect is that CQM measures new allocations, thus to
> get total occupancy you need to be able to monitor the thread, CPU,
> CLOSID or cgroup from the beginning of execution. In the case of a cgroup
> from the moment where the first thread is scheduled into the cgroup. To
> do this an RMID needs to be assigned from the beginning to the entity to
> be monitored. It could be by creating a CQM event just to cause an RMID
> to be assigned as discussed earlier on this thread. And then if a perf
> stat is launched later it will get the same RMID and report full
> occupancy. But that requires the first event to remain alive, i.e., some
> process must keep the file descriptor open, i.e., need some daemon or a
> perf stat running in the background.

That's fine, but there must be a less convoluted way to do that. The
currently proposed stuff is simply horrible because it lacks any form of
design and is just hacked into submission.

> There are also use cases where you want CQM without necessarily enabling
> CAT, for instance, if you want to know the cache footprint of a workload
> to estimate whether it could be co-located with others.

That's a subset of the other stuff because it's all bound to CLOSID 0. So
you can again monitor tasks or task groups separately.

Thanks,

	tglx
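
To illustrate the point quoted above that CQM only accounts for new allocations, here is a toy model, not a description of the actual hardware or kernel code. It assumes a cache line is tagged with the RMID that was active when it was filled and keeps that tag on later hits, and it ignores evictions. The names ToyCache, DEFAULT_RMID, GROUP_RMID and run() are hypothetical and exist only for this sketch.

```python
from collections import Counter

DEFAULT_RMID = 0  # hypothetical: RMID in effect before monitoring starts
GROUP_RMID = 1    # hypothetical: RMID assigned to the monitored cgroup


class ToyCache:
    """Toy cache: each line remembers the RMID active when it was filled."""

    def __init__(self):
        self.lines = {}  # address -> RMID recorded at fill time

    def access(self, addr, rmid):
        # Only a miss (a new allocation) tags the line.
        # A hit leaves the existing tag untouched.
        if addr not in self.lines:
            self.lines[addr] = rmid

    def occupancy(self):
        # Number of lines currently attributed to each RMID.
        return Counter(self.lines.values())


def run(workload, assign_at):
    """Replay `workload`, switching to GROUP_RMID at step `assign_at`."""
    cache = ToyCache()
    for step, addr in enumerate(workload):
        rmid = GROUP_RMID if step >= assign_at else DEFAULT_RMID
        cache.access(addr, rmid)
    return cache.occupancy()[GROUP_RMID]


# Touch 8 lines, then touch the same 8 lines again.
workload = list(range(8)) + list(range(8))

early = run(workload, assign_at=0)  # RMID assigned from the start
late = run(workload, assign_at=8)   # RMID assigned after the lines were filled

print(early, late)
```

In this model the early assignment reports all 8 lines, while the late one reports none, because the second pass only hits lines that were already filled under the default RMID. That is the undercounting the quoted paragraph describes when the RMID is not assigned from the beginning of execution.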