On Thu, 30 Jul 2015, Tejun Heo wrote:

Hello, Vikas.

On Wed, Jul 01, 2015 at 03:21:06PM -0700, Vikas Shivappa wrote:
This patch adds a cgroup subsystem for Intel Resource Director
Technology (RDT) feature and Class of Service (CLOSid) management which is
part of common RDT framework.  This cgroup would eventually be used by
all sub-features of RDT and hence be associated with the common RDT
framework as well as sub-feature specific framework.  However current
patch series only adds cache allocation sub-feature specific code.

When a cgroup directory is created it has a CLOSid associated with it
which is inherited from its parent.  The CLOSid is mapped to a
cache_mask which represents the L3 cache allocation to the cgroup.
Tasks belonging to the cgroup get to fill the cache represented by the
cache_mask.
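
The CLOSid-to-cache_mask mapping described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the names clos_table, rdt_group, rdt_group_inherit, rdt_group_mask and the MAX_CLOSID value are all hypothetical.

```c
#include <stdint.h>

#define MAX_CLOSID 4 /* hypothetical number of classes of service */

/* Hypothetical table: one L3 capacity bitmask per CLOSid. */
struct clos_table {
	uint32_t cache_mask[MAX_CLOSID];
};

/* A group only records which CLOSid it is associated with. */
struct rdt_group {
	unsigned int closid;
};

/* A newly created child starts with its parent's CLOSid,
 * and therefore with the same cache_mask. */
static void rdt_group_inherit(struct rdt_group *child,
			      const struct rdt_group *parent)
{
	child->closid = parent->closid;
}

/* Look up the mask that tasks in this group are allowed to fill. */
static uint32_t rdt_group_mask(const struct clos_table *t,
			       const struct rdt_group *g)
{
	return t->cache_mask[g->closid];
}
```

In this sketch a child that inherits CLOSid 1 resolves to whatever mask is stored for CLOSid 1, so changing that one table entry changes the allocation for every group sharing the CLOSid.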

First of all, I apologize for being so late.  I've been thinking about
it but the thoughts didn't quite crystallize (which isn't to say that
it's very crystal now) until recently.  If I understand correctly,
there are a couple suggested use cases for explicitly managing cache
usage.

1. Pinning known hot areas of memory in cache.

No, cache allocation doesn't do this (nor is it expected to).


2. Explicitly regulating cache usage so that cacheline allocation can
  be better than CPU itself doing it.

Yes, this is what we want to do using cache allocation.


#1 isn't part of this patchset, right?  Is there any plan for working
towards this too?

Cache allocation is not intended to do #1, so we don't have to support this.


For #2, it is likely that the targeted use cases would involve threads
of a process or at least cooperating processes and having a simple API
which just goes "this (or the current) thread is only gonna use this
part of cache" would be a lot easier to use and actually beneficial.

I don't really think it makes sense to implement a fully hierarchical
cgroup solution when there isn't the basic affinity-adjusting
interface and it isn't clear whether fully hierarchical resource
distribution would be necessary especially given that the granularity
of the target resource is very coarse.

I can see how cpuset would seem to invite this sort of usage but
cpuset itself is more of an arbitrary outgrowth (regardless of
history) in terms of resource control and most things controlled by
cpuset already have counterpart interfaces which are readily accessible
to the normal applications.

Yes, today we don't have an alternative interface, but we can always build one. We simply don't have one because until now the Linux kernel just tolerated the degradation that could occur from cache contention, and this is the first interface we are building.


Given that what the feature allows is restricting usage rather than
granting anything exclusively, a programmable interface wouldn't need
to worry about complications around privileges while being able to
reap most of the benefits in a much easier way.  Am I missing
something?


For #2, the intel_rdt cgroup gives us a framework in which the user can regulate cache allocation. A user space app could also eventually use this as the underlying support and build on top of it, depending on enterprise or other requirements.

A typical use case would be an application that continuously pollutes the cache (a low priority app from a cache usage perspective) by bringing in data from the network (a copying/streaming app), and so keeps an app with a legitimate need for the cache (a high priority app) from using it.

We need to map a group of tasks to a particular class of service, and we need a way for the user to specify the cache capacity for that class of service. We also need a default cgroup which can hold all the tasks and use all of the cache. The hierarchical interface can be used by the user as required and does not really interfere with allocating exclusive blocks of cache: all the user needs to do is make sure the masks don't overlap.
The user can configure a mask to be exclusive of the others.
But note that overlapping masks provide a very easy way to share cache usage, which is what you may want to do sometimes. The current implementation can easily be extended to *enforce* exclusive capacity masks between child nodes if required. But since the super user is expected to be the one using this, the need for that may be limited, or the user can still take care of it as I said above. Some of the emails may have given the confusing impression that we cannot do exclusive allocations, but that's not altogether true: we can configure the masks to give different cgroups exclusive cache blocks; it is just left to the user...
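
To make the overlap point concrete, here is a minimal sketch of how exclusivity between capacity masks could be checked. It is not taken from the patch series; masks_overlap and masks_exclusive are hypothetical helper names, and the masks are assumed to fit in 32 bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the two capacity masks share at least one cache block. */
static bool masks_overlap(uint32_t a, uint32_t b)
{
	return (a & b) != 0;
}

/* True if no two masks in the array share a bit, i.e. every group
 * would get cache blocks that no other group in the array can fill. */
static bool masks_exclusive(const uint32_t *masks, size_t n)
{
	uint32_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (masks[i] & seen)
			return false;
		seen |= masks[i];
	}
	return true;
}
```

For example, masks 0x0F and 0xF0 are exclusive, while 0x0F and 0x3C share two blocks. A check like masks_exclusive() over the child nodes is the kind of test an enforcing variant would need, under the assumptions above.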


We did have a lot of discussion during the design and V3, if you remember, and had settled on using a separate controller. Below is one such thread where we discussed the same. I don't want to loop through it again with this already full marathon patch :)

https://lkml.org/lkml/2015/1/27/846

Quick copy from the V3 thread:
"

proposal but was removed as we did not get agreement on lkml.

the original lkml thread is here from 10/2014 for your reference -
https://lkml.org/lkml/2014/10/16/568

Yeap, I followed that thread and this being a separate controller definitely makes a lot more sense.

"



Thanks,
Vikas

Thanks.

--
tejun
