This is an update to my generic containers patch. The major change is support for multiple hierarchies of containers (up to a limit specified at build time).
- The mount options passed when mounting a container filesystem indicate the set of controllers/subsystems that are wanted in the hierarchy - e.g. "mount -t container -o cpuset,numtasks container /foo" - Default is to try to mount all subsystems - if a hierarchy with the requested set of subsystems already exists then its superblock is reused - otherwise (as long as all the requested subsystems are currently not in use in any hierarchy) a new hierarchy is created. - hierarchies with more than one container (i.e. with any children of the root container) persist even when unmounted; - /proc/containers shows current hierarchy/subsystem details - /proc/<pid>/container shows one line for each active hierarchy Other changes include: - ported to 2.6.19-rc5 - per-subsystem/per-container state is no longer just a void * - it has some state maintained by the container framework (to handle moving subsystems in and out of hierarchies when they are created/released) Note that this hasn't yet undergone intensive testing following the multi-hierarchy introduction, but I wanted to get the basic idea out for comments. TODOs include: - figuring out a nice way to handle release notifications now that there are multiple hierarchies ------------------------------------- There have recently been various proposals floating around for resource management/accounting subsystems in the kernel, including Res Groups, User BeanCounters and others. These all need the basic abstraction of being able to group together multiple processes in an aggregate, in order to track/limit the resources permitted to those processes, and all implement this grouping in different ways. Already existing in the kernel is the cpuset subsystem; this has a process grouping mechanism that is mature, tested, and well documented (particularly with regards to synchronization rules). This patchset extracts the process grouping code from cpusets into a generic container system, and makes the cpusets code a client of the container system. It also provides a very simple additional container subsystem to do per-container CPU usage accounting; this is primarily to demonstrate use of the container subsystem API, but is useful in its own right. The change is implemented in five stages plus an additional example patch: 1) extract the process grouping code from cpusets into a standalone system 2) remove the process grouping code from cpusets and hook into the container system 3) convert the container system to present a generic multi-hierarchy API, and make cpusets a client of that API 4) add a simple CPU accounting container subsystem as an example 5) add support for fork/exit callbacks iff some subsystem is interested in them 6) example of implementing ResGroups and its numtasks controller over generic containers - not intended to be applied with this patch set The intention is that the various resource management efforts can also become container clients, with the result that: - the userspace APIs are (somewhat) normalised - it's easier to test out e.g. the ResGroups CPU controller in conjunction with the UBC memory controller - the additional kernel footprint of any of the competing resource management systems is substantially reduced, since it doesn't need to provide process grouping/containment, hence improving their chances of getting into the kernel Signed-off-by: Paul Menage <[EMAIL PROTECTED]> -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/