> Hi Balbin,
>
> I am very interested in the memory control of containers,
> but I don't know where I can find the patches that provide
> this functionality, or how to use the code; are there any
> examples or use cases?
>
> Could you give me some advice?
mmotm (mm of the moment == the latest -mm tree) is here:
Hi
> > > As Alan Cox suggested/wondered in this thread,
> > > http://lkml.org/lkml/2009/1/12/235 , this is a container group based
> > > approach
> > > to override the oom killer selection without losing all the benefits of
> > > the
> > > current oom killer heuristics and oom_adj interface.
> On Tue, 27 Jan 2009, KOSAKI Motohiro wrote:
>
> > Confused.
> >
> > As far as I know, people want a flexible way of treating the cache,
> > but the oom killer seems less flexible than userland notification.
> >
> > Why do you think notification is bad?
> >
Hi Evgeniy,
> On Mon, Jan 26, 2009 at 11:51:27PM -0800, David Rientjes
> (rient...@google.com) wrote:
> > Yeah, I proposed making /dev/mem_notify a client of cgroups there
> > in http://marc.info/?l=linux-kernel&m=123200623628685
> >
> > How do you replace the oom killer's capability of
Hi
> On Tue, Feb 10, 2009 at 6:41 AM, KOSAKI Motohiro
> wrote:
> > Hi
> >
> > I periodically test the kernel with a stress workload.
> > Unfortunately, recent kernels don't survive >24H.
> >
> > It panicked with the following stack.
> > Do you have any
> new
> reference to the user_ns, which we've already put in free_user
> before scheduling remove_user_sysfs_dir().
>
> Reported-by: KOSAKI Motohiro
> Signed-off-by: Serge E. Hallyn
> Acked-by: David Howells
> Tested-by: Ingo Molnar
>> I'd prefer #ifdef rather than #ifndef.
>>
>> so...
>>
>> #ifdef CONFIG_CGROUP_SWAP_RES_CTLR
>> your definition
>> #else
>> original definition
>> #endif
>>
> OK.
> I'll change it.
Thanks.
>> and vm_swap_full() isn't a page-granularity operation;
>> it is a memory (or swap) cgroup operation.
> One option is to limit the virtual address space usage of the cgroup to
> ensure that swap usage of a cgroup will *not* exceed the specified limit.
> Along with a good swap controller, it should provide good control over the
> cgroup's memory usage.
unfortunately, it doesn't work in the real world.
Hi,
> +#ifndef CONFIG_CGROUP_SWAP_RES_CTLR
> /* Swap 50% full? Release swapcache more aggressively.. */
> -#define vm_swap_full() (nr_swap_pages*2 < total_swap_pages)
> +#define vm_swap_full(page) (nr_swap_pages*2 < total_swap_pages)
> +#else
> +#define vm_swap_full(page) swap_cgroup_vm_swap_full(page)
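With the #ifdef ordering suggested above, the hunk would read roughly as
follows (just a sketch completing the truncated quote;
swap_cgroup_vm_swap_full() is the cgroup-aware helper the patch introduces):

    #ifdef CONFIG_CGROUP_SWAP_RES_CTLR
    /* ask the swap cgroup whether this page's group is over its limit */
    #define vm_swap_full(page)	swap_cgroup_vm_swap_full(page)
    #else
    /* Swap 50% full? Release swapcache more aggressively.. */
    #define vm_swap_full(page)	(nr_swap_pages*2 < total_swap_pages)
    #endif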
> > Have you seen any real world example of this?
>
> At the unsophisticated end, there are lots of (Fortran) HPC applications
> with very large static array declarations that only "use" a small fraction
> of them. Those users know they only need a small fraction and are happy
> to volunteer smal
# pgrep fork_bomb|wc -l
98
future work:
- discuss this with the cgroup folks some more.
Signed-off-by: KOSAKI Motohiro <[EMAIL PROTECTED]>
CC: Li Zefan <[EMAIL PROTECTED]>
CC: Paul Menage <[EMAIL PROTECTED]>
---
include/linux/cgroup.h |    5 -
inc
Hi
Thank you for the careful review.
>> + struct task_cgroup *taskcg;
>> +
>> + if ((max_tasks > INT_MAX) ||
>> + (max_tasks < INT_MIN))
>
> It should be < -1 I think.
OK.
I'll fix it at next post.
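For reference, the corrected check would be something like this (sketch,
assuming -1 keeps meaning "unlimited" as in the rest of the patch):

    if (max_tasks > INT_MAX || max_tasks < -1)
        return -EINVAL;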
>> + spin_lock(&taskcg->lock);
>> + if (max_tasks < taskcg->nr_tasks)
>> +
Hi Nishimura-san,
Thanks, that's a good point.
>> +#include
> I don't think it's needed.
> Or, are you planning to implement this feature by using res_counter?
you are right.
My early version used res_counter, but it isn't used currently.
>> + spin_lock(&taskcg->lock);
>> + if (max_tasks <
> You are not handling task migration between groups, i.e.:
># echo $$ > /cgroup/1/tasks
># echo $$ > /cgroup/2/tasks
Oh, nice catch.
Thanks!
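To handle migration, I'd add can_attach()/attach() callbacks that charge the
destination group and uncharge the source group. Roughly like this (untested
sketch; the callback signatures and the task_cgroup_from_cgrp() helper are
illustrative, not taken from the posted patch):

    static int task_cgroup_can_attach(struct cgroup_subsys *ss,
                                      struct cgroup *cgrp,
                                      struct task_struct *tsk)
    {
        struct task_cgroup *taskcg = task_cgroup_from_cgrp(cgrp);
        int ret = 0;

        spin_lock(&taskcg->lock);
        /* refuse the migration if the destination is already full */
        if (taskcg->max_tasks >= 0 &&
            taskcg->nr_tasks >= taskcg->max_tasks)
            ret = -EBUSY;
        else
            taskcg->nr_tasks++;	/* charge the destination */
        spin_unlock(&taskcg->lock);
        return ret;
    }

    static void task_cgroup_attach(struct cgroup_subsys *ss,
                                   struct cgroup *cgrp,
                                   struct cgroup *old_cgrp,
                                   struct task_struct *tsk)
    {
        struct task_cgroup *old = task_cgroup_from_cgrp(old_cgrp);

        /* uncharge the group the task came from */
        spin_lock(&old->lock);
        old->nr_tasks--;
        spin_unlock(&old->lock);
    }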
> Maybe you need to add this in cgroup_init_subsys():
>
> - need_forkexit_callback |= ss->fork || ss->exit;
> + need_forke
>
> - need_forkexit_callback will be read only after system boot.
> - use_task_css_set_links will be read only after it's set.
>
> And these 2 variables are checked when a new process is forked.
>
> Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
nice :)
A
Hi
> Hi Kosaki,
>
> The basic idea of a task-limiting subsystem is good, thanks.
Thanks.
> > -void cgroup_fork(struct task_struct *child)
> > +int cgroup_fork(struct task_struct *child)
> > {
> > + int i;
> > + int ret;
> > +
> > + for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
> > I created a new cgroup subsystem that restricts the number of tasks.
> > Please send any comments!
>
> Would it make more sense to implement this as part of an rlimit subsystem,
> which also supports limiting e.g. address space, CPU time, number of open
> files, etc.? If we create one subsystem per resource, I'm afraid
# mount -t cgroup -o task none /dev/cgroup
# mkdir /dev/cgroup/foo
# cd /dev/cgroup/foo
# ls
notify_on_release task.max_tasks task.nr_tasks tasks
# echo 100 > task.max_tasks
# echo $$ > tasks
# fork_bomb 1000 & <- try to create 1000 processes
# pgrep fork_bomb | wc -l
98
Signed-off-by: KOSAKI Motohiro
Hi Paul,
very sorry for the late response.
> > +struct task_cgroup {
> > + struct cgroup_subsys_state css;
> > + /*
> > +  * the counter to account for the number of threads.
> > +  */
> > + int max_tasks;
> > + int nr_tasks;
> > +
> > + spinlock_t lock;
> > +};
>
>
Hi Randy,
sorry for such a late response.
REALLY REALLY thank you for your professional review!
I'll merge it.
> > Index: b/init/Kconfig
> > ===
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -289,6 +289,16 @@ config CGROUP_DEBUG
Hi
> > v1 -> v2
> > o implement can_attach() and attach() method.
> > o remove the can_fork member from cgroup_subsys;
> >   instead, copy_process() calls task_cgroup_can_fork() directly.
> > o fixed a deadlock bug.
> > o added Documentation/controllers/task.txt
> >
>
> I think Documentation/k
Hi Peter,
> > # mount -t cgroup -o task none /dev/cgroup
> > # mkdir /dev/cgroup/foo
> > # cd /dev/cgroup/foo
> > # ls
> > notify_on_release task.max_tasks task.nr_tasks tasks
> > # echo 100 > task.max_tasks
> > # echo $$ > tasks
> > # fork_bomb 1000 & <- try to create 1000 processes
> > # pgrep f
> I guess you didn't notice this comment ? :)
Agghhh, you are right.
sorry.
> >> > --- a/kernel/fork.c
> >> > +++ b/kernel/fork.c
> >> > @@ -54,6 +54,7 @@
> >> > #include
> >> > #include
> >> > #include
> >> > +#include
> >> >
> >> > #include
> >> > #include
> >> > @@ -920,6 +921,8 @@
> > > fwiw I think the name sucks!
> > > cgroups are about grouping tasks, so what's a task cgroup?
> >
> > Sorry, my English skill is very low.
> > I want to restrict the number of tasks.
> >
> > Do you have any name idea?
>
> Sadly I suck at naming too ;-(
>
> perhaps task_limit controller, or nr_ta
> > Bad performance on the charge/uncharge?
> >
> > The only difference I can see is that res_counter uses
> > spin_lock_irqsave()/spin_unlock_irqrestore(), and you're using plain
> > spin_lock()/spin_unlock().
> >
> > Is the overhead of a pushf/cli/popf really going to matter compared
> > with t
>> I am going to convert the spinlock in the task limit cgroup to an atomic_t.
>> The task limit cgroup has the following characteristics:
>>  - many writes (fork, exit)
>>  - few reads
>>  - fork() is a performance-sensitive system call.
>
> This is true, but I don't see how it can be more performance-sensi
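Concretely, the atomic_t version of the fork-time charge could look like
this (untested sketch; it assumes nr_tasks becomes an atomic_t and it
ignores concurrent changes of max_tasks):

    /* charge one task at fork(); returns 0 on success */
    static int task_cgroup_charge(struct task_cgroup *taskcg)
    {
        int max = taskcg->max_tasks;

        if (max < 0) {
            /* -1 means unlimited */
            atomic_inc(&taskcg->nr_tasks);
            return 0;
        }
        /* increment only while the counter stays below the limit */
        if (!atomic_add_unless(&taskcg->nr_tasks, 1, max))
            return -EAGAIN;
        return 0;
    }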
>> Or, if you strongly want the task_limit subsystem to use res_counter,
>> I can work on improving res_counter performance instead.
>
> I would prefer that as it would help a common cause and avoid
> duplication/maintenance overhead.
okay :)
CC'ed Paul Jackson.
It seems to be a typical ABBA deadlock.
I think cpuset takes cgroup_lock() by mistake.
IMHO, cpuset_handle_cpuhp() shouldn't take cgroup_lock() and
shouldn't call rebuild_sched_domains().
-> #1 (cgroup_mutex){--..}:
[] __lock_acquire+0xf45/0x1040
[] lock_acquire+0x98/0xd0
Hi
> Expand the template sys_checkpoint and sys_restart to be able to dump
> and restore a single task. The task's address space may consist of only
> private, simple vma's - anonymous or file-mapped.
>
> This big patch adds a mechanism to transfer data between kernel or user
> space to and from
Hi
fork() is very important for system performance.
I worry about a performance regression even when this feature isn't used.
Could you measure the spawn benchmark in UnixBench?
> Hi Pavel,
>
> Here is the 'hijack' patch that was mentioned during the namespaces
> part of the containers mini-summit. It's a
Hi
nice minutes!
Below are just my notes.
> Control Groups
> ==
>
> 1. Multiphase locking - Paul brought up his multi phase locking design and
> suggested approaches to implementing them. The problem with control groups
> currently is that transactions cannot be atomically committed. I
Hi balbir-san,
Thank you for the nice minutes.
They are very helpful for people who weren't invited (including me).
> 10. Freezer subsystem - The freezer system was discussed briefly. Serge
> mentioned the patches and wanted to collect feedback (if any) on them.
Who uses it?
AFAIK the freezer is used by HPC guys
Hi
>> here's a patch to implement memory.min_usage,
>> which controls the minimum memory usage for a cgroup.
>>
>> it works similarly to mlock;
>> global memory reclamation doesn't reclaim memory from
>> cgroups whose memory usage is below the value.
>> setting it too high is a dangerous operation
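In other words, the reclaim-side check amounts to something like the
following (sketch; mem_cgroup_usage() and min_usage are illustrative names,
not necessarily the exact ones in the patch):

    /* during global reclaim, leave alone groups still under their floor */
    static bool mem_cgroup_below_min_usage(struct mem_cgroup *memcg)
    {
        return mem_cgroup_usage(memcg) < memcg->min_usage;
    }

    /* ... in the reclaim scan loop: */
    if (global_reclaim && mem_cgroup_below_min_usage(memcg))
        continue;	/* don't reclaim from this cgroup */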
> Currently the problem we are hitting is that we cannot specify pdflush
> to have background limits less than 1% of memory. I am currently
> finishing up a patch right now that adds a dirty_ratio_millis
> interface. I hope to submit the patch to LKML by the end of the week.
>
> The idea is that
> > We don't have any motivation for changing its interface.
>
> We are seeing problems where we are generating a lot of dirty memory
> from asynchronous background writes while more important traffic is
> operating with DIRECT_IO. The DIRECT_IO traffic will incur high
> latency spikes as the pdflush
> Hi Kamezawa-san,
>
> I have a suggestion for css_id numbering. How about using the same
> numbering scheme as process ID for css_id instead of idr? It prevents
> the same ID from being reused quickly.
> blkio-cgroup can benefit from it. If some pages still have an old ID,
> it can somewhat preve
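For what it's worth, a PID-like scheme could be had by handing out ids
cyclically instead of lowest-free. With a recent idr API this is roughly
(sketch; CSS_ID_MAX, css_idr_lock, and the availability of
idr_alloc_cyclic() in the target tree are all assumptions):

    /* allocate ids in increasing order, wrapping at CSS_ID_MAX, so a
     * just-freed id is not immediately reused */
    static int alloc_css_id_cyclic(struct idr *idr, void *css)
    {
        int id;

        spin_lock(&css_idr_lock);
        id = idr_alloc_cyclic(idr, css, 1, CSS_ID_MAX, GFP_ATOMIC);
        spin_unlock(&css_idr_lock);
        return id;	/* negative errno on failure */
    }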
> On Mon, Jul 13, 2009 at 5:16 PM, Vladislav
> Buzov wrote:
> >
> > The following sequence of patches introduces memory usage limit
> > notification capability to the Memory Controller cgroup.
> >
> > This is v3 of the implementation. The major difference from the previous
> > version is that it is based
> On Tue, Jul 7, 2009 at 5:56 PM, KAMEZAWA
> Hiroyuki wrote:
> >
> > I know people like to wait on a file descriptor to get notifications
> > these days.
> > Can't we have an "event" file descriptor in the cgroup layer and make it
> > reusable for other purposes?
>
> I agree - rather than having
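For comparison, this is how such an eventfd-style interface could look from
userspace (untested sketch; it assumes a memcg-style cgroup.event_control
file that takes "<event_fd> <target_fd> <threshold>"):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/eventfd.h>

    int main(void)
    {
        int efd = eventfd(0, 0);
        int ufd = open("/cgroup/foo/memory.usage_in_bytes", O_RDONLY);
        int cfd = open("/cgroup/foo/cgroup.event_control", O_WRONLY);
        char buf[64];
        uint64_t count;

        if (efd < 0 || ufd < 0 || cfd < 0) {
            perror("open");
            return 1;
        }
        /* register: notify efd when usage crosses 64M */
        snprintf(buf, sizeof(buf), "%d %d %llu", efd, ufd,
                 (unsigned long long)(64ULL << 20));
        if (write(cfd, buf, strlen(buf)) < 0) {
            perror("event_control");
            return 1;
        }
        /* read() blocks until the threshold is crossed */
        read(efd, &count, sizeof(count));
        printf("usage crossed the threshold\n");
        return 0;
    }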
>
>
> Subject: [RFC][v7][PATCH 9/9]: Document clone2() syscall
>
> This gives a brief overview of the clone2() system call. We should
> eventually describe more details in existing clone(2) man page or in
> a new man page.
>
> Changelog[v7]:
> - Rename clone_with_pids() to clone2()
>
> On Wed, Nov 04, 2009 at 01:25:46PM -0800, Paul Menage wrote:
> > On Wed, Nov 4, 2009 at 9:35 AM, Matt Helsley wrote:
> > >
> > > If anything, "standardizing" the mount point(s) will likely provide a
> > > false sense of uniformity and we'll get some bad userspace scripts/tools
> > > that
> Peter Zijlstra wrote:
> > NAK, I really utterly dislike that inatomic argument. The alloc side
> > doesn't function in atomic context either. Please keep the thing
> > symmetric in that regards.
>
> Excuse me. kmalloc(GFP_KERNEL) may sleep (and therefore cannot be used in
> atomic context). Howe
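To illustrate the asymmetry: GFP_KERNEL may sleep, so it is only usable in
process context, while GFP_ATOMIC never sleeps but can fail under memory
pressure (trivial sketch, not from the patch under discussion):

    #include <linux/slab.h>

    void *alloc_example(size_t size, bool in_atomic)
    {
        /* GFP_KERNEL may sleep to reclaim memory: process context only.
         * GFP_ATOMIC never sleeps but may return NULL under pressure. */
        return kmalloc(size, in_atomic ? GFP_ATOMIC : GFP_KERNEL);
    }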
> -static int notify_on_release(const struct cgroup *cgrp)
> +static inline int notify_on_release(const struct cgroup *cgrp)
> {
> return test_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
> }
>
> -static int clone_children(const struct cgroup *cgrp)
> +static inline int clone_children(const
es changed, 3 insertions(+), 3 deletions(-)
Reviewed-by: KOSAKI Motohiro
(12/11/11 9:45 AM), Glauber Costa wrote:
> There is no reason to have a flags field, and then a separate
> bool field just to indicate whether the 'none' subsystem was explicitly
> requested.
>
> Make it a flag
>
> Signed-off-by: Glauber Costa
IOW a /proc namespace coupled to cgroup scope would do what you want.
Now my head hurts..
Mine too. The idea is good, but too broad. It boils down to: how do you
couple them? And none of the methods I thought about seemed to make any
sense.
If we really want to have the values in /proc being opt