[Devel] Re: [PATCH v5 09/14] memcg: kmem accounting lifecycle management

2012-10-17 Thread Michal Hocko
On Wed 17-10-12 16:28:38, David Rientjes wrote: > On Tue, 16 Oct 2012, Glauber Costa wrote: [...] > > + > > +static void memcg_kmem_mark_dead(struct mem_cgroup *memcg) > > +{ > > + if (test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_accounted)) > > + set_bit(KMEM_ACCOUNTED_DEAD, &memcg->km

[Devel] Re: [PATCH v5 09/14] memcg: kmem accounting lifecycle management

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 1182188..e24b388 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -344,6 +344,7 @@ struct mem_cgroup { > /* internal only representation about the status of kmem accounting. */ > enum {

[Devel] Re: [PATCH v5 08/14] res_counter: return amount of charges after res_counter_uncharge

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > It is useful to know how many charges are still left after a call to > res_counter_uncharge. While it is possible to issue a res_counter_read > after uncharge, this can be racy. > > If we need, for instance, to take some action when the counters drop >
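The race mentioned in this patch description can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming C11 atomics as a stand-in for the kernel's locking; `struct charge_counter` and both functions are hypothetical names for illustration and are not the res_counter API. The point is only that the remaining value is derived from the same atomic step that performs the uncharge, rather than from a separate read afterwards.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a charge counter (not res_counter). */
struct charge_counter {
    _Atomic uint64_t usage;
};

/* Add val charges to the counter. */
void charge_counter_charge(struct charge_counter *c, uint64_t val)
{
    atomic_fetch_add(&c->usage, val);
}

/*
 * Remove val charges and return how many are left.
 * atomic_fetch_sub() returns the value held before the subtraction, so
 * "before - val" is the remainder as seen by this caller, with no window
 * for another thread to change it between the uncharge and the read.
 * The caller must not uncharge more than it charged (no underflow check).
 */
uint64_t charge_counter_uncharge(struct charge_counter *c, uint64_t val)
{
    uint64_t before = atomic_fetch_sub(&c->usage, val);
    return before - val;
}
```

With this shape, a caller that needs to act when the counter reaches zero can test the return value directly instead of issuing a second, separately timed read.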

[Devel] Re: [PATCH v5 07/14] mm: Allocate kernel pages to the right memcg

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > When a process tries to allocate a page with the __GFP_KMEMCG flag, the > page allocator will call the corresponding memcg functions to validate > the allocation. Tasks in the root memcg can always proceed. > > To avoid adding markers to the page - and

[Devel] Re: [PATCH v5 06/14] memcg: kmem controller infrastructure

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > index 8d9489f..303a456 100644 > --- a/include/linux/memcontrol.h > +++ b/include/linux/memcontrol.h > @@ -21,6 +21,7 @@ > #define _LINUX_MEMCONTROL_H > #include > #include > +#in

[Devel] Re: [PATCH v5 14/14] Add documentation about the kmem controller

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:51 +0400 Glauber Costa wrote: > +Kernel memory won't be accounted at all until limit on a group is set. This > +allows for existing setups to continue working without disruption. The limit > +cannot be set if the cgroup have children, or if there are already tasks in >

[Devel] Re: [PATCH v5 13/14] protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:50 +0400 Glauber Costa wrote: > @@ -146,7 +146,7 @@ void __weak arch_release_thread_info(struct thread_info > *ti) > static struct thread_info *alloc_thread_info_node(struct task_struct *tsk, > int node) > { > - stru

[Devel] Re: [PATCH v5 11/14] memcg: allow a memcg with kmem charges to be destructed.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:48 +0400 Glauber Costa wrote: > Because the ultimate goal of the kmem tracking in memcg is to track slab > pages as well, It is? For a major patchset such as this, it's pretty important to discuss such long-term plans in the top-level discussion. Covering things such

[Devel] Re: [PATCH v5 07/14] mm: Allocate kernel pages to the right memcg

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:44 +0400 Glauber Costa wrote: > When a process tries to allocate a page with the __GFP_KMEMCG flag, the > page allocator will call the corresponding memcg functions to validate > the allocation. Tasks in the root memcg can always proceed. > > To avoid adding markers to

[Devel] Re: [PATCH v5 06/14] memcg: kmem controller infrastructure

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:43 +0400 Glauber Costa wrote: > This patch introduces infrastructure for tracking kernel memory pages to > a given memcg. This will happen whenever the caller includes the flag > __GFP_KMEMCG flag, and the task belong to a memcg other than the root. > > In memcontrol.h

[Devel] Re: [PATCH v5 04/14] kmem accounting basic infrastructure

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:41 +0400 Glauber Costa wrote: > This patch adds the basic infrastructure for the accounting of kernel > memory. To control that, the following files are created: > > * memory.kmem.usage_in_bytes > * memory.kmem.limit_in_bytes > * memory.kmem.failcnt gargh. "failcnt

[Devel] Re: [PATCH v5 01/14] memcg: Make it possible to use the stock for more than one page.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:38 +0400 Glauber Costa wrote: > From: Suleiman Souhlal > > We currently have a percpu stock cache scheme that charges one page at a > time from memcg->res, the user counter. When the kernel memory > controller comes into play, we'll need to charge more than that. > >

[Devel] Re: [PATCH v5 00/14] kmem controller for memcg.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:37 +0400 Glauber Costa wrote: > ... > > A general explanation of what this is all about follows: > > The kernel memory limitation mechanism for memcg concerns itself with > disallowing potentially non-reclaimable allocations to happen in exaggerate > quantities by a par

[Devel] Re: [PATCH v5 05/14] Add a __GFP_KMEMCG flag

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > This flag is used to indicate to the callees that this allocation is a > kernel allocation in process context, and should be accounted to > current's memcg. It takes the numerical place of the recently removed > __GFP_NO_KSWAPD. > > [ v4: make flag u

[Devel] Re: [PATCH v5 04/14] kmem accounting basic infrastructure

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > This patch adds the basic infrastructure for the accounting of kernel > memory. To control that, the following files are created: > > * memory.kmem.usage_in_bytes > * memory.kmem.limit_in_bytes > * memory.kmem.failcnt > * memory.kmem.max_usage_in_by

[Devel] Re: [PATCH v5 03/14] memcg: change defines to an enum

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > This is just a cleanup patch for clarity of expression. In earlier > submissions, people asked it to be in a separate patch, so here it is. > > [ v2: use named enum as type throughout the file as well ] > > Signed-off-by: Glauber Costa > Acked-by: Ka

[Devel] Re: [PATCH v5 02/14] memcg: Reclaim when more than one page needed.

2012-10-17 Thread David Rientjes
On Tue, 16 Oct 2012, Glauber Costa wrote: > From: Suleiman Souhlal > > mem_cgroup_do_charge() was written before kmem accounting, and expects > three cases: being called for 1 page, being called for a stock of 32 > pages, or being called for a hugepage. If we call for 2 or 3 pages (and > both t

[Devel] Re: [PATCH v5] slab: Ignore internal flags in cache creation

2012-10-17 Thread David Rientjes
On Wed, 17 Oct 2012, Glauber Costa wrote: > Some flags are used internally by the allocators for management > purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses > to mark that the metadata for that cache is stored outside of the slab. > > No cache should ever pass those as a

[Devel] Re: [RFC PATCH v2] posix timers: allocate timer id per task

2012-10-17 Thread Stanislav Kinsbursky
17.10.2012 17:57, Eric Dumazet wrote: On Wed, 2012-10-17 at 17:18 +0400, Stanislav Kinsbursky wrote: +static int posix_timer_add(struct k_itimer *timer) +{ + struct signal_struct *sig = current->signal; + int next_free_id = sig->posix_timer_id; + struct hlist_head *head; +

[Devel] Re: [RFC PATCH v2] posix timers: allocate timer id per task

2012-10-17 Thread Stanislav Kinsbursky
17.10.2012 17:44, Eric Dumazet wrote: On Wed, 2012-10-17 at 17:18 +0400, Stanislav Kinsbursky wrote: +static int hash(struct signal_struct *sig, unsigned int nr) +{ + int hash = hash_ptr(sig, POSIX_TIMERS_HASH_BITS); + return hash ^ hash_32(nr, POSIX_TIMERS_HASH_BITS); +} + This i

[Devel] Re: [RFC PATCH v2] posix timers: allocate timer id per task

2012-10-17 Thread Eric Dumazet
On Wed, 2012-10-17 at 17:18 +0400, Stanislav Kinsbursky wrote: > +static int posix_timer_add(struct k_itimer *timer) > +{ > + struct signal_struct *sig = current->signal; > + int next_free_id = sig->posix_timer_id; > + struct hlist_head *head; > + int ret = -ENOENT; > + > + do

[Devel] Re: [RFC PATCH v2] posix timers: allocate timer id per task

2012-10-17 Thread Eric Dumazet
On Wed, 2012-10-17 at 17:18 +0400, Stanislav Kinsbursky wrote: > +static int hash(struct signal_struct *sig, unsigned int nr) > +{ > + int hash = hash_ptr(sig, POSIX_TIMERS_HASH_BITS); > + return hash ^ hash_32(nr, POSIX_TIMERS_HASH_BITS); > +} > + This is quite expensive on 64 bit arches
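To make the quoted function easier to follow, here is a userspace sketch of the same combining step: one hash of the owner pointer, one hash of the timer number, XORed into a bucket index. It assumes plain multiplicative hashing for both helpers. The `sketch_*` names, the multiplier constants, and the table size are illustrative assumptions, not the kernel's hash_ptr() or hash_32() implementations.

```c
#include <stdint.h>

#define TIMER_HASH_BITS 9 /* hypothetical table size: 2^9 buckets */

/* Multiplicative hash of a 32-bit value, keeping the top `bits` bits. */
uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
    return (val * UINT32_C(0x9e370001)) >> (32 - bits);
}

/* Multiplicative hash of a 64-bit value, keeping the top `bits` bits. */
uint32_t sketch_hash_64(uint64_t val, unsigned int bits)
{
    return (uint32_t)((val * UINT64_C(0x9e37fffffffc0001)) >> (64 - bits));
}

/*
 * Combine the owner pointer and the per-owner timer number into one
 * bucket index, mirroring the shape of the quoted hash(): two separate
 * hashes, XORed together. Both inputs are already reduced to
 * TIMER_HASH_BITS bits, so the XOR stays within the table.
 */
uint32_t sketch_timer_hash(const void *owner, uint32_t nr)
{
    uint32_t h = sketch_hash_64((uint64_t)(uintptr_t)owner, TIMER_HASH_BITS);
    return h ^ sketch_hash_32(nr, TIMER_HASH_BITS);
}
```

Written this way, each lookup performs two multiplications, one of them 64 bits wide, which is the per-lookup work that the cost remark in the reply is about.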

[Devel] [RFC PATCH v2] posix timers: allocate timer id per task

2012-10-17 Thread Stanislav Kinsbursky
v2: 1) Hash table became RCU-friendly. Hash table search is now done under RCU lock protection. I've tested scalability on KVM with 4 CPUs. The testing environment consisted of 10 processes, each of which had 512 posix timers running (SIGEV_NONE) and was calling timer_gettime() in a loop. With all this stuff bei

[Devel] [PATCH v5] slab: Ignore internal flags in cache creation

2012-10-17 Thread Glauber Costa
Some flags are used internally by the allocators for management purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses to mark that the metadata for that cache is stored outside of the slab. No cache should ever pass those as a creation flags. We can just ignore this bit if it hap
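The idea of ignoring internal bits at cache creation can be sketched as a simple mask. This is a hedged illustration only: the flag names, their values, and `sketch_sanitize_create_flags()` are hypothetical and do not reproduce the slab allocator's real definitions or the code in this patch.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_FLAG_HWALIGN  UINT32_C(0x00002000) /* caller-visible flag */
#define SKETCH_FLAG_OFF_SLAB UINT32_C(0x80000000) /* allocator-internal flag */

/* Every flag a caller is allowed to pass at cache creation time. */
#define SKETCH_CREATE_MASK (SKETCH_FLAG_HWALIGN)

/*
 * Drop any bit outside the caller-visible set, so an internal flag passed
 * by mistake is silently ignored instead of changing allocator behavior.
 */
uint32_t sketch_sanitize_create_flags(uint32_t flags)
{
    return flags & SKETCH_CREATE_MASK;
}
```

Masking rather than rejecting keeps callers that pass a stray internal bit working, which matches the "just ignore this bit" wording of the description.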

[Devel] Re: [PATCH v4 14/14] Add documentation about the kmem controller

2012-10-17 Thread Kamezawa Hiroyuki
(2012/10/08 19:06), Glauber Costa wrote: > Signed-off-by: Glauber Costa > --- > Documentation/cgroups/memory.txt | 55 > +++- > 1 file changed, 54 insertions(+), 1 deletion(-) > Acked-by: KAMEZAWA Hiroyuki

[Devel] Re: [PATCH v5 12/14] execute the whole memcg freeing in free_worker

2012-10-17 Thread Kamezawa Hiroyuki
(2012/10/16 19:16), Glauber Costa wrote: > A lot of the initialization we do in mem_cgroup_create() is done with > softirqs enabled. This includes grabbing a css id, which holds > &ss->id_lock->rlock, and the per-zone trees, which holds > rtpz->lock->rlock. All of those signal to the lockdep mechani