Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread Michal Hocko
On Mon 26-11-12 12:46:22, Johannes Weiner wrote: > On Mon, Nov 26, 2012 at 02:18:37PM +0100, Michal Hocko wrote: > > [CCing also Johannes - the thread started here: > > https://lkml.org/lkml/2012/11/21/497] > > > > On Mon 26-11-12 01:38:55, azurIt wrote: > > >

rework mem_cgroup iterator

2012-11-26 Thread Michal Hocko
hildren groups. This triggers both children only and hierarchical reclaims. The shortlog says: Michal Hocko (6): memcg: synchronize per-zone iterator access by a spinlock memcg: keep prev's css alive for the whole mem_cgroup_iter memcg: rework mem_cgroup_iter to use

[patch v2 2/6] memcg: keep prev's css alive for the whole mem_cgroup_iter

2012-11-26 Thread Michal Hocko
right after it gets the last css_id. This is correct because neither prev's memcg nor cgroup are accessed after that. This will change in the next patch so we need to hold the group alive a bit longer so let's move the css_put to the end of the function. Signed-off-by: Michal Hock

[patch v2 1/6] memcg: synchronize per-zone iterator access by a spinlock

2012-11-26 Thread Michal Hocko
will be replaced by cgroup generic iteration which requires storing mem_cgroup pointer into iterator and that requires reference counting and so concurrent access will be a problem. Signed-off-by: Michal Hocko Acked-by: KAMEZAWA Hiroyuki --- mm/memcontrol.c | 12 +++- 1 file changed, 11 i

[patch v2 6/6] cgroup: remove css_get_next

2012-11-26 Thread Michal Hocko
Now that we have generic and well ordered cgroup tree walkers there is no need to keep css_get_next in the place. Signed-off-by: Michal Hocko --- include/linux/cgroup.h |7 --- kernel/cgroup.c| 49 2 files changed, 56 deletions

[patch v2 5/6] memcg: further simplify mem_cgroup_iter

2012-11-26 Thread Michal Hocko
(to __mem_cgroup_iter_next) so the distinction is clearer. This patch doesn't introduce any functional changes. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 79 --- 1 file changed, 46 insertions(+), 33 deletions(-) diff --git

[patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-11-26 Thread Michal Hocko
et,put} for iter->last_visited rather than mem_cgroup_{get,put} because it is stronger wrt. cgroup life cycle - cgroup_next_descendant_pre expects NULL pos for the first iteration otherwise it might loop endlessly for intermediate node without any children. Signed-off-by: Michal Hocko --- mm/

[patch v2 4/6] memcg: simplify mem_cgroup_iter

2012-11-26 Thread Michal Hocko
a simple invariant that memcg is always alive when non-NULL and all nodes have been visited otherwise. We could get rid of the surrounding while loop but keep it in for now to make review easier. It will go away in the following patch. Signed-off-by: Michal Hocko --- mm/memcontrol.

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread Michal Hocko
On Mon 26-11-12 13:24:21, Johannes Weiner wrote: > On Mon, Nov 26, 2012 at 07:04:44PM +0100, Michal Hocko wrote: > > On Mon 26-11-12 12:46:22, Johannes Weiner wrote: [...] > > > I think global oom already handles this in a much better way: invoke > > > the OOM kille

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread Michal Hocko
On Mon 26-11-12 14:29:41, Johannes Weiner wrote: > On Mon, Nov 26, 2012 at 08:03:29PM +0100, Michal Hocko wrote: > > On Mon 26-11-12 13:24:21, Johannes Weiner wrote: > > > On Mon, Nov 26, 2012 at 07:04:44PM +0100, Michal Hocko wrote: > > > > On Mon 26-11-12 1

Re: [RFC v3 0/3] vmpressure_fd: Linux VM pressure notifications

2012-11-26 Thread Michal Hocko
I am not entirely sure what the first one is good for (to be honest), but I believe there are users out there. I do not think that mixing those two makes much sense. They have different usecases and until we have users for the thresholds one we should keep it. [...] Thanks -- Michal Hocko SUSE Lab

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread Michal Hocko
On Mon 26-11-12 15:19:18, Johannes Weiner wrote: > On Mon, Nov 26, 2012 at 09:08:48PM +0100, Michal Hocko wrote: [...] > > OK, I guess I am getting what you are trying to say. So what you are > > suggesting is to just let mem_cgroup_out_of_memory send the signal and > > move

Re: mm/vmemmap: fix wrong use of virt_to_page

2012-11-27 Thread Michal Hocko
9700() GS:8804570c() > knlGS:0 > 000 > [ 517.740170] CS: 0010 DS: ES: CR0: 8005003b > [ 517.740170] CR2: 7f006dc5fd14 CR3: 000440e85000 CR4: > 07e0 > > [ 517.740170] DR0: 0000 DR1: 000

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-27 Thread Michal Hocko
be done by small fixes to memory.c. I wouldn't call it simple but it is doable. -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.ker

[PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-27 Thread Michal Hocko
(page, current->mm, > > > gfp_mask & GFP_RECLAIM_MASK); > > > if (error) > > > goto out; > > Shmem does not use this function but also charges under the i_mutex in > the write path and fallocate at least. Right y

Re: [PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-27 Thread Michal Hocko
Sorry, forgot about one shmem charge: --- From 7ae29927d24471c1b1a6ceb021219c592c1ef518 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Tue, 27 Nov 2012 21:53:13 +0100 Subject: [PATCH] memcg: do not trigger OOM from add_to_page_cache_locked memcg oom killer might deadlock if the proc

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-11-28 Thread Michal Hocko
On Wed 28-11-12 17:47:59, KAMEZAWA Hiroyuki wrote: > (2012/11/27 3:47), Michal Hocko wrote: [...] > > + /* > > +* Even if we found a group we have to make sure it is alive. > > +* css && !memcg means that the groups should be skip

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-11-28 Thread Michal Hocko
On Wed 28-11-12 13:23:57, Glauber Costa wrote: > On 11/28/2012 01:17 PM, Michal Hocko wrote: > > On Wed 28-11-12 17:47:59, KAMEZAWA Hiroyuki wrote: > >> (2012/11/27 3:47), Michal Hocko wrote: > > [...] > >>> + /* > >>> + * Even if w

Re: [PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-28 Thread Michal Hocko
On Wed 28-11-12 10:26:31, Johannes Weiner wrote: > On Tue, Nov 27, 2012 at 09:59:44PM +0100, Michal Hocko wrote: > > @@ -3863,7 +3862,7 @@ int mem_cgroup_cache_charge(struct page *page, struct > > mm_struct *mm, > > return 0; > > > > if (!PageSwa

Re: [RFC] Add mempressure cgroup

2012-11-28 Thread Michal Hocko
as I mentioned above (so it at least wouldn't work with co-mounted cases). > /* reclaim/compaction might need reclaim to continue */ > if (should_continue_reclaim(lruvec, nr_reclaimed, > sc->nr_scanned - nr_scanned, sc)) > @@ -2099,6

Re: [PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-28 Thread Michal Hocko
On Wed 28-11-12 11:37:36, Johannes Weiner wrote: > On Wed, Nov 28, 2012 at 05:04:47PM +0100, Michal Hocko wrote: > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > > index 095d2b4..5abe441 100644 > > --- a/include/linux/memcontrol.h > > +++ b

Re: [PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-28 Thread Michal Hocko
On Wed 28-11-12 17:46:40, Michal Hocko wrote: > On Wed 28-11-12 11:37:36, Johannes Weiner wrote: > > On Wed, Nov 28, 2012 at 05:04:47PM +0100, Michal Hocko wrote: > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > > > index 095d2b4..5abe441

[PATCH] memcg: do not check for mm in mem_cgroup_count_vm_event disabled

2012-11-29 Thread Michal Hocko
On Wed 28-11-12 15:29:30, Hugh Dickins wrote: > On Wed, 21 Nov 2012, Michal Hocko wrote: > > On Tue 20-11-12 13:49:32, Andrew Morton wrote: > > > On Mon, 19 Nov 2012 17:44:34 -0800 (PST) > > > David Rientjes wrote: [...] > > > > -void mem_cgroup_co

Re: [PATCH -v2 -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-29 Thread Michal Hocko
ny other possibilities to solve this issue? Or do you think we should ignore the problem just because nobody complained for such a long time? Dunno, I think we should fix this with something less risky for now and come up with a real fix after it sees sufficient testing. > I wonder why this issue has h

Re: [patch v2 6/6] cgroup: remove css_get_next

2012-11-30 Thread Michal Hocko
On Fri 30-11-12 13:12:29, KAMEZAWA Hiroyuki wrote: > (2012/11/27 3:47), Michal Hocko wrote: > > Now that we have generic and well ordered cgroup tree walkers there is > > no need to keep css_get_next in the place. > > > > Signed-off-by: Michal Hocko > >

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-30 Thread Michal Hocko
ges if we race with one of the above hierarchy operations. Swappiness and oom control are not a big deal. Same applies to migration policy change. Those could be solved by using the same memcg lock in the attach hook. Hierarchy policy change would be a bigger issue because the task is already linke

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-30 Thread Michal Hocko
care of changes in hierarchy, > > Having a new memcg's mutex in ->create() may be a way. > > > > Ah, hm, Costa is mentioning task-attach. is the task-attach problem in > > memcg ? > > > > We disallow the kmem limit to be set if a task already exists in the > cgroup. So we can't allow a new task to attach if we are setting the limit. This is racy without additional locking, isn't it? -- Michal Hocko SUSE Labs
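The race raised in this message (checking "no tasks in the group" and then setting the limit, while an attach may slip in between) can be modelled in userspace. The following is only a toy sketch, not memcg code: struct group_demo, group_set_limit() and group_attach_task() are hypothetical names, and a pthread mutex stands in for whatever lock the kernel side would use.

```c
#include <errno.h>
#include <pthread.h>

/* Toy userspace model of a group; all names here are hypothetical. */
struct group_demo {
    pthread_mutex_t lock;   /* serializes attach against limit changes */
    int nr_tasks;           /* number of tasks attached so far */
    long limit;             /* 0 means no limit has been set */
};

#define GROUP_DEMO_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0 }

/* Set the limit only while the group is empty; return -EBUSY otherwise. */
int group_set_limit(struct group_demo *g, long limit)
{
    int ret = 0;

    pthread_mutex_lock(&g->lock);
    if (g->nr_tasks > 0)
        ret = -EBUSY;       /* the check and the update share one lock */
    else
        g->limit = limit;
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* Attach takes the same lock, so it cannot run between check and update. */
void group_attach_task(struct group_demo *g)
{
    pthread_mutex_lock(&g->lock);
    g->nr_tasks++;
    pthread_mutex_unlock(&g->lock);
}
```

Without the shared lock, an attach could complete after the emptiness check but before the limit is stored, which is the window the question above points at.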

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-11-30 Thread Michal Hocko
is. I could cherry-pick the series after it is settled. I have no idea how many conflicts this would bring, though. -- Michal Hocko SUSE Labs

Re: [patch v2 5/6] memcg: further simplify mem_cgroup_iter

2012-11-30 Thread Michal Hocko
On Fri 30-11-12 13:08:35, Glauber Costa wrote: > On 11/26/2012 10:47 PM, Michal Hocko wrote: > > The code would be much easier to follow if we move the iteration > > outside of the function (to __mem_cgroup_iter_next) so the distinction > > is clearer. > tot

Re: [PATCH 2/2] Drivers: hv: balloon: Support 2M page allocations for ballooning

2013-03-18 Thread Michal Hocko
granularity (when the host returns the memory). Maybe I am missing something but what is the advantage of 2M allocation when you split it up immediately so you are not using it as a huge page? [...] -- Michal Hocko SUSE Labs

Re: [PATCH 2/2] Drivers: hv: balloon: Support 2M page allocations for ballooning

2013-03-18 Thread Michal Hocko
On Mon 18-03-13 11:52:57, Michal Hocko wrote: > On Sat 16-03-13 14:42:05, K. Y. Srinivasan wrote: > > While ballooning memory out of the guest, attempt 2M allocations first. > > If 2M allocations fail, then go for 4K allocations. In cases where we > > have performed 2M alloc
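The "try 2M first, fall back to 4K" policy described in the quoted changelog can be sketched in userspace. This is only an illustration under stated assumptions, not the Hyper-V balloon driver: alloc_fn, alloc_2m_then_4k() and alloc_small_only() are hypothetical names, and malloc() stands in for the kernel page allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define SZ_4K (4UL * 1024)
#define SZ_2M (2UL * 1024 * 1024)

/* Allocator hook so the fallback path can be exercised; hypothetical. */
typedef void *(*alloc_fn)(size_t size);

/* Stub allocator that refuses anything above 4K, to mimic fragmentation. */
void *alloc_small_only(size_t size)
{
    return size > SZ_4K ? NULL : malloc(size);
}

/*
 * Try one 2M chunk first; if that fails, fall back to one 4K chunk.
 * On return, *got holds the size actually obtained (0 on total failure).
 */
void *alloc_2m_then_4k(alloc_fn alloc, size_t *got)
{
    void *p = alloc(SZ_2M);

    if (p) {
        *got = SZ_2M;
        return p;
    }

    p = alloc(SZ_4K);
    *got = p ? SZ_4K : 0;
    return p;
}
```

Passing malloc as the hook exercises the 2M path, while alloc_small_only forces the 4K fallback.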

Re: [PATCH 1/2] mm: Export split_page()

2013-03-18 Thread Michal Hocko
ny objections to exporting the symbol (at least we prevent drivers code from inventing their own split_page) but the Hyper-V specific description should go into Hyper-V patch IMO. So for the export with a short note that the symbol will be used by Hyper-V Acked-by: Michal Hocko > --- > mm/page

Re: [PATCH 2/2] Drivers: hv: balloon: Support 2M page allocations for ballooning

2013-03-18 Thread Michal Hocko
On Mon 18-03-13 13:44:05, KY Srinivasan wrote: > > > > -Original Message- > > From: Michal Hocko [mailto:mho...@suse.cz] > > Sent: Monday, March 18, 2013 6:53 AM > > To: KY Srinivasan > > Cc: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org;

Re: [PATCH 1/9] migrate: add migrate_entry_wait_huge()

2013-03-18 Thread Michal Hocko
ne to be HugePage aware instead? All it takes is just opencoding pte_offset_map_lock and calling huge_ptep_get for HugePage and pte_offset_map otherwise. -- Michal Hocko SUSE Labs

Re: [PATCH 2/9] migrate: make core migration code aware of hugepage

2013-03-18 Thread Michal Hocko
VM_PFNMAP)) > return 0; Is this safe? At least check_*_range don't seem to be hugetlb aware. -- Michal Hocko SUSE Labs

Re: [PATCH 2/9] migrate: make core migration code aware of hugepage

2013-03-18 Thread Michal Hocko
On Mon 18-03-13 16:22:24, Michal Hocko wrote: > On Thu 21-02-13 14:41:41, Naoya Horiguchi wrote: > [...] > > diff --git v3.8.orig/include/linux/mempolicy.h > > v3.8/include/linux/mempolicy.h > > index 0d7df39..2e475b5 100644 > > --- v3.8.orig/include/linux/mempolic

Re: [PATCH 5/9] migrate: enable migrate_pages() to migrate hugepage

2013-03-18 Thread Michal Hocko
inline int check_pmd_range(struct vm_area_struct > *vma, pud_t *pud, > pmd = pmd_offset(pud, addr); > do { > next = pmd_addr_end(addr, end); > + if (pmd_huge(*pmd) && is_vm_hugetlb_page(vma)) { Why an explicit check for is_vm_hugetlb_pag

Re: [PATCH 9/9] remove /proc/sys/vm/hugepages_treat_as_movable

2013-03-18 Thread Michal Hocko
size_t *length, loff_t *ppos) > -{ > - proc_dointvec(table, write, buffer, length, ppos); > - if (hugepages_treat_as_movable) > - htlb_alloc_mask = GFP_HIGHUSER_MOVABLE; > - else > - htlb_alloc_mask = GFP_HIGHUSER; > - r

Re: [PATCH 8/9] memory-hotplug: enable memory hotplug to handle hugepage

2013-03-18 Thread Michal Hocko
head = compound_head(page); > + pfn = page_to_pfn(head) + > + (1 << compound_order(head)) - 1; > + put_page(page); > + continue; > + } > +
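The quoted hunk moves pfn to the last page frame covered by a huge page, presumably so that the enclosing scan resumes right after it. A minimal userspace sketch of just that arithmetic follows, with plain integers standing in for page_to_pfn() and compound_order(); the helper name last_pfn_of_compound() is hypothetical.

```c
/* Userspace sketch only: plain integers stand in for kernel page structs. */

/*
 * Return the last pfn covered by a compound page whose head page sits at
 * head_pfn and which spans 2^order base pages.
 */
unsigned long last_pfn_of_compound(unsigned long head_pfn, unsigned int order)
{
    return head_pfn + (1UL << order) - 1;
}
```

For a 2M huge page on a 4K base page size the order is 9, so a head at pfn 512 covers pfns 512 through 1023.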

Re: [PATCH 2/2] Drivers: hv: balloon: Support 2M page allocations for ballooning

2013-03-18 Thread Michal Hocko
tree would be less confusing /me thinks. -- Michal Hocko SUSE Labs

Re: [PATCH 5/9] migrate: enable migrate_pages() to migrate hugepage

2013-03-19 Thread Michal Hocko
On Mon 18-03-13 20:07:16, Naoya Horiguchi wrote: > On Mon, Mar 18, 2013 at 04:40:57PM +0100, Michal Hocko wrote: > > On Thu 21-02-13 14:41:44, Naoya Horiguchi wrote: [...] > > > @@ -3202,3 +3202,13 @@ void putback_active_hugepages(struct list_head *l) > > > list_for_

Re: [PATCH 03/10] mm: vmscan: Flatten kswapd priority loop

2013-03-19 Thread Michal Hocko
> >the scanning priority higher unless it is failing to reclaim any pages. > > > >To avoid infinite looping for high-order allocation requests kswapd will > >not reclaim for high-order allocations when it has reclaimed at least > >twice the number of pages as

[PATCH] memcg: do not check for do_swap_account in mem_cgroup_{read,write,reset}

2013-03-19 Thread Michal Hocko
since 2d11085e (memcg: do not create memsw files if swap accounting is disabled) memsw files are created only if memcg swap accounting is enabled so it doesn't make any sense to check for it explicitly in mem_cgroup_read, mem_cgroup_write and mem_cgroup_reset. Signed-off-by: Michal

Re: [PATCH V2 1/3] mm: Export split_page()

2013-03-19 Thread Michal Hocko
On Mon 18-03-13 13:51:36, K. Y. Srinivasan wrote: > This symbol would be used in the Hyper-V balloon driver to support 2M > allocations. > > In this version of the patch, based on feedback from Michal Hocko > , I have updated the patch description. I guess this part i

Re: [PATCH V2 2/3] Drivers: hv: balloon: Support 2M page allocations for ballooning

2013-03-19 Thread Michal Hocko
freed as 4K pages. How many pages are requested usually? > If 2M allocations fail, we revert to 4K allocations. > > In this version of the patch, based on the feedback from Michal Hocko > , I have added some additional commentary to the patch > description. > > Signed-off

Re: [bugfix] mm: zone_end_pfn is too small

2013-03-19 Thread Michal Hocko
<0f> 0b eb fe 0f > 0b 0f 1f 84 00 00 00 00 00 eb f6 0f 0b eb fe 49 > RIP [] free_one_page+0x382/0x430 >RSP > ---[ end trace a7919e7f17c0a725 ]--- > Kernel panic - not syncing: Attempted to kill the idle task! > > Signed-off-by: Russ Anderson > Repor

Re: [PATCH 5/9] migrate: enable migrate_pages() to migrate hugepage

2013-03-20 Thread Michal Hocko
On Wed 20-03-13 02:12:54, Naoya Horiguchi wrote: > On Tue, Mar 19, 2013 at 08:11:13AM +0100, Michal Hocko wrote: > > On Mon 18-03-13 20:07:16, Naoya Horiguchi wrote: > > > On Mon, Mar 18, 2013 at 04:40:57PM +0100, Michal Hocko wrote: > > > > On Thu 21-02-13 1

Re: [PATCH 8/9] memory-hotplug: enable memory hotplug to handle hugepage

2013-03-20 Thread Michal Hocko
On Tue 19-03-13 23:55:33, Naoya Horiguchi wrote: > On Mon, Mar 18, 2013 at 05:07:37PM +0100, Michal Hocko wrote: > > On Thu 21-02-13 14:41:47, Naoya Horiguchi wrote: [...] > > > As for larger hugepages (1GB for x86_64), it's not easy to do hotremove > > > over them

Re: [patch] mm, hugetlb: include hugepages in meminfo

2013-03-20 Thread Michal Hocko
by for_each_hstate to do that. With that applied, feel free to add my Acked-by: Michal Hocko Thanks > Booting with hugepages=8192 on the command line, this memory is now shown > in oom conditions. For example, with echo m > /proc/sysrq-trigger: > > Node 0 hugepages_total=2

Re: [PATCH 01/10] mm: vmscan: Limit the number of pages kswapd reclaims at each priority

2013-03-20 Thread Michal Hocko
one_reclaimable(zone)) > + zone->all_unreclaimable = 1; > +} > + > +/* > * For kswapd, balance_pgdat() will work across all this node's zones until > * they are all at high_wmark_pages(zone). > * -- Michal Hocko SUSE Labs

Re: [PATCH] mm: page_alloc: Avoid marking zones full prematurely after zone_reclaim()

2013-03-20 Thread Michal Hocko
sing the zone to be > marked full prematurely. > > This patch will only mark the zone full after zone_reclaim if it the min > watermarks are checked or if page reclaim failed to make sufficient > progress. > > Reported-and-tested-by: Hedi Berriche > Signed-off-by: Mel G

Re: [patch] mm, hugetlb: include hugepages in meminfo

2013-03-20 Thread Michal Hocko
On Wed 20-03-13 11:46:12, David Rientjes wrote: > On Wed, 20 Mar 2013, Michal Hocko wrote: > > > On Tue 19-03-13 17:18:12, David Rientjes wrote: > > > Particularly in oom conditions, it's troublesome that hugetlb memory is > > > not displayed. All other memi

Re: [patch] mm, hugetlb: include hugepages in meminfo

2013-03-20 Thread Michal Hocko
On Wed 20-03-13 11:58:20, David Rientjes wrote: > On Wed, 20 Mar 2013, Michal Hocko wrote: > > > > I didn't do this because it isn't already exported in /proc/meminfo and > > > since we've made an effort to reduce the amount of information emitted by &

Re: [patch v2] mm, hugetlb: include hugepages in meminfo

2013-03-20 Thread Michal Hocko
=2048 hugepages_surp=0 > hugepages_size=2048kB > Node 3 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 > hugepages_size=2048kB > > Acked-by: Michal Hocko > Signed-off-by: David Rientjes Thank you! > --- > include/linux/hugetlb.h | 4 > mm/hugetlb.c

Re: [PATCH] mm: page_alloc: Avoid marking zones full prematurely after zone_reclaim()

2013-03-21 Thread Michal Hocko
e not modified so they are clean. Output file is /dev/null so no pages are written. dd doesn't call fadvise(POSIX_FADV_DONTNEED) on the input file by default so pages from the file stay in the page cache -- Michal Hocko SUSE Labs
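For comparison, a reader that does not want its input to linger in the page cache can issue the hint itself. Below is a minimal sketch, assuming a POSIX system that provides posix_fadvise(); the function name read_and_drop_cache() is hypothetical and this is not what dd does by default.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Read the whole file at path and discard the data (similar in spirit to
 * dd with of=/dev/null), then hint that its cached pages are not needed.
 * Returns the number of bytes read, or -1 if open() or read() failed.
 */
long read_and_drop_cache(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;

    if (n < 0) {
        close(fd);
        return -1;
    }

    /*
     * offset 0 with len 0 covers the whole file. The call is advisory,
     * so its return value (an error number, not -1) is ignored here.
     */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return total;
}
```

Because the pages were only read they are clean, so the kernel is free to drop them on this hint without any writeback.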

Re: [PATCH] mm: page_alloc: Avoid marking zones full prematurely after zone_reclaim()

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 16:32:03, Simon Jeons wrote: > Hi Michal, > On 03/21/2013 04:19 PM, Michal Hocko wrote: > >On Thu 21-03-13 10:33:07, Simon Jeons wrote: > >>Hi Mel, > >>On 03/21/2013 02:19 AM, Mel Gorman wrote: > >>>The following problem was r

Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

2013-03-21 Thread Michal Hocko
-- Michal Hocko SUSE Labs

Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 10:08:49, Michal Hocko wrote: > On Thu 21-03-13 09:22:21, Li Zefan wrote: > > As cgroup supports rename, it's unsafe to dereference dentry->d_name > > without proper vfs locks. Fix this by using cgroup_name(). > > > > Signed-off-by: Li Zefan >

Re: [patch] mm: speedup in __early_pfn_to_nid

2013-03-21 Thread Michal Hocko
more > apparent that they are not on-stack. I only noticed it in the second pass. Wouldn't this just add more confusion with other _pfn variables? (e.g. {min,max}_low_pfn and others) IMO the local scope is more obvious as this is and should only be used for caching purposes. -- Michal Ho

Re: [RFC][PATCH 0/9] extend hugepage migration

2013-03-21 Thread Michal Hocko
_ERR "hugepagesz: Unsupported page size %lu M\n", > ps >> 20); > > I set boot=hugepagesz=1G hugepages=10, then I got 10 32MB huge pages. > What's the difference between these pages which I hacking and normal > huge pages? How is this related to the patch set? Please

Re: [PATCH 01/10] mm: vmscan: Limit the number of pages kswapd reclaims at each priority

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 09:47:13, Mel Gorman wrote: > On Wed, Mar 20, 2013 at 05:18:47PM +0100, Michal Hocko wrote: > > On Sun 17-03-13 13:04:07, Mel Gorman wrote: > > [...] > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > > index 88c5fed..4835a7a 100644 > > >

Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning requirements for kswapd

2013-03-21 Thread Michal Hocko
> - * However, if the VM has a harder time of freeing pages, > - * with multiple processes reclaiming pages, the total > - * freeing target can get unreasonably large. > - */ > - if (nr_reclaimed >= nr_to_reclaim && > -

Re: [PATCH 03/10] mm: vmscan: Flatten kswapd priority loop

2013-03-21 Thread Michal Hocko
r_reclaimed - nr_reclaimed == 0) is redundant because you already set raise_priority above in that case. > + sc.priority--; > + } while (sc.priority >= 0 && > + !pgdat_balanced(pgdat, order, *classzone_idx)); > > /* > * If kswap

Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning requirements for kswapd

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 14:31:15, Mel Gorman wrote: > On Thu, Mar 21, 2013 at 03:01:54PM +0100, Michal Hocko wrote: > > On Sun 17-03-13 13:04:08, Mel Gorman wrote: > > > Simplistically, the anon and file LRU lists are scanned proportionally > > > depending on the value of vm.sw

Re: [PATCH 04/10] mm: vmscan: Decide whether to compact the pgdat based on reclaim progress

2013-03-21 Thread Michal Hocko
struct zone *zone = pgdat->node_zones + i; > - > - if (!populated_zone(zone)) > - continue; > - > - /* Check if the memory needs to be defragmented. */ > - if (zone_watermark_ok(zone, order, > -

Re: [PATCH 03/10] mm: vmscan: Flatten kswapd priority loop

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 15:26:02, Mel Gorman wrote: > On Thu, Mar 21, 2013 at 03:54:58PM +0100, Michal Hocko wrote: > > > Signed-off-by: Mel Gorman > > > --- > > > mm/vmscan.c | 86 > > > ++--- > > >

Re: [PATCH 05/10] mm: vmscan: Do not allow kswapd to scan at maximum priority

2013-03-21 Thread Michal Hocko
> of an OOM situation. OK, it should work. raise_priority should prevent pointless lowering of the priority and if there is really nothing to reclaim then relying on the direct reclaim is probably a better idea. > Signed-off-by: Mel Gorman Reviewed-by: Michal Hocko > --- >

Re: [PATCH 04/10] mm: vmscan: Decide whether to compact the pgdat based on reclaim progress

2013-03-21 Thread Michal Hocko
On Thu 21-03-13 15:47:31, Mel Gorman wrote: > On Thu, Mar 21, 2013 at 04:32:31PM +0100, Michal Hocko wrote: > > On Sun 17-03-13 13:04:10, Mel Gorman wrote: > > > In the past, kswapd makes a decision on whether to compact memory after > > > the > > > pgdat w

Re: [PATCH 07/10 -v2r1] mm: vmscan: Block kswapd if it is encountering pages under writeback

2013-03-21 Thread Michal Hocko
e > and block waiting for some IO to complete. > > Signed-off-by: Mel Gorman Looks reasonable to me. Reviewed-by: Michal Hocko > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index afedd1d..dd0d266 100644 > --- a/include/linux/mmzone.h > +++

Re: [PATCH] memcg: stop warning on memcg_propagate_kmem

2013-02-03 Thread Michal Hocko
ot > used [-Wunused-function]" seen in 3.8-rc: move the #ifdef outwards. > > Signed-off-by: Hugh Dickins Acked-by: Michal Hocko Hmm, if you are not too tired then moving the function downwards to where it is called (memcg_init_kmem) will reduce the number of ifdefs. But this can wa

Re: [PATCH] memcg: stop warning on memcg_propagate_kmem

2013-02-04 Thread Michal Hocko
On Mon 04-02-13 12:04:06, Glauber Costa wrote: > On 02/04/2013 11:57 AM, Michal Hocko wrote: > > On Sun 03-02-13 20:29:01, Hugh Dickins wrote: > >> Whilst I run the risk of a flogging for disloyalty to the Lord of Sealand, > >> I do have CONFIG_MEMCG=y CONFIG_MEMCG_KMEM

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread Michal Hocko
On Fri 25-01-13 17:31:30, Michal Hocko wrote: > On Fri 25-01-13 16:07:23, azurIt wrote: > > Any news? Thnx! > > Sorry, but I didn't get to this one yet. Sorry, to get back to this that late but I was busy as hell since the beginning of the year. Has the issue repeated since

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread Michal Hocko
: --- From f2bf8437d5b9bb38a95a432bf39f32c584955171 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Mon, 26 Nov 2012 11:47:57 +0100 Subject: [PATCH] memcg: do not trigger OOM from add_to_page_cache_locked memcg oom killer might deadlock if the process which falls down to mem_cgroup_handle_oom holds a lock

[PATCH 1/3] memcg: move mem_cgroup_soft_limit_tree_init to mem_cgroup_init

2013-02-05 Thread Michal Hocko
are at it let's make mem_cgroup_soft_limit_tree_init void because it doesn't make much sense to report memory failure because if we fail to allocate memory that early during the boot then we are screwed anyway (this saves some code). Signed-off-by: Michal Hocko --- mm/memcontro

[PATCH 3/3] memcg: cleanup mem_cgroup_init comment

2013-02-05 Thread Michal Hocko
We should encourage all memcg controller initialization independent of a specific mem_cgroup to be done here rather than exploit css_alloc callback and assume that nothing happens before root cgroup is created. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 10 ++ 1 file changed, 6

[PATCH 0/3] cleanup memcg controller initialization

2013-02-05 Thread Michal Hocko
+ 1 file changed, 21 insertions(+), 28 deletions(-) Shortlog says: Michal Hocko (3): memcg: move mem_cgroup_soft_limit_tree_init to mem_cgroup_init memcg: move memcg_stock initialization to mem_cgroup_init memcg: cleanup mem_cgroup_init

[PATCH 2/3] memcg: move memcg_stock initialization to mem_cgroup_init

2013-02-05 Thread Michal Hocko
current memcg_stock initialization code into a helper and calls it from the controller subsystem initialization code. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread Michal Hocko
cg being stuck. Certain access to some files might be an issue because those could have locks held but I do not see other relations. I would start by checking the HW, trying to focus on reducing elements that could contribute - aka try to nail down to the minimum set which reproduces the issue. I

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread Michal Hocko
On Tue 05-02-13 08:48:23, Greg Thelen wrote: > On Tue, Feb 05 2013, Michal Hocko wrote: > > > On Tue 05-02-13 15:49:47, azurIt wrote: > > [...] > >> Just to be sure - am i supposed to apply this two patches? > >> http://watchdog.sk/lkml/patches/ > > &

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread Michal Hocko
On Tue 05-02-13 10:09:57, Greg Thelen wrote: > On Tue, Feb 05 2013, Michal Hocko wrote: > > > On Tue 05-02-13 08:48:23, Greg Thelen wrote: > >> On Tue, Feb 05 2013, Michal Hocko wrote: > >> > >> > On Tue 05-02-13 15:49:47, azurIt wrote: > >>

[PATCH] da9030_battery: include notifier.h

2013-02-06 Thread Michal Hocko
randconfig complains about: drivers/power/da9030_battery.c:113: error: field ‘nb’ has incomplete type because there is no direct include for notifier.h which defines struct notifier_block. Signed-off-by: Michal Hocko --- drivers/power/da9030_battery.c |1 + 1 file changed, 1 insertion
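The "incomplete type" error reported here is plain C behaviour: a struct member stored by value needs the full definition of its type, while a pointer member only needs a declaration. A minimal sketch with stand-in types follows; struct notifier_block_demo, struct forward_only and struct battery_demo are hypothetical and do not mirror the real kernel layouts.

```c
#include <stddef.h>

/*
 * Stand-in for the definition that a header such as notifier.h would
 * provide. The fields are hypothetical, not the kernel's layout.
 */
struct notifier_block_demo {
    int (*call)(void *data);
    int priority;
};

/* Declared but never defined in this file: an incomplete type. */
struct forward_only;

struct battery_demo {
    struct forward_only *ptr_ok;    /* a pointer to an incomplete type is fine */
    struct notifier_block_demo nb;  /* a by-value member needs the definition */
};
```

Removing the definition of struct notifier_block_demo while keeping the nb member would make a C compiler reject the field as having incomplete type, which is the situation a missing direct include produces.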

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-06 Thread Michal Hocko
same we do for NO_IO in the current -mm tree. The latter one seems easier wrt. gfp_mask passing horror - e.g. __generic_file_aio_write doesn't pass flags and it can be called from unlocked contexts as well. I have to think about it some more. -- Michal Hocko SUSE Labs

Re: [PATCH] da9030_battery: include notifier.h

2013-02-06 Thread Michal Hocko
Ohh, I have just noticed that this could be introduced by "mm: break circular include from linux/mmzone.h" in mm tree. Adding Andrew to CC. On Wed 06-02-13 10:14:58, Michal Hocko wrote: > randconfig complains about: > drivers/power/da9030_battery.c:113: error: field ‘nb’ has

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-06 Thread Michal Hocko
On Wed 06-02-13 15:01:19, Michal Hocko wrote: > On Wed 06-02-13 02:17:21, azurIt wrote: > > >5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I > > >mentioned in a follow up email. Here is the full patch: > > > > > > Here is t

[PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-06 Thread Michal Hocko
On Wed 06-02-13 15:22:19, Michal Hocko wrote: > On Wed 06-02-13 15:01:19, Michal Hocko wrote: > > On Wed 06-02-13 02:17:21, azurIt wrote: > > > >5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I > > > >mentioned in a f

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-07 Thread Michal Hocko
On Thu 07-02-13 20:01:45, KAMEZAWA Hiroyuki wrote: > (2013/02/06 23:01), Michal Hocko wrote: > >On Wed 06-02-13 02:17:21, azurIt wrote: > >>>5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I > >>>mentioned in a follow up email. Here is the

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread Michal Hocko
t/torvalds/linux-2.6.git;a=commitdiff;h=321fb561 and the commit description claim this shouldn't happen. I am not familiar with this code but it sounds like a bug in the tracing code which is not related to the discussed issue. > Finally i rebooted into different kernel, wrote this e-mail and go

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread Michal Hocko
that a limit for those children? > As i said, these two processes Which are those two processes? > were stucked and was impossible to kill them. They were, > maybe, the processes which i was trying to 'strace' before - 'strace' > was freezed as always when the cg

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread Michal Hocko
On Fri 08-02-13 14:56:16, azurIt wrote: > Data are inside memcg-bug-5.tar.gz in directories bug/// Ohh, I didn't realize those were timestamp directories. It makes more sense now. -- Michal Hocko SUSE Labs

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread Michal Hocko
oesn't look very unhealthy. I had expected that write would fail more often, but it seems that the biggest memory pressure comes from mmaps and page faults, which have no way out other than OOM. So my suggestion would be to reconsider the limits for the groups to provide a more realistic environment. -- M

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-08 Thread Michal Hocko
cg->res, val); > if (batch->memsw_nr_pages) > - res_counter_uncharge(&batch->memcg->memsw, > - batch->memsw_nr_pages * PAGE_SIZE); > + res_counter_uncharge(&batch->memcg->memsw, val); > memcg_oo

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-08 Thread Michal Hocko
On Thu 07-02-13 20:27:00, Greg Thelen wrote: > On Tue, Feb 05 2013, Michal Hocko wrote: > > > On Tue 05-02-13 10:09:57, Greg Thelen wrote: > >> On Tue, Feb 05 2013, Michal Hocko wrote: > >> > >> > On Tue 05-02-13 08:48:23, Greg Thelen wrote: >

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-08 Thread Michal Hocko
On Fri 08-02-13 17:29:18, Michal Hocko wrote: [...] > OK, I have checked the allocator slow path and you are right even > GFP_KERNEL will not fail. This can lead to similar deadlocks - e.g. > OOM killed task blocked on down_write(mmap_sem) while the page fault > handler holding

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread Michal Hocko
esses dying because it doesn't expect it. > And there is still a mystery of two freezed processes which cannot be > killed. > > By the way, i KNOW that so much OOM is not healthy but the client > simply don't want to buy more memory. He knows about the problem of > uns

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-10 Thread Michal Hocko
ngs are happening. I'm sure there is something more it > this, maybe it revealed another bug? So far nothing shows that there would be anything broken wrt. memcg OOM killer. The ptrace issue sounds strange, all right, but that is another story and worth a separate investigation. I wo

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-11 Thread Michal Hocko
e group is under permanent OOM condition and the task is not selected to be killed. Unfortunately I am not able to reproduce this behavior even if I try to hammer OOM like mad so I am afraid I cannot help you much without further debugging patches. I do realize that experimenting in your envi

Re: [PATCH v3 4/7] memcg: remove memcg from the reclaim iterators

2013-02-11 Thread Michal Hocko
icit cleanup because it brings yet another kind of generation number to the game, but I guess I can live with it if people really think the relaxed way is much better. What do you think about the patch below (untested yet)? --- From 8169aa49649753822661b8fbbfba0852dcfedba6 Mon Sep 17 00:00:00 2001

Re: [PATCH v3 4/7] memcg: remove memcg from the reclaim iterators

2013-02-11 Thread Michal Hocko
On Mon 11-02-13 12:56:19, Johannes Weiner wrote: > On Mon, Feb 11, 2013 at 04:16:49PM +0100, Michal Hocko wrote: > > On Fri 08-02-13 14:33:18, Johannes Weiner wrote: > > [...] > > > for each in hierarchy: > > > for each node: > > > for each z

Re: [PATCH v3 4/7] memcg: remove memcg from the reclaim iterators

2013-02-11 Thread Michal Hocko
On Mon 11-02-13 14:58:24, Johannes Weiner wrote: > On Mon, Feb 11, 2013 at 08:29:29PM +0100, Michal Hocko wrote: > > On Mon 11-02-13 12:56:19, Johannes Weiner wrote: > > > On Mon, Feb 11, 2013 at 04:16:49PM +0100, Michal Hocko wrote: > > > > Maybe we could keep the co

Re: [PATCH v3 4/7] memcg: remove memcg from the reclaim iterators

2013-02-11 Thread Michal Hocko
On Mon 11-02-13 22:27:56, Michal Hocko wrote: [...] > I will get back to this tomorrow. Maybe not a great idea as it is getting late here and brain turns into cabbage but there we go: --- From f927358fe620837081d7a7ec6bf27af378deb35d Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Mon,
