On Wed 12-06-13 14:27:05, David Rientjes wrote:
> On Wed, 12 Jun 2013, Michal Hocko wrote:
>
> > But the objective is to handle oom deadlocks gracefully and you cannot
> > possibly miss those as they are, well, _deadlocks_.
>
> That's not at all the objective, t
On Thu 13-06-13 16:57:23, Richard Weinberger wrote:
> Am 13.06.2013 16:45, schrieb Richard Weinberger:
> >Am 13.06.2013 16:39, schrieb Michal Hocko:
> >>On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> >>>Am 13.06.2013 15:32, schrieb Michal Hocko:
> >
> v2:
> - added wmb() in kmem_cgroup_css_offline(), pointed out by Michal
> - revised comments as suggested by Michal
> - fixed to check if kmem is activated in kmem_cgroup_css_offline()
>
> Signed-off-by: Li Zefan
> Acked-by: Michal Hocko
> Acked-by: KAMEZAWA Hiroyuki
[Fix Glauber's new address]
On Thu 13-06-13 17:53:19, Michal Hocko wrote:
> On Thu 13-06-13 17:12:55, Li Zefan wrote:
> > Sorry for updating the patchset so late.
> >
> > I've made some changes for the memory barrier thing, and I agree with
> > Michal that
sb(struct super_block *);
> long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages,
> enum wb_reason reason);
> -long wb_do_writeback(struct bdi_writeback *wb, int force_wait);
> void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
> void inode_wait_for_writeba
On Thu 13-06-13 13:34:46, David Rientjes wrote:
> On Thu, 13 Jun 2013, Michal Hocko wrote:
>
> > > Right now it appears that that number of users is 0 and we're talking
> > > about a problem that was reported in 3.2 that was released a year and a
> > >
On Fri 14-06-13 12:29:52, Kirill A. Shutemov wrote:
> Michal Hocko wrote:
> > On Fri 14-06-13 15:30:34, Wanpeng Li wrote:
> > > There is just one caller of wb_do_writeback, in fs-writeback.c, and the
> > > current code unnecessarily exports it in a header file; this pat
hen caches might live longer than css_offline.
> + */
> static int cgroup_destroy_locked(struct cgroup *cgrp)
> __releases(&cgroup_mutex) __acquires(&cgroup_mutex)
> {
--
Michal Hocko
SUSE Labs
) so any improvement
would be really welcome.
Sorry, if this information has been posted along with the series. I was
CCed only on this one and didn't get to look at the rest yet (apart from
"percpu: implement generic percpu refcounting" in your
review-css-percpu-ref branch).
[...]
--
Michal
On Fri 14-06-13 18:15:12, Glauber Costa wrote:
> On Fri, Jun 14, 2013 at 02:55:39PM +0200, Michal Hocko wrote:
> > On Wed 12-06-13 21:04:58, Tejun Heo wrote:
> > [...]
> > > +/**
> > > + * cgroup_destroy_locked - the first stage of cgroup destruction
> >
ee:%lu slab_reclaimable:%lu slab_unreclaimable:%lu\n"
> + "mapped:%lu shmem:%lu pagetables:%lu bounce:%lu\n"
> + "free_cma:%lu\n",
> global_page_state(NR_ACTIVE_ANON),
> global_page_state(NR_INACTIVE_ANON),
>
, interesting. This comment has been incorrect since f0c0b2b808
(change zonelist order: zonelist order selection logic).
>
> Signed-off-by: Wanpeng Li
Reviewed-by: Michal Hocko
> ---
> Documentation/sysctl/vm.txt | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
g Li
Reviewed-by: Michal Hocko
> ---
> arch/x86/mm/pgtable.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 17fda6a..cb787da 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/p
SON_FREE_MORE_MEM,
> WB_REASON_FS_FREE_SPACE,
> +/*
> + * There is no bdi forker thread any more and works are done by emergency
> + * worker, however, this is somewhat userland visible and we'll be exposing
> + * exactly the same information, so it has a mismatch name.
> + */
Mel Gorman
> Reported-and-Tested-by: Fengguang Wu
Yes it fixes a flood of "Bad page state" messages I was seeing during
boot while testing my patches on top of mm tree. I just didn't get to
reporting the issue as there seem to be more of them. I will report
others after I have som
y to
measure some memcg workloads as soon as I have some spare cycles. I do
not expect a big win but I also do not think this would regress.
Thanks.
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
On Tue 11-06-13 17:43:53, Michal Hocko wrote:
> JFYI, I have rebased the series on top of the current mmotm tree to
> catch up with Mel's changes in reclaim and other small things here and
> there. To be sure that the things are still good I have started my tests
> again which wi
81122d9c
[...]
81122d9c: 0f 0b ud2
RAX is -1UL.
I assume that the current backtrace is of no use and it would most
probably be some shrinker which doesn't behave.
Any idea how to pin this down?
Thanks!
--
Michal Hocko
SUSE Labs
e were some compaction locking related patches merged
around 3.7. See 2a1402aa044b55c2d30ab0ed9405693ef06fb07c and follow ups.
> I'll send the exact numbers as soon I'll reproduce it again.
> It can take up to 1 week.
>
> Thanks!
>
> Regards,
> Rom
On Mon 17-06-13 19:14:12, Glauber Costa wrote:
> On Mon, Jun 17, 2013 at 04:18:22PM +0200, Michal Hocko wrote:
> > Hi,
>
> Hi,
>
> > I managed to trigger:
> > [ 1015.776029] kernel BUG at mm/list_lru.c:92!
> > [ 1015.776029] invalid opcode: [#1] SMP
>
On Mon 03-06-13 14:30:18, Johannes Weiner wrote:
> On Mon, Jun 03, 2013 at 12:48:39PM -0400, Johannes Weiner wrote:
> > On Mon, Jun 03, 2013 at 05:34:32PM +0200, Michal Hocko wrote:
[...]
> > > I am just afraid about all the other archs that do not support (from
> > >
On Mon 03-06-13 14:17:54, David Rientjes wrote:
> On Mon, 3 Jun 2013, Michal Hocko wrote:
>
> > > What do you suggest when you read the "tasks" file and it returns -ENOMEM
> > > because kmalloc() fails because the userspace oom handler's memcg is also
On Mon 03-06-13 18:07:37, Tejun Heo wrote:
> On Mon, Jun 03, 2013 at 12:18:51PM +0200, Michal Hocko wrote:
> > The caller of the iterator might know that some nodes or even subtrees
> > should be skipped but there is no way to tell iterators about that so
> > the only
ng
> for more feedback :)
The manual node migration code seems to be OK in case B as well because
Reserved pages are skipped (see check_pte_range, called further down the
do_migrate_pages path).
Maybe auto-numa code is missing this check assuming that it cannot
encounter reserved pages.
migrate_misplaced_page relies
On Tue 04-06-13 21:57:56, Balbir Singh wrote:
> On Mon, Jun 3, 2013 at 3:48 PM, Michal Hocko wrote:
> > Hi,
> >
> > This is the fourth version of the patchset.
> >
> > Summary of versions:
> > The first version has been posted here:
> > http://pe
ean by
limitations?
The priority-0 scan was always a crude hack. With a lot of pages on
the LRU it might cause huge stalls during direct reclaim. There are
workloads which benefited from such an aggressive reclaim - e.g.
streaming IO but that doesn't justify this kind of reclaim.
--
Michal
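To make the cost of a priority-0 scan concrete: in kernels of this era the
reclaim code derives the per-pass scan target by shifting the LRU size right
by the current priority, starting from DEF_PRIORITY (12). A minimal userspace
sketch of that arithmetic follows; the helper names scan_target and
print_scan_targets are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the kernel's DEF_PRIORITY; everything else is illustrative. */
#define DEF_PRIORITY 12

/* Hypothetical helper: pages targeted in one pass at a given priority. */
static unsigned long scan_target(unsigned long lru_pages, int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return lru_pages >> priority;
}

/* Print how the scan target grows as the priority drops towards 0. */
static void print_scan_targets(unsigned long lru_pages)
{
	for (int prio = DEF_PRIORITY; prio >= 0; prio--)
		printf("priority %2d: scan %lu pages\n", prio,
		       scan_target(lru_pages, prio));
}
```

With 1048576 pages on the list, the first pass at priority 12 targets 256
pages, while a priority-0 pass targets the whole list in one go, which is
where the stalls come from.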
On Tue 04-06-13 14:48:52, Johannes Weiner wrote:
> On Tue, Jun 04, 2013 at 11:17:49AM +0200, Michal Hocko wrote:
[...]
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 6dc1882..ff5e2d7 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> >
On Tue 04-06-13 12:36:19, Tejun Heo wrote:
> Hey, Michal.
>
> On Tue, Jun 04, 2013 at 03:45:23PM +0200, Michal Hocko wrote:
> > Is this something that you find serious enough to block this series?
> > I do not want to push hard but I would like to settle with something
> >
On Tue 04-06-13 13:54:26, Tejun Heo wrote:
> Hey,
>
> On Tue, Jun 04, 2013 at 10:48:07PM +0200, Michal Hocko wrote:
> > > I really don't think memcg can afford to add more mess than there
> > > already is. Let's try to get things right with each change,
On Tue 04-06-13 23:54:45, Frank Mehnert wrote:
> On Tuesday 04 June 2013 20:17:02 Frank Mehnert wrote:
> > On Tuesday 04 June 2013 16:02:30 Michal Hocko wrote:
> > > On Tue 04-06-13 14:14:45, Frank Mehnert wrote:
> > > > On Tuesday 04 June 2013 13:58:07 Robin Holt wro
On Wed 05-06-13 01:05:45, Tejun Heo wrote:
> Hey, Michal.
>
> On Wed, Jun 05, 2013 at 09:37:28AM +0200, Michal Hocko wrote:
> > Tejun, I do not have infinite amount of time and this is barely a
> > priority for the patchset. The core part is to be able to skip
> > n
On Wed 05-06-13 01:58:49, Tejun Heo wrote:
[...]
> Anyways, so you aren't gonna try the skipping thing?
As I said, I do not consider this a priority for the said reasons (I
will not repeat them).
--
Michal Hocko
SUSE Labs
On Wed 05-06-13 10:34:13, Frank Mehnert wrote:
> On Wednesday 05 June 2013 09:54:54 Michal Hocko wrote:
> > On Tue 04-06-13 23:54:45, Frank Mehnert wrote:
> > > On Tuesday 04 June 2013 20:17:02 Frank Mehnert wrote:
> > > > On Tuesday 04 June 2013 16:02:30 Michal Hocko
On Tue 04-06-13 23:40:16, David Rientjes wrote:
> On Tue, 4 Jun 2013, Michal Hocko wrote:
>
> > > I'm not sure a userspace oom notifier would want to keep a
> > > preallocated buffer around that is mlocked in memory for all possible
> > > lengths of this fil
ld cover the migration part. Another potential problem could be
that the page might get unmapped and marked for the numa fault (see
do_numa_page). So maybe your code just assumes that the page doesn't
even get unmapped?
--
Michal Hocko
SUSE Labs
On Wed 05-06-13 03:22:32, Frank Mehnert wrote:
> On Wednesday 05 June 2013 11:56:30 Michal Hocko wrote:
> > On Wed 05-06-13 11:32:15, Frank Mehnert wrote:
> > [...]
> >
> > > Thank you very much for your help. As I said, this problem happens _only_
> > > wi
On Tue 04-06-13 14:48:52, Johannes Weiner wrote:
> On Tue, Jun 04, 2013 at 11:17:49AM +0200, Michal Hocko wrote:
[...]
> > Now that I am looking at this again I've realized that this
> > is not correct. The task which triggers memcg OOM will not
> > have memcg_oom.memcg
= sequence position = iter->position
>
> However, the read side barrier is currently misplaced, which can lead
> to dereferencing stale position pointers that no longer point to valid
> memory. Fix this.
>
> Reported-by: Tejun Heo
> Signed-off-by: Johan
sively.
>
> Signed-off-by: Johannes Weiner
I like this
Acked-by: Michal Hocko
> ---
> mm/memcontrol.c | 86 ++---
> 1 file changed, 57 insertions(+), 29 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontr
nt that's above that special last member below it,
> so it is more visible to somebody that considers appending to the
> struct mem_cgroup.
>
> Signed-off-by: Johannes Weiner
It would be better to make this a regular array in the long term. But
this is definitely an improvement
a hook that architecture page fault handlers are supposed to call
> to invoke the OOM killer and let it pick the right task to kill.
> Convert the remaining architectures over to this hook.
>
> To have the previous behavior of simply taking out the faulting task
> the vm.oom_kill_a
use some examples (e.g.
the i_mutex we have seen a few months ago or a simple unkillable brk which
is hanging on mmap_sem for writing while a page fault is handled and
memcg oom triggered).
--
Michal Hocko
SUSE Labs
> Jun 5 17:27:10 alfa kernel: [3634217.398427] [] vfs_read+0xb0/0x180
> Jun 5 17:27:10 alfa kernel: [3634217.398430] [] sys_read+0x4a/0x90
> Jun 5 17:27:10 alfa kernel: [3634217.398434] [] ? do_device_not_available+0xe/0x10
> Jun 5 17:27:10 alfa kernel: [3634217.398438]
>
>
> Btw, something probably happened also at about 3:09 but i wasn't able to
> gather any data because my 'load check script' killed all apache processes
> (load was more than 100).
>
>
>
> azur
On Thu 06-06-13 17:48:24, Tejun Heo wrote:
> On Wed, Jun 05, 2013 at 02:09:38AM -0700, Tejun Heo wrote:
> > On Wed, Jun 05, 2013 at 11:07:39AM +0200, Michal Hocko wrote:
> > > On Wed 05-06-13 01:58:49, Tejun Heo wrote:
> > > [...]
> > > > Anyways, so
On Mon 17-06-13 20:54:10, Glauber Costa wrote:
> On Mon, Jun 17, 2013 at 05:33:02PM +0200, Michal Hocko wrote:
[...]
> > I have seen some other traces as well (mentioning ext3 dput paths) but I
> > cannot reproduce them anymore.
> >
>
> Do you have those traces? If
XFS, is it? If it is, we have another user to
> look for)
No this is ext3. But I can try to test with xfs as well if it helps.
[...]
--
Michal Hocko
SUSE Labs
> > > inode_lru_list_del(inode);
> > > spin_unlock(&inode->i_lock);
> > > list_add(&inode->i_lru, &dispose);
> > >
> > > IOW, they will remove the element from the LRU, and add it to
de->i_state &= ~I_WILL_FREE;
> }
>
> + inode_lru_list_del(inode);
> inode->i_state |= I_FREEING;
> - if (!list_empty(&inode->i_lru))
> - inode_lru_list_del(inode);
> spin_unlock(&inode->i_lock);
>
> evict(inode);
On Tue 18-06-13 12:21:33, Glauber Costa wrote:
> On Tue, Jun 18, 2013 at 10:19:31AM +0200, Michal Hocko wrote:
[...]
> > No this is ext3. But I can try to test with xfs as well if it helps.
> > [...]
>
> XFS won't help this, on the contrary. The reason I asked is becau
On Tue 18-06-13 10:24:14, Michal Hocko wrote:
> On Tue 18-06-13 10:31:05, Glauber Costa wrote:
> > On Tue, Jun 18, 2013 at 12:46:23PM +1000, Dave Chinner wrote:
> > > On Tue, Jun 18, 2013 at 02:30:05AM +0400, Glauber Costa wrote:
> > > > On Mon, Jun 17, 2013 at 0
teach mem_cgroup_soft_reclaim_eligible
to handle NULL memcg as mem_cgroup_root
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 6 +-
mm/vmscan.c | 4 +++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2a06d5a..b2e44d3 100644
--- a/mm/memcontrol.c
+++
zone
as pointed by Johannes
Signed-off-by: Michal Hocko
Reviewed-by: Glauber Costa
Reviewed-by: Tejun Heo
---
include/linux/memcontrol.h |  10 +--
mm/memcontrol.c            | 161 ++---
mm/vmscan.c                |  62 +
3 files ch
roup_soft_reclaim_eligible
Signed-off-by: Michal Hocko
Reviewed-by: Glauber Costa
Reviewed-by: Tejun Heo
---
include/linux/memcontrol.h |  6 --
mm/memcontrol.c            | 14 +-
mm/vmscan.c                |  4 ++--
3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/in
track children in soft limit excess to
improve soft limit".
The series has seen quite some testing and I guess it is in the state to
be merged into mmotm and hopefully get into 3.11. I would like to hear
back from Johannes and Kamezawa about this timing though.
Shortlog says:
Michal Hocko
zation is necessary.
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b2e44d3..d55e2c8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -942,9 +942,15 @@ static void mem_cgroup_update_soft_lim
ast give a chance to start the walk without a big risk of
reclaim latencies.
Signed-off-by: Michal Hocko
---
mm/vmscan.c | 19 +--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 56302da..8cbc8e5 100644
--- a/mm/vmscan.c
+++ b/m
well. System time is around 100%
(surprisingly better for the 8k case) and Elapsed copies that trend.
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 71 +
1 file changed, 71 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
d but the tree walk continues down the tree or
the whole subtree of the current group is skipped.
Signed-off-by: Michal Hocko
---
include/linux/memcontrol.h | 48 +
mm/memcontrol.c            | 77 --
mm/vmscan.c
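
The skipping semantics described above (skip one group but keep walking down
the tree, or skip the group together with its whole subtree) can be sketched
in plain userspace C. This is an illustration only: the node type, the walk
helper, the three verdict names and the only_in_excess predicate are
assumptions made for the example, not the code from the patch.

```c
#include <stddef.h>

/* Verdict a predicate returns for each node (names are illustrative). */
enum filter_verdict {
	VISIT,      /* act on this node and descend into its children */
	SKIP,       /* do not act on this node, but still descend */
	SKIP_TREE,  /* ignore this node and its whole subtree */
};

struct node {
	int excess;            /* stand-in for "usage above the soft limit" */
	struct node *child;    /* first child */
	struct node *sibling;  /* next sibling */
};

typedef enum filter_verdict (*filter_fn)(const struct node *n);

/* Pre-order walk over n and its siblings; returns the number of visits. */
static int walk(const struct node *n, filter_fn filter,
		void (*visit)(const struct node *n))
{
	int visited = 0;

	for (; n; n = n->sibling) {
		/* No predicate means every node is visited. */
		enum filter_verdict v = filter ? filter(n) : VISIT;

		if (v == SKIP_TREE)
			continue;       /* children are never looked at */
		if (v == VISIT) {
			if (visit)
				visit(n);
			visited++;
		}
		visited += walk(n->child, filter, visit);
	}
	return visited;
}

/* Example predicate: act only on nodes in excess, but keep descending. */
static enum filter_verdict only_in_excess(const struct node *n)
{
	return n->excess > 0 ? VISIT : SKIP;
}
```

A caller that knows a whole subtree is uninteresting would return SKIP_TREE
from its predicate instead, so the walk never touches the children at all.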
Now that the soft limit is integrated to the reclaim directly the whole
soft-limit tree infrastructure is not needed anymore. Rip it out.
Changes v1->v2
- mem_cgroup_reclaimable is no longer used
- test_mem_cgroup_node_reclaimable is not used outside NUMA code
Signed-off-by: Michal Ho
path_lookupat+0x792/0x830
[] filename_lookup+0x33/0xd0
[] user_path_at_empty+0x7b/0xb0
[] user_path_at+0xc/0x10
[] vfs_fstatat+0x51/0xb0
[] vfs_stat+0x16/0x20
[] sys_newstat+0x1f/0x50
[] system_call_fastpath+0x16/0x1b
[] 0x
--
Michal Hocko
SUSE Labs
a0
[] do_sys_open+0x160/0x1e0
[] sys_open+0x1c/0x20
[] system_call_fastpath+0x16/0x1b
[] 0x
23393 [] do_last+0x2c4/0x780
[] path_openat+0xda/0x400
[] do_filp_open+0x43/0xa0
[] do_sys_open+0x160/0x1e0
[] sys_open+0x1c/0x20
[] system_call_fastpath+0x16/0x1b
[] 0x
--
Michal Hoc
On Tue 18-06-13 15:01:21, Johannes Weiner wrote:
> On Tue, Jun 18, 2013 at 02:09:39PM +0200, Michal Hocko wrote:
> > My primary test case was a parallel kernel build with 2 groups (make
> > is running with -j4 with a distribution .config in a separate cgroup
> > without any h
s patch?
None yet. But I hope it will be merged to 3.11 and backported to the
stable trees.
> Thank you and everyone involved very much for time and help.
>
> azur
--
Michal Hocko
SUSE Labs
but maybe this could benefit from
> a broader fs look
>
> In any case, the patch we suggested is obviously correct and we should
> apply nevertheless. I will write it down and send it to Andrew.
OK, feel free to stick my Tested-by there.
--
Michal Hocko
SUSE Labs
On Wed 19-06-13 09:13:46, Michal Hocko wrote:
> On Tue 18-06-13 10:26:24, Glauber Costa wrote:
> [...]
> > Michal, would you mind testing the following patch?
> >
> > diff --git a/fs/inode.c b/fs/inode.c
> > index 00b804e..48eafa6 100644
> > --- a/fs/inode.
On Thu 20-06-13 17:12:01, Michal Hocko wrote:
> On Thu 20-06-13 18:11:38, Glauber Costa wrote:
> [...]
> > > [84091.219056] [ cut here ]
> > > [84091.220015] kernel BUG at mm/list_lru.c:42!
> > > [84091.220015] invalid opcode: [#1]
On Mon 27-05-13 19:13:08, Michal Hocko wrote:
[...]
> Nevertheless I have encountered an issue while testing the huge number
> of groups scenario. And the issue is not limited only to this
> scenario unfortunately. As memcg iterators use per node-zone-priority
> cache to preve
On Mon 27-05-13 19:13:08, Michal Hocko wrote:
[...]
> > I think that the numbers can be improved even without introducing
> > the list of groups in excess. One way to go could be introducing a
> > conditional (callback) to the memcg iterator so the groups under the
> >
On Wed 29-05-13 15:05:38, Michal Hocko wrote:
> On Mon 27-05-13 19:13:08, Michal Hocko wrote:
> [...]
> > Nevertheless I have encountered an issue while testing the huge number
> > of groups scenario. And the issue is not limited only to this
> > scenario unfortunately
On Wed 22-05-13 12:49:37, Michal Hocko wrote:
> On Wed 22-05-13 17:29:27, Wanpeng Li wrote:
> > Logic memory-remove code fails to correctly account the Total High Memory
> > when a memory block which contains High Memory is offlined as shown in the
> > example below. The fol
On Wed 29-05-13 16:54:00, Michal Hocko wrote:
[...]
> I am still running kbuild tests with the same configuration to see a
> more general workload.
And here we go with the kbuild numbers. Same configuration (mem=1G, one
group for kernel build - it is actually expand the tree + build a
On Wed 29-05-13 16:01:54, Johannes Weiner wrote:
> On Wed, May 29, 2013 at 05:57:56PM +0200, Michal Hocko wrote:
> > On Wed 29-05-13 15:05:38, Michal Hocko wrote:
> > > On Mon 27-05-13 19:13:08, Michal Hocko wrote:
> > > [...]
> > > > Nevertheless I have enco
lowing the kernel to handle the
> condition.
I am not sure I like the idea. How does an admin decide what is the right value
of the timeout? And why can't he use a userspace oom handler to do the same
thing?
[...]
--
Michal Hocko
SUSE Labs
On Thu 30-05-13 14:48:52, Tejun Heo wrote:
> Hello,
>
> Sorry about the delay. Have been and still am traveling.
>
> On Fri, May 24, 2013 at 09:54:20AM +0200, Michal Hocko wrote:
> > > > On Fri, May 17, 2013 at 03:04:06PM +0800, Li Zefan wrote:
> > > >>
On Thu 30-05-13 13:47:30, David Rientjes wrote:
> On Thu, 30 May 2013, Michal Hocko wrote:
>
> > > Completely disabling the oom killer for a memcg is problematic if
> > > userspace is unable to address the condition itself, usually because it
> > > is unresponsi
On Fri 31-05-13 03:22:59, David Rientjes wrote:
> On Fri, 31 May 2013, Michal Hocko wrote:
[...]
> > > If the oom notifier is in the oom cgroup, it may not be able to
> > > successfully read the memcg "tasks" file to even determine the set of
> > >
On Fri 31-05-13 03:22:59, David Rientjes wrote:
> On Fri, 31 May 2013, Michal Hocko wrote:
>
> > I have always discouraged people from running oom handler in the same
> > memcg (or even in the same hierarchy).
> >
>
> We allow users to control their own memcgs
On Fri 31-05-13 12:29:17, David Rientjes wrote:
> On Fri, 31 May 2013, Michal Hocko wrote:
>
> > > We allow users to control their own memcgs by chowning them, so they must
> > > be run in the same hierarchy if they want to run their own userspace oom
> > >
but it doesn't
sound entirely crazy. Well, we would have to drop mmap_sem so things
have to be rechecked but we are doing that already with VM_FAULT_RETRY
in some archs so it should work.
> Patch only works on x86 as of now, on other architectures memcg OOM
> will invoke the global OO
Now that the soft limit is integrated to the reclaim directly the whole
soft-limit tree infrastructure is not needed anymore. Rip it out.
Changes v1->v2
- mem_cgroup_reclaimable is no longer used
- test_mem_cgroup_node_reclaimable is not used outside NUMA code
Signed-off-by: Michal Ho
d but the tree walk continues down the tree or
the whole subtree of the current group is skipped.
TODO is it correct to reclaim + cond together? What if the cache simply
skips interesting nodes which another predicate would find interesting?
Signed-off-by: Michal Hocko
---
include/linux/memcont
teach mem_cgroup_soft_reclaim_eligible
to handle NULL memcg as mem_cgroup_root
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 6 +-
mm/vmscan.c     | 4 +++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 90495d5..8ff9366 100644
--- a/mm/memcont
ast give a chance to start the walk without a big risk of
reclaim latencies.
Signed-off-by: Michal Hocko
---
mm/vmscan.c | 19 +--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index dc78f07..72b428d 100644
--- a/mm/vmscan.c
+++ b/m
zation is necessary.
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8ff9366..91740f7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -932,9 +932,15 @@ static void mem_cgroup_update_s
"memcg: track children in soft limit excess to
improve soft limit".
The series has seen quite some testing and I guess it is in the state to
be merged into mmotm and hopefully get into 3.11. I would like to hear
back from Johannes and Kamezawa about this timing though.
Shortlog says:
Mi
well. System time is around 100%
(surprisingly better for the 8k case) and Elapsed copies that trend.
Signed-off-by: Michal Hocko
---
mm/memcontrol.c | 51 +++
1 file changed, 51 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index da
roup_soft_reclaim_eligible
Signed-off-by: Michal Hocko
Reviewed-by: Glauber Costa
Reviewed-by: Tejun Heo
---
include/linux/memcontrol.h |  6 --
mm/memcontrol.c            | 14 +-
mm/vmscan.c                |  4 ++--
3 files changed, 15 insertions(+), 9 deletions(-)
zone
as pointed by Johannes
Signed-off-by: Michal Hocko
Reviewed-by: Glauber Costa
Reviewed-by: Tejun Heo
---
include/linux/memcontrol.h |  10 +--
mm/memcontrol.c            | 161 ++--
mm/vmscan.c                |  62 ++---
3 files ch
" --
arch/powerpc/mm/hugetlbpage.c: spin_lock(&mm->page_table_lock);
arch/powerpc/mm/hugetlbpage.c: spin_unlock(&mm->page_table_lock);
arch/tile/mm/hugetlbpage.c: spin_lock(&mm->page_table_lock);
arch/tile/mm/hugetlbpage.c: spin_unlock(&mm->page_table_lock);
ld be as simple as
possible especially when it goes to the stable.
Without 1/2 dependency
Reviewed-by: Michal Hocko
> ---
> include/linux/swapops.h |  3 +++
> mm/hugetlb.c            |  2 +-
> mm/migrate.c            | 23 ++-
> 3 files changed, 22 insertions(+), 6
(!oom) {
> + if (!oom || oom_check) {
OK, this allows us to remove the confusing nr_oom_retries =
MEM_CGROUP_RECLAIM_RETRIES
from the branch where oom_check is set to true
> css_put(&memcg->css);
> goto no
On Mon 03-06-13 10:34:35, Naoya Horiguchi wrote:
> On Mon, Jun 03, 2013 at 03:19:32PM +0200, Michal Hocko wrote:
> > On Tue 28-05-13 15:52:50, Naoya Horiguchi wrote:
> > > Currently all of page table handling by hugetlbfs code are done under
> > > mm->page_table_loc
prepare_to_wait
atomic_read(&memcg->oom_wakeups)
atomic_inc(oom_wakeups)
I guess we want atomic_inc before __wake_up, right?
--
Michal Hocko
SUSE Labs
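
The ordering question above (increment the counter before or after the
wake-up) can be checked with a toy, single-threaded model that replays every
interleaving of one waiter and one waker. It is a sketch under stated
assumptions, not kernel code: the model struct and all function names are
made up, and the waiter is assumed to go to sleep only if the counter still
holds the value it had before the waiter queued itself (0 here).

```c
#include <stdbool.h>

/* Toy model of one waiter, one wait queue and a wakeup counter. */
struct model {
	int wakeups;   /* stands in for the atomic wakeup counter */
	bool queued;   /* the waiter is on the wait queue */
	bool woken;    /* a wake-up reached the queued waiter */
	int seen;      /* counter value sampled by the waiter */
};

static void waiter_prepare(struct model *m) { m->queued = true; }
static void waiter_sample(struct model *m)  { m->seen = m->wakeups; }
static void waker_inc(struct model *m)      { m->wakeups++; }
static void waker_wake(struct model *m)     { if (m->queued) m->woken = true; }

/*
 * Replay all six interleavings of the waiter (prepare, then sample) with
 * the waker (two steps, order chosen by inc_first). Returns true if in some
 * interleaving the waiter neither gets woken nor sees the counter change,
 * i.e. it would sleep forever.
 */
static bool any_lost_wakeup(bool inc_first)
{
	for (unsigned mask = 0; mask < 16; mask++) {
		int bits = 0;

		/* Set bits mark the two slots taken by the waiter. */
		for (int b = 0; b < 4; b++)
			bits += (mask >> b) & 1;
		if (bits != 2)
			continue;

		struct model m = { 0 };
		int w = 0, k = 0;  /* next waiter step, next waker step */

		for (int slot = 0; slot < 4; slot++) {
			if (mask & (1u << slot)) {
				if (w++ == 0)
					waiter_prepare(&m);
				else
					waiter_sample(&m);
			} else {
				/* First waker step is the increment iff inc_first. */
				bool do_inc = (k++ == 0) == inc_first;

				if (do_inc)
					waker_inc(&m);
				else
					waker_wake(&m);
			}
		}
		if (!m.woken && m.seen == 0)
			return true;
	}
	return false;
}
```

In this model any_lost_wakeup(false) finds the sequence wake, prepare, sample,
increment, in which the waiter misses both signals, while any_lost_wakeup(true)
finds none: if the wake-up ran before the waiter queued itself, the increment
ran even earlier and the sample sees it.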
On Mon 03-06-13 12:48:39, Johannes Weiner wrote:
> On Mon, Jun 03, 2013 at 05:34:32PM +0200, Michal Hocko wrote:
> > On Sat 01-06-13 02:11:51, Johannes Weiner wrote:
> > [...]
> > > From: Johannes Weiner
> > > Subject: [PATCH] memcg: more robust oom handling
>
On Mon 03-06-13 11:18:09, David Rientjes wrote:
> On Sat, 1 Jun 2013, Michal Hocko wrote:
[...]
> > I still do not see why you cannot simply read tasks file into a
> > preallocated buffer. This would be few pages even for thousands of pids.
> > You do not have to track proce
On Thu 20-06-13 17:12:01, Michal Hocko wrote:
> I am bisecting it again. It is quite tedious, though, because good case
> is hard to be sure about.
OK, so now I converged to 2d4fc052 (inode: convert inode lru list to generic lru
list code.) in my tree and I have double checked it matches w
Correct me if I'm wrong.
I haven't checked the code closer but the patch doesn't change anything
regarding pte_unmap_unlock. The only thing that it touches is the
_locking_. So whether there ever was a problem with kmap or not this
patch doesn't change it.
That being said, the pat
On Thu 20-06-13 12:12:06, Mel Gorman wrote:
> On Tue, Jun 18, 2013 at 02:09:39PM +0200, Michal Hocko wrote:
> > base is mmotm-2013-05-09-15-57
> > baserebase is mmotm-2013-06-05-17-24-63 + patches from the current mmots
> > without slab shrinkers patchset.
> > reworkreba
On Fri 21-06-13 16:06:27, Michal Hocko wrote:
[...]
> > Can you try this monolithic patch please?
>
> Wow, this looks much better!
Damn it! Scratch that. I have made a mistake in configuration so this
all has been 0-no-limit in fact. Sorry about that. It's only now that
I'v