On Wed, 26 Jul 2017, Andrea Arcangeli wrote:
> From 3d9001490ee1a71f39c7bfaf19e96821f9d3ff16 Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli
> Date: Tue, 25 Jul 2017 20:02:27 +0200
> Subject: [PATCH 1/1] mm: oom: let oom_reap_task and exit_mmap to run
> concurrently
>
> This is purely require
On Thu 10-08-17 20:51:38, Michal Hocko wrote:
[...]
> OK, let's agree to disagree. As I've said I like when the critical
> section is explicit and we _know_ what it protects. In this case it is
> clear that we have to protect from the page tables tear down and the
> vma destructions. But as I've sa
On Thu 10-08-17 20:05:54, Andrea Arcangeli wrote:
> On Thu, Aug 10, 2017 at 10:16:32AM +0200, Michal Hocko wrote:
> > Andrea has proposed an alternative solution [4] which should be
> > functionally equivalent, similar to {ksm,khugepaged}_exit. I have to
> > confess I really don't like that approac
On Thu, Aug 10, 2017 at 10:16:32AM +0200, Michal Hocko wrote:
> Andrea has proposed an alternative solution [4] which should be
> functionally equivalent, similar to {ksm,khugepaged}_exit. I have to
> confess I really don't like that approach but I can live with it if
> that is a preferred way (to
On Thu 27-07-17 16:55:59, Andrea Arcangeli wrote:
> On Thu, Jul 27, 2017 at 08:50:24AM +0200, Michal Hocko wrote:
> > Yes this will work and it won't depend on the oom_lock. But isn't it
> > just more ugly than simply doing
> >
> > 	if (tsk_is_oom_victim) {
> > 		down_write(&mm->mmap_
On Thu, Jul 27, 2017 at 08:50:24AM +0200, Michal Hocko wrote:
> Yes this will work and it won't depend on the oom_lock. But isn't it
> just more ugly than simply doing
>
> 	if (tsk_is_oom_victim) {
> 		down_write(&mm->mmap_sem);
> 		locked = true;
> 	}
> fre
On Thu 27-07-17 13:59:09, Manish Jaggi wrote:
[...]
> With 4.11.6 I was getting random kernel panics (Out of memory - No process
> left to kill) when running the LTP oom01/oom02 tests on our arm64 hardware
> with ~256G memory and a high core count.
> The issue experienced was as follows
>
Hi Michal,
On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
From: Michal Hocko
David has noticed that the oom killer might kill additional tasks while
the exiting oom victim hasn't terminated yet because the oom_reaper marks
the current victim MMF_OOM_SKIP too early when mm->mm_use
On Wed 26-07-17 18:29:12, Andrea Arcangeli wrote:
> On Wed, Jul 26, 2017 at 07:45:57AM +0200, Michal Hocko wrote:
> > On Tue 25-07-17 21:19:52, Andrea Arcangeli wrote:
> > > On Tue, Jul 25, 2017 at 06:04:00PM +0200, Michal Hocko wrote:
> > > > - down_write(&mm->mmap_sem);
> > > > + if (
On Wed 26-07-17 18:39:28, Andrea Arcangeli wrote:
> On Wed, Jul 26, 2017 at 07:45:33AM +0200, Michal Hocko wrote:
> > Yes, exit_aio is the only blocking call I know of currently. But I would
> > like this to be as robust as possible and so I do not want to rely on
> > the current implementation. Th
On Wed, Jul 26, 2017 at 06:29:12PM +0200, Andrea Arcangeli wrote:
> From 3d9001490ee1a71f39c7bfaf19e96821f9d3ff16 Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli
> Date: Tue, 25 Jul 2017 20:02:27 +0200
> Subject: [PATCH 1/1] mm: oom: let oom_reap_task and exit_mmap to run
> concurrently
This n
On Wed, Jul 26, 2017 at 07:45:33AM +0200, Michal Hocko wrote:
> Yes, exit_aio is the only blocking call I know of currently. But I would
> like this to be as robust as possible and so I do not want to rely on
> the current implementation. This can change in future and I can
> guarantee that nobody
On Wed, Jul 26, 2017 at 07:45:57AM +0200, Michal Hocko wrote:
> On Tue 25-07-17 21:19:52, Andrea Arcangeli wrote:
> > On Tue, Jul 25, 2017 at 06:04:00PM +0200, Michal Hocko wrote:
> > > - down_write(&mm->mmap_sem);
> > > + if (tsk_is_oom_victim(current))
> > > + down_write(&mm->mmap_sem);
>
On Tue 25-07-17 21:19:52, Andrea Arcangeli wrote:
> On Tue, Jul 25, 2017 at 06:04:00PM +0200, Michal Hocko wrote:
> > - down_write(&mm->mmap_sem);
> > + if (tsk_is_oom_victim(current))
> > + down_write(&mm->mmap_sem);
> > free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_
On Tue 25-07-17 20:26:19, Andrea Arcangeli wrote:
> On Tue, Jul 25, 2017 at 05:45:14PM +0200, Michal Hocko wrote:
> > That problem is real though as reported by David.
>
> I'm not against fixing it, I just think it's not a major concern, and
> the solution doesn't seem optimal as measured by Kiril
On Tue, Jul 25, 2017 at 06:04:00PM +0200, Michal Hocko wrote:
> - down_write(&mm->mmap_sem);
> + if (tsk_is_oom_victim(current))
> + down_write(&mm->mmap_sem);
> free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
> tlb_finish_mmu(&tlb, 0, -1);
>
>
On Tue, Jul 25, 2017 at 05:45:14PM +0200, Michal Hocko wrote:
> That problem is real though as reported by David.
I'm not against fixing it, I just think it's not a major concern, and
the solution doesn't seem optimal as measured by Kirill.
I'm just skeptical it's the best to solve that tiny race
On Tue 25-07-17 18:31:10, Kirill A. Shutemov wrote:
> On Tue, Jul 25, 2017 at 05:23:00PM +0200, Michal Hocko wrote:
> > what is stdev?
>
> Updated tables:
>
> 3 runs before the patch:
> Min.    1st Qu.  Median  Mean    3rd Qu.  Max.     Stdev
> 177200  205000   212900  217800  223700   2377000  32868
On Tue 25-07-17 17:26:39, Andrea Arcangeli wrote:
> On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> > From: Michal Hocko
> >
> > David has noticed that the oom killer might kill additional tasks while
> > the exiting oom victim hasn't terminated yet because the oom_reaper marks
>
On Tue, Jul 25, 2017 at 05:23:00PM +0200, Michal Hocko wrote:
> what is stdev?
Updated tables:
3 runs before the patch:
Min.    1st Qu.  Median  Mean    3rd Qu.  Max.     Stdev
177200  205000   212900  217800  223700   2377000  32868
172400  201700   209700  214300  220600   1343000  31191
175700 203
On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> From: Michal Hocko
>
> David has noticed that the oom killer might kill additional tasks while
> the exiting oom victim hasn't terminated yet because the oom_reaper marks
> the current victim MMF_OOM_SKIP too early when mm->mm_users d
On Tue 25-07-17 18:17:54, Kirill A. Shutemov wrote:
> > before the patch
> > min: 306300.00 max: 6731916.00 avg: 437962.07 std: 92898.30 nr: 10
> >
> > after
> > min: 303196.00 max: 5728080.00 avg: 436081.87 std: 96165.98 nr: 10
> >
> > The results are well within noise as I would expect
On Tue, Jul 25, 2017 at 04:26:26PM +0200, Michal Hocko wrote:
> On Mon 24-07-17 18:11:46, Michal Hocko wrote:
> > On Mon 24-07-17 17:51:42, Kirill A. Shutemov wrote:
> > > On Mon, Jul 24, 2017 at 04:15:26PM +0200, Michal Hocko wrote:
> > [...]
> > > > What kind of scalability implication do you have i
On Tue 25-07-17 18:07:19, Kirill A. Shutemov wrote:
> On Tue, Jul 25, 2017 at 04:26:17PM +0200, Michal Hocko wrote:
[...]
> > Thanks for retesting Kirill. Are those numbers stable over runs? E.g.
> > the run without the patch has ~3% variance while the one with the patch
> > has it smaller. This so
On Tue, Jul 25, 2017 at 04:26:17PM +0200, Michal Hocko wrote:
> On Tue 25-07-17 17:17:23, Kirill A. Shutemov wrote:
> [...]
> > Below are numbers for the same test case, but from a bigger machine (48
> > threads, 64GiB of RAM).
> >
> > v4.13-rc2:
> >
> > Performance counter stats for './a.sh 1
On Tue 25-07-17 17:17:23, Kirill A. Shutemov wrote:
[...]
> Below are numbers for the same test case, but from a bigger machine (48
> threads, 64GiB of RAM).
>
> v4.13-rc2:
>
> Performance counter stats for './a.sh 10' (5 runs):
>
> 159857.233790 task-clock:u (msec) #1.000
On Mon 24-07-17 18:11:46, Michal Hocko wrote:
> On Mon 24-07-17 17:51:42, Kirill A. Shutemov wrote:
> > On Mon, Jul 24, 2017 at 04:15:26PM +0200, Michal Hocko wrote:
> [...]
> > > What kind of scalability implication do you have in mind? There is
> > > basically zero contention on the mmap_sem that
On Mon, Jul 24, 2017 at 06:11:47PM +0200, Michal Hocko wrote:
> On Mon 24-07-17 17:51:42, Kirill A. Shutemov wrote:
> > On Mon, Jul 24, 2017 at 04:15:26PM +0200, Michal Hocko wrote:
> [...]
> > > What kind of scalability implication do you have in mind? There is
> > > basically zero contention on th
On Tue 25-07-17 00:42:05, kbuild test robot wrote:
> Hi Michal,
>
> [auto build test ERROR on mmotm/master]
> [also build test ERROR on v4.13-rc2 next-20170724]
> [if your patch is applied to the wrong git tree, please drop us a note to
> help improve the system]
>
> url:
> https://github.co
Hi Michal,
[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.13-rc2 next-20170724]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system]
url:
https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-oom-allow-oom-reaper-to-race-
On Mon 24-07-17 17:51:42, Kirill A. Shutemov wrote:
> On Mon, Jul 24, 2017 at 04:15:26PM +0200, Michal Hocko wrote:
[...]
> > What kind of scalability implication do you have in mind? There is
> > basically zero contention on the mmap_sem that late in the exit path
> > so this should be pretty much
Dohh, the full conflict resolution didn't make it into the commit. The
full patch is below
---
From be69b355a56649167dce901d24c2296ef3a3f7ec Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Mon, 24 Jul 2017 09:23:32 +0200
Subject: [PATCH] mm, oom: allow oom reaper to race with exit_mmap
David h
On Mon, Jul 24, 2017 at 04:15:26PM +0200, Michal Hocko wrote:
> On Mon 24-07-17 17:00:08, Kirill A. Shutemov wrote:
> > On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> > > From: Michal Hocko
> > >
> > > David has noticed that the oom killer might kill additional tasks while
> > >
On Mon 24-07-17 17:00:08, Kirill A. Shutemov wrote:
> On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> > From: Michal Hocko
> >
> > David has noticed that the oom killer might kill additional tasks while
> > the exiting oom victim hasn't terminated yet because the oom_reaper marks
On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> From: Michal Hocko
>
> David has noticed that the oom killer might kill additional tasks while
> the exiting oom victim hasn't terminated yet because the oom_reaper marks
> the current victim MMF_OOM_SKIP too early when mm->mm_users d