Re: [PATCH] Fix race between oom kill and task exit

2013-11-28 Thread azurIt
> From: Johannes Weiner > To: "Ma, Xindong" > Date: 28.11.2013 07:54 > Subject: Re: [PATCH] Fix race between oom kill and task exit > > CC: "a...@linux-foundation.org" , "mho...@suse.cz" > , "rient...@google.com" , > "ru...@rustcorp.com.au" , "linux...@kvack.org" > , "linux-kernel@vger.kernel

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-10-10 Thread azurIt
>On Wed, Oct 09, 2013 at 08:44:50PM +0200, azurIt wrote: >> Johannes, >> >> i'm very sorry to say it but today something strange happened.. :) i was >> just right at the computer so i noticed it almost immediately but i don't >> have much info. Serv

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-10-09 Thread azurIt
>Hi azur, > >On Mon, Oct 07, 2013 at 01:01:49PM +0200, azurIt wrote: >> >On Thu, Sep 26, 2013 at 06:54:59PM +0200, azurIt wrote: >> >> On Wed, Sep 18, 2013 at 02:19:46PM -0400, Johannes Weiner wrote: >> >> >Here is an update. Full replacement on top o

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-10-07 Thread azurIt
>On Thu, Sep 26, 2013 at 06:54:59PM +0200, azurIt wrote: >> On Wed, Sep 18, 2013 at 02:19:46PM -0400, Johannes Weiner wrote: >> >Here is an update. Full replacement on top of 3.2 since we tried a >> >dead end and it would be more painful to revert individual changes.

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-26 Thread azurIt
> CC: "Michal Hocko" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.kern

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-26 Thread azurIt
ernel.org >On Wed, Sep 18, 2013 at 02:19:46PM -0400, Johannes Weiner wrote: >> On Wed, Sep 18, 2013 at 02:04:55PM -0400, Johannes Weiner wrote: >> > On Wed, Sep 18, 2013 at 04:03:04PM +0200, azurIt wrote: >> > > > CC: "Johannes Weiner" , "Andrew Mor

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-25 Thread azurIt
ernel.org >On Wed, Sep 18, 2013 at 02:19:46PM -0400, Johannes Weiner wrote: >> On Wed, Sep 18, 2013 at 02:04:55PM -0400, Johannes Weiner wrote: >> > On Wed, Sep 18, 2013 at 04:03:04PM +0200, azurIt wrote: >> > > > CC: "Johannes Weiner" , "Andrew Mor

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread azurIt
ernel.org >On Wed, Sep 18, 2013 at 02:19:46PM -0400, Johannes Weiner wrote: >> On Wed, Sep 18, 2013 at 02:04:55PM -0400, Johannes Weiner wrote: >> > On Wed, Sep 18, 2013 at 04:03:04PM +0200, azurIt wrote: >> > > > CC: "Johannes Weiner" , "Andrew Mor

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.ke

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vge

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@v

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-17 Thread azurIt
> CC: "Michal Hocko" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vge

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-17 Thread azurIt
__ > From: Johannes Weiner > To: azurIt > Date: 17.09.2013 02:02 > Subject: Re: [patch 0/7] improve memcg oom killer robustness v2 > > CC: "Michal Hocko" , "Andrew Morton" > , "Davi

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.ke

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.ke

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Michal Hocko" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vge

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.ke

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.ke

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Johannes Weiner" , "Andrew Morton" > , "David Rientjes" , > "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vger.k

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-16 Thread azurIt
> CC: "Andrew Morton" , "Michal Hocko" > , "David Rientjes" , "KAMEZAWA Hiroyuki" > , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vge

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-14 Thread azurIt
> CC: "Andrew Morton" , "Michal Hocko" > , "David Rientjes" , "KAMEZAWA Hiroyuki" > , "KOSAKI Motohiro" > , linux...@kvack.org, > cgro...@vger.kernel.org, x...@kernel.org, linux-a...@vger.kernel.org, > linux-kernel@vge

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-11 Thread azurIt
>On Wed, Sep 11, 2013 at 08:54:48PM +0200, azurIt wrote: >> >On Wed, Sep 11, 2013 at 02:33:05PM +0200, azurIt wrote: >> >> >On Tue, Sep 10, 2013 at 11:32:47PM +0200, azurIt wrote: >> >> >> >On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote:

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-11 Thread azurIt
>On Wed, Sep 11, 2013 at 02:33:05PM +0200, azurIt wrote: >> >On Tue, Sep 10, 2013 at 11:32:47PM +0200, azurIt wrote: >> >> >On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote: >> >> >> >On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote: >

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-11 Thread azurIt
>On Tue, Sep 10, 2013 at 11:32:47PM +0200, azurIt wrote: >> >On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote: >> >> >On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote: >> >> >> Here is full kernel log between 6:00 and 7:59: >> >>

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-10 Thread azurIt
>On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote: >> >On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote: >> >> >On Tue, Sep 10, 2013 at 08:13:59PM +0200, azurIt wrote: >> >> >> >On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote:

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-10 Thread azurIt
>On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote: >> >On Tue, Sep 10, 2013 at 08:13:59PM +0200, azurIt wrote: >> >> >On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote: >> >> >> >On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-10 Thread azurIt
>On Tue, Sep 10, 2013 at 08:13:59PM +0200, azurIt wrote: >> >On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote: >> >> >On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >> >> >> >Hi azur, >> >> >> > >> >

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-10 Thread azurIt
>On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote: >> >On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >> >> >Hi azur, >> >> > >> >> >On Wed, Sep 04, 2013 at 10:18:52AM +0200, azurIt wrote: >> >> >> > C

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread azurIt
>On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote: >> >On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >> >> >Hi azur, >> >> > >> >> >On Wed, Sep 04, 2013 at 10:18:52AM +0200, azurIt wrote: >> >> >> > C

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread azurIt
>On Mon, Sep 09, 2013 at 09:59:17PM +0200, azurIt wrote: >> >On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >> >> >Hi azur, >> >> > >> >> >On Wed, Sep 04, 2013 at 10:18:52AM +0200, azurIt wrote: >> >> >> > C

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread azurIt
>On Mon, Sep 09, 2013 at 03:10:10PM +0200, azurIt wrote: >> >Hi azur, >> > >> >On Wed, Sep 04, 2013 at 10:18:52AM +0200, azurIt wrote: >> >> > CC: "Andrew Morton" , "Michal Hocko" >> >> > , "David Rient

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread azurIt
>Hi azur, > >On Wed, Sep 04, 2013 at 10:18:52AM +0200, azurIt wrote: >> > CC: "Andrew Morton" , "Michal Hocko" >> > , "David Rientjes" , "KAMEZAWA >> > Hiroyuki" , "KOSAKI Motohiro" >> > ,

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread azurIt
>On Thu 05-09-13 14:33:43, azurIt wrote: >[...] >> >Just to be sure I got you right. You have killed all the processes from >> >the group you have sent stacks for, right? If that is the case I am >> >really curious about processes sitting in sleep_on_page_killable

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread azurIt
>On Thu 05-09-13 13:47:02, azurIt wrote: >> >On Thu 05-09-13 12:17:00, azurIt wrote: >> >> >[...] >> >> >> My script detected another freezed cgroup today, sending stacks. Is >> >> >> there anything interesting? >> >>

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread azurIt
>On Thu 05-09-13 12:17:00, azurIt wrote: >> >[...] >> >> My script detected another freezed cgroup today, sending stacks. Is >> >> there anything interesting? >> > >> >3 tasks are sleeping and waiting for somebody to take an action to >>

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread azurIt
>[...] >> My script detected another freezed cgroup today, sending stacks. Is >> there anything interesting? > >3 tasks are sleeping and waiting for somebody to take an action to >resolve memcg OOM. The memcg oom killer is enabled for that group? If >yes, which task has been selected to be killed?

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread azurIt
>> >[...] >> >> My script has just detected (and killed) another freezed cgroup. I >> >> must say that i'm not 100% sure that cgroup was really freezed but it >> >> has 99% or more memory usage for at least 30 seconds (well, or it has >> >> 99% memory usage in both two cases the script was checking

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread azurIt
>> >[...] >> >> My script has just detected (and killed) another freezed cgroup. I >> >> must say that i'm not 100% sure that cgroup was really freezed but it >> >> has 99% or more memory usage for at least 30 seconds (well, or it has >> >> 99% memory usage in both two cases the script was checking

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread azurIt
>[...] >> My script has just detected (and killed) another freezed cgroup. I >> must say that i'm not 100% sure that cgroup was really freezed but it >> has 99% or more memory usage for at least 30 seconds (well, or it has >> 99% memory usage in both two cases the script was checking it). Here >> a

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread azurIt
>Hello azur, > >On Mon, Sep 02, 2013 at 12:38:02PM +0200, azurIt wrote: >> >>Hi azur, >> >> >> >>here is the x86-only rollup of the series for 3.2. >> >> >> >>Thanks! >> >>Johannes >> >>--- >> >

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread azurIt
.kernel.org >Hello azur, > >On Mon, Sep 02, 2013 at 12:38:02PM +0200, azurIt wrote: >> >>Hi azur, >> >> >> >>here is the x86-only rollup of the series for 3.2. >> >> >> >>Thanks! >> >>Johannes >> >>--- >> &g

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread azurIt
>On Mon, Sep 02, 2013 at 12:38:02PM +0200, azurIt wrote: >> >>Hi azur, >> >> >> >>here is the x86-only rollup of the series for 3.2. >> >> >> >>Thanks! >> >>Johannes >> >>--- >> > >> > >

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-02 Thread azurIt
>>Hi azur, >> >>here is the x86-only rollup of the series for 3.2. >> >>Thanks! >>Johannes >>--- > > >Johannes, > >unfortunately, one problem arises: I have (again) cgroup which cannot be >deleted :( it's a user who had very high memory usage and was reaching his >limit very often. Do you need an

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-08-30 Thread azurIt
>Hi azur, > >here is the x86-only rollup of the series for 3.2. > >Thanks! >Johannes >--- Johannes, unfortunately, one problem arises: I have (again) cgroup which cannot be deleted :( it's a user who had very high memory usage and was reaching his limit very often. Do you need any info which i

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-08-09 Thread azurIt
>Hi azur, > >here is the x86-only rollup of the series for 3.2. > >Thanks! >Johannes Hi Johannes, i'm running kernel with this new patch for 1 day now without any problems! Will report back in few weeks or months or in case of any problems occures. Thank you! azur -- To unsubscribe from this

Re: [patch 3.2] memcg OOM robustness (x86 only)

2013-08-03 Thread azurIt
>azurIt, this is the combined backport for 3.2, x86 + generic bits + >debugging. It would be fantastic if you could give this another shot >once you get back from vacation. Thanks! > >Johannes Hi Johannes, is this still up to date? Thank you. azur

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-19 Thread azurIt
o wrote: >> > On Tue 16-07-13 11:35:44, Johannes Weiner wrote: >> > > On Mon, Jul 15, 2013 at 06:00:06PM +0200, Michal Hocko wrote: >> > > > On Mon 15-07-13 17:41:19, Michal Hocko wrote: >> > > > > On Sun 14-07-13 01:51:12, azurIt wrote: >> &g

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-14 Thread azurIt
> CC: "Michal Hocko" , linux-kernel@vger.kernel.org, > linux...@kvack.org, "cgroups mailinglist" , > "KAMEZAWA Hiroyuki" >On Fri, Jul 05, 2013 at 09:02:46PM +0200, azurIt wrote: >> >I looked at your debug messages but could not find anythin

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-13 Thread azurIt
, "cgroups mailinglist" , >> "KAMEZAWA Hiroyuki" , righi.and...@gmail.com >>On Wed 10-07-13 18:25:06, azurIt wrote: >>> >> Now i realized that i forgot to remove UID from that cgroup before >>> >> trying to remove it, so cgroup cannot be remove

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-13 Thread azurIt
> CC: "Johannes Weiner" , linux-kernel@vger.kernel.org, > linux...@kvack.org, "cgroups mailinglist" , > "KAMEZAWA Hiroyuki" , righi.and...@gmail.com >On Wed 10-07-13 18:25:06, azurIt wrote: >> >> Now i realized that i forgot to remove UID

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-10 Thread azurIt
>> Now i realized that i forgot to remove UID from that cgroup before >> trying to remove it, so cgroup cannot be removed anyway (we are using >> third party cgroup called cgroup-uid from Andrea Righi, which is able >> to associate all user's processes with target cgroup). Look here for >> cgroup-u

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-09 Thread azurIt
>On Mon 08-07-13 01:42:24, azurIt wrote: >> > CC: "Michal Hocko" , linux-kernel@vger.kernel.org, >> > linux...@kvack.org, "cgroups mailinglist" , >> > "KAMEZAWA Hiroyuki" >> >On Fri, Jul 05, 2013 at 09:02:46PM +0200, az

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-07 Thread azurIt
> CC: "Michal Hocko" , linux-kernel@vger.kernel.org, > linux...@kvack.org, "cgroups mailinglist" , > "KAMEZAWA Hiroyuki" >On Fri, Jul 05, 2013 at 09:02:46PM +0200, azurIt wrote: >> >I looked at your debug messages but could not find anythin

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-07-05 Thread azurIt
>I looked at your debug messages but could not find anything that would >hint at a deadlock. All tasks are stuck in the refrigerator, so I >assume you use the freezer cgroup and enabled it somehow? Yes, i'm really using freezer cgroup BUT i was checking if it's not doing problems - unfortunatel

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-06-28 Thread azurIt
>It's not a kernel thread that does it because all kernel-context >handle_mm_fault() are annotated properly, which means the task must be >userspace and, since tasks is empty, have exited before synchronizing. > >Can you try with the following patch on top? Michal and Johannes, i have some obser

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-06-24 Thread azurIt
>I would be really interesting to see what those tasks are blocked on. Ok, i got it! Problem occurs two times and it behaves differently each time, I was running kernel with that latest patch. 1.) It doesn't have impact on the whole server, only on one cgroup. Here are stacks: http://watchdog.

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-06-22 Thread azurIt
Michal, >> I'm unable to send you stacks or more info because problem is taking >> down the whole server for some time now (don't know what exactly >> caused it to start happening, maybe newer versions of 3.2.x). > >So you are not testing with the same kernel with just the old patch >replaced by

Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

2013-06-17 Thread azurIt
>Here we go. I hope I didn't screw anything (Johannes might double check) >because there were quite some changes in the area since 3.2. Nothing >earth shattering though. Please note that I have only compile tested >this. Also make sure you remove the previous patches you have from me. Hi Michal,

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-06-06 Thread azurIt
Hello Michal, nice to read you! :) Yes, i'm still on 3.2. Could you be so kind and try to backport it? Thank you very much! azur __ > From: "Michal Hocko" > To: azurIt > Date: 06.06.2013 18:04 > Pre

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-22 Thread azurIt
>I am not sure how much time I'll have for this today but just to make >sure we are on the same page, could you point me to the two patches you >have applied in the mean time? Here: http://watchdog.sk/lkml/patches2

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-22 Thread azurIt
>Unfortunately I am not able to reproduce this behavior even if I try >to hammer OOM like mad so I am afraid I cannot help you much without >further debugging patches. >I do realize that experimenting in your environment is a problem but I >do not many options left. Please do not use strace and rat

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-22 Thread azurIt
>Unfortunately I am not able to reproduce this behavior even if I try >to hammer OOM like mad so I am afraid I cannot help you much without >further debugging patches. >I do realize that experimenting in your environment is a problem but I >do not many options left. Please do not use strace and rat

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-10 Thread azurIt
>stuck in the ptrace code. But this happens _after_ the cgroup was freezed and i tried to strace one of it's processes (to see what's happening): Feb 8 01:29:46 server01 kernel: [ 1187.540672] grsec: From 178.40.250.111: process /usr/lib/apache2/mpm-itk/apache2(apache2:18211) attached to via

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread azurIt
> >I assume you have checked that the killed processes eventually die, >right? > When i killed them by hand, yes, they disappeared from process list (i saw it). I don't know if they really died when OOM killed them. >Well, I do not see anything suspicious during that time period >(timestamps t

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread azurIt
>Which means that the oom killer didn't try to kill any task more than >once which is good because it tells us that the killed task manages to >die before we trigger oom again. So this is definitely not a deadlock. >You are just hitting OOM very often. >$ grep "killed as a result of limit" kern2.lo

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread azurIt
>kernel log would be sufficient. Full kernel log from kernel with you newest patch: http://watchdog.sk/lkml/kern2.log >This limit is for top level groups, right? Those seem to children which >have 62MB charged - is that a limit for those children? It was the limit for parent cgroup and proce

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-08 Thread azurIt
> >Do you have logs from that time period? > >I have only glanced through the stacks and most of the threads are >waiting in the mem_cgroup_handle_oom (mostly from the page fault path >where we do not have other options than waiting) which suggests that >your memory limit is seriously underestimate

Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set

2013-02-07 Thread azurIt
l, wrote this e-mail and go to my lovely bed ;) __ > From: "Michal Hocko" > To: azurIt > Date: 06.02.2013 17:00 > Subject: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is > set >

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread azurIt
>5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I >mentioned in a follow up email. Here is the full patch: Here is the log where OOM, again, killed MySQL server [search for "(mysqld)"]: http://www.watchdog.sk/lkml/oom_mysqld6 azur

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread azurIt
>5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I >mentioned in a follow up email. Oh, it wasn't complete? I used it in my last test. Sorry, I'm a little confused by all those patches. Will try it tonight and report back.

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-02-05 Thread azurIt
>Sorry, to get back to this that late but I was busy as hell since the >beginning of the year. Thank you for your time! >Has the issue repeated since then? Yes, it's happening all the time but meanwhile i wrote a script which is monitoring the problem and killing freezed processes when it oc

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2013-01-25 Thread azurIt
Any news? Thnx! azur __ > From: "Michal Hocko" > To: azurIt > Date: 30.12.2012 12:08 > Subject: Re: [PATCH for 3.2.34] memcg: do not trigger OOM from > add_to_page_cache_locked > > CC: linu

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-29 Thread azurIt
>which suggests that the patch is incomplete and that I am blind :/ >mem_cgroup_cache_charge calls __mem_cgroup_try_charge for the page cache >and that one doesn't check GFP_MEMCG_NO_OOM. So you need the following >follow-up patch on top of the one you already have (which should catch >all the rema

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-24 Thread azurIt
>OK, good to hear and fingers crossed. I will try to get back to the >original problem and a better solution sometimes early next year when >all the things settle a bit. Btw, i noticed one more thing when problem is happening (=when any cgroup is stucked), i fogot to mention it before, sorry :(

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-24 Thread azurIt
>OK, good to hear and fingers crossed. I will try to get back to the >original problem and a better solution sometimes early next year when >all the things settle a bit. Michal, problem, unfortunately, happened again :( twice. When it happened first time (two days ago) i don't want to believe it

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-18 Thread azurIt
>It should mitigate the problem. The real fix shouldn't be that specific >(as per discussion in other thread). The chance this will get upstream >is not big and that means that it will not get to the stable tree >either. OOM is no longer killing processes outside target cgroups, so everything loo

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-17 Thread azurIt
>[Ohh, I am really an idiot. I screwed the first patch] >- bool oom = true; >+ bool oom = !(gfp_mask | GFP_MEMCG_NO_OOM); > >Which obviously doesn't work. It should read !(gfp_mask &GFP_MEMCG_NO_OOM). > No idea how I could have missed that. I am really sorry about that. :D no problem
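The operator mix-up described in this message can be reproduced outside the kernel. The following is only a minimal sketch: the flag value DEMO_MEMCG_NO_OOM and the function names are hypothetical stand-ins, not the definitions from the actual patch. It shows that OR-ing a nonzero flag into the mask always gives a nonzero result, so the negation is always false, while AND tests whether the flag bit is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit for illustration only; not the value from the patch. */
#define DEMO_MEMCG_NO_OOM 0x1000u

/* Buggy form: (mask | flag) is never zero, so this always returns false. */
static bool oom_allowed_buggy(unsigned int gfp_mask)
{
    return !(gfp_mask | DEMO_MEMCG_NO_OOM);
}

/* Corrected form: (mask & flag) is zero exactly when the flag bit is clear. */
static bool oom_allowed_fixed(unsigned int gfp_mask)
{
    return !(gfp_mask & DEMO_MEMCG_NO_OOM);
}

/* Exercises both forms with the flag clear and with the flag set. */
static void check_flag_logic(void)
{
    unsigned int plain = 0x10u;                      /* some unrelated bit */
    unsigned int no_oom = plain | DEMO_MEMCG_NO_OOM; /* caller opts out of OOM */

    assert(!oom_allowed_buggy(plain));   /* wrong: OOM should be allowed here */
    assert(!oom_allowed_buggy(no_oom));
    assert(oom_allowed_fixed(plain));    /* flag clear: OOM allowed */
    assert(!oom_allowed_fixed(no_oom));  /* flag set: OOM suppressed */
}
```

With the buggy form the OOM path would be disabled for every caller, regardless of the flags passed in, which matches the observation that the first patch "obviously doesn't work".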

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-16 Thread azurIt
>I would try to limit changes to minimum. So the original kernel you were >using + the first patch to prevent OOM from the write path + 2 debugging >patches. It didn't take off the whole system this time (but i was prepared to record a video of console ;) ), here it is: http://www.watchdog.sk/lk

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-10 Thread azurIt
>I would try to limit changes to minimum. So the original kernel you were >using + the first patch to prevent OOM from the write path + 2 debugging >patches. ok. >But was it at least related to the debugging from the patch or it was >rather a totally unrelated thing? I wasn't reading it much

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-10 Thread azurIt
>Hmm, this is _really_ surprising. The latest patch didn't add any new >logging actually. It just enahanced messages which were already printed >out previously + changed few functions to be not inlined so they show up >in the traces. So the only explanation is that the workload has changed >or the

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-09 Thread azurIt
>There are no other callers AFAICS so I am getting clueless. Maybe more >debugging will tell us something (the inlining has been reduced for thp >paths which can reduce performance in thp page fault heavy workloads but >this will give us better traces - I hope). Michal, this was printing so many

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-06 Thread azurIt
>Dohh. The very same stack mem_cgroup_newpage_charge called from the page >fault. The heavy inlining is not particularly helping here... So there >must be some other THP charge leaking out. >[/me is diving into the code again] > >* do_huge_pmd_anonymous_page falls back to handle_pte_fault >* do_hug

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-05 Thread azurIt
>OK, so the ENOMEM seems to be leaking from mem_cgroup_newpage_charge. >This can only happen if this was an atomic allocation request >(!__GFP_WAIT) or if oom is not allowed which is the case only for >transparent huge page allocation. >The first case can be excluded (in the clean 3.2 stable kernel

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-04 Thread azurIt
>The following should print the traces when we hand over ENOMEM to the >caller. It should catch all charge paths (migration is not covered but >that one is not important here). If we don't see any traces from here >and there is still global OOM striking then there must be something else >to trigger

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>The only strange thing I noticed is that some groups have 0 limit. Is >this intentional? >grep memory.limit_in_bytes cgroups | grep -v uid | sed 's@.*/@@' | sort | uniq >-c > 3 memory.limit_in_bytes:0 These are users who are not allowed to run anything. azur

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>Could you also post your complete containers configuration, maybe there >is something strange in there (basically grep . -r YOUR_CGROUP_MNT >except for tasks files which are of no use right now). Here it is: http://www.watchdog.sk/lkml/cgroups.gz

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>> Here is the full boot log: >> www.watchdog.sk/lkml/kern.log > >The log is not complete. Could you paste the comple dmesg output? Or >even better, do you have logs from the previous run? What is missing there? All kernel messages are logging into /var/log/kern.log (it's the same as dmesg), dme

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>DMA32 zone is usually fills up first 4G unless your HW remaps the rest >of the memory above 4G or you have a numa machine and the rest of the >memory is at other node. Could you post your memory map printed during >the boot? (e820: BIOS-provided physical RAM map: and following lines) Here is the

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>Anyway your system is under both global and local memory pressure. You >didn't see apache going down previously because it was probably the one >which was stuck and could be killed. >Anyway you need to setup your system more carefully. There is, also, an evidence that system has enough of memory

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread azurIt
>Anyway your system is under both global and local memory pressure. You >didn't see apache going down previously because it was probably the one >which was stuck and could be killed. >Anyway you need to setup your system more carefully. No, it wasn't, i'm 1000% sure (i was on SSH). Here is the me

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-29 Thread azurIt
s only from cgroup which is out of memory. Here is the log from syslog: http://www.watchdog.sk/lkml/oom_mysqld Maybe i should mention that MySQL server has its own cgroup (called 'mysql') but with no limits on any resources. azurIt

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-29 Thread azurIt
>Here we go with the patch for 3.2.34. Could you test with this one, >please? I installed kernel with this patch, will report back if problem occurs again OR in few weeks if everything will be ok. Thank you! azurIt

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread azurIt
>Here we go with the patch for 3.2.34. Could you test with this one, >please? Michal, regarding your conversation with Johannes Weiner, should i try this patch or not? azur

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-26 Thread azurIt
>This issue has been around for a while so frankly I don't think it's >urgent enough to rush things. Well, it's quite urgent at least for us :( i wasn't reported this so far cos i wasn't sure it's a kernel thing. I will be really happy and thankfull if fix for this can go to 3.2 in some near fu

Re: memory-cgroup bug

2012-11-25 Thread azurIt
>This is hackish but it should help you in this case. Kamezawa, what do >you think about that? Should we generalize this and prepare something >like mem_cgroup_cache_charge_locked which would add __GFP_NORETRY >automatically and use the function whenever we are in a locked context? >To be honest I

Re: memory-cgroup bug

2012-11-25 Thread azurIt
>> Thank you very much, i will install it ASAP (probably this night). > >Please don't. If my analysis is correct which I am almost 100% sure it >is then it would cause excessive logging. I am sorry I cannot come up >with something else in the mean time. Ok then. I will, meanwhile, try to contact

Re: memory-cgroup bug

2012-11-25 Thread azurIt
>Inlined at the end of the email. Please note I have compile tested >it. It might produce a lot of output. Thank you very much, i will install it ASAP (probably this night). >dmesg | grep "Out of memory" >doesn't tell anything, right? Only messages for other cgroups but not for the freezed on

Re: memory-cgroup bug

2012-11-25 Thread azurIt
>So there is a lot of attempts to allocate which fail, every second! Yes, as i said, the cgroup was taking 100% of (allocated) CPU core(s). Not sure if all processes were using CPU but _few_ of them (not only one) for sure.

Re: memory-cgroup bug

2012-11-24 Thread azurIt
>Could you take few snapshots over time? Here it is, now from different server, snapshot was taken every second for 10 minutes (hope it's enough): www.watchdog.sk/lkml/memcg-bug-2.tar.gz

Re: memory-cgroup bug

2012-11-23 Thread azurIt
>If you could instrument mem_cgroup_handle_oom with some printks (before >we take the memcg_oom_lock, before we schedule and into >mem_cgroup_out_of_memory) If you send me patch i can do it. I'm, unfortunately, not able to code it. >> It, luckily, happend again so i have more info. >> >> - t
