Re: memory-cgroup bug

azurIt Thu, 22 Nov 2012 22:13:55 -0800

______________________________________________________________
> Od: "Kamezawa Hiroyuki" <kamezawa.hir...@jp.fujitsu.com>
> Komu: azurIt <azu...@pobox.sk>
> Dátum: 22.11.2012 01:27
> Predmet: Re: memory-cgroup bug
>
> CC: linux-kernel@vger.kernel.org, "linux-mm" <linux...@kvack.org>
>(2012/11/22 4:02), azurIt wrote:
>> Hi,
>>
>> i'm using memory cgroup for limiting our users and having a really strange 
>> problem when a cgroup gets out of its memory limit. It's very strange 
>> because it happens only sometimes (about once per week on random user), out 
>> of memory is usually handled ok. This happens when problem occures:
>>   - no new processes can be started for this cgroup
>>   - current processes are freezed and taking 100% of CPU
>>   - when i try to 'strace' any of current processes, the whole strace 
>> freezes until process is killed (strace cannot be terminated by CTRL-c)
>>   - problem can be resolved by raising memory limit for cgroup or killing of 
>> few processes inside cgroup so some memory is freed
>>
>> I also garbbed the content of /proc/<pid>/stack of freezed process:
>> [<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
>> [<ffffffff8110b5ab>] T.1146+0x5ab/0x5c0
>> [<ffffffff8110ba56>] mem_cgroup_charge_common+0x56/0xa0
>> [<ffffffff8110bae5>] mem_cgroup_newpage_charge+0x45/0x50
>> [<ffffffff810ec54e>] do_wp_page+0x14e/0x800
>> [<ffffffff810eda34>] handle_pte_fault+0x264/0x940
>> [<ffffffff810ee248>] handle_mm_fault+0x138/0x260
>> [<ffffffff810270ed>] do_page_fault+0x13d/0x460
>> [<ffffffff815b53ff>] page_fault+0x1f/0x30
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> I'm currently using kernel 3.2.34 but i'm having this problem since 2.6.32.
>>
>> Any ideas? Thnx.
>>
>
>Under OOM in memcg, only one process is allowed to work. Because processes 
>tends to use up
>CPU at memory shortage. other processes are freezed.
>
>
>Then, the problem here is the one process which uses CPU. IIUC, 'freezed' 
>threads are
>in sleep and never use CPU. It's expected oom-killer or memory-reclaim can 
>solve the probelm.
>
>What is your memcg's memory.oom_control value ?




oom_kill_disable 0



>and process's oom_adj values ? (/proc/<pid>/oom_adj, /proc/<pid>/oom_score_adj)


when i look to a random user PID (Apache web server):
oom_adj = 0
oom_score_adj = 0

I can look also to the data of 'freezed' proces if you need it but i will have 
to wait until problem occurs again.

The main problem is that when this problem happens, it's NOT resolved 
automatically by kernel/OOM and user of cgroup, where it happend, has 
non-working services until i kill his processes by hand. I'm sure that all 
'freezed' processes are taking very much CPU because also server load goes 
really high - next time i will make a screenshot of htop. I really wonder why 
OOM is __sometimes__ not resolving this (it's usually is, only sometimes not).


Thank you!

azur
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: memory-cgroup bug

Reply via email to