On Thu, Oct 11, 2012 at 10:57:39AM +0200, Michal Hocko wrote:
> oom_badness takes totalpages argument which says how many pages are
> available and it uses it as a base for the score calculation. The value
> is calculated by mem_cgroup_get_limit which considers both limit and
> total_swap_pages (resp. memsw portion of it).
>
> This is usually correct but since fe35004f (mm: avoid swapping out
> with swappiness==0) we do not swap when swappiness is 0 which means
> that we cannot really use up all the totalpages pages. This in turn
> confuses oom score calculation if the memcg limit is much smaller than
> the available swap because the used memory (capped by the limit) is
> negligible comparing to totalpages so the resulting score is too small
> if adj!=0 (typically task with CAP_SYS_ADMIN or non zero oom_score_adj).
> A wrong process might be selected as result.
>
> The same issue exists for the global oom killer as well but it is not
> that problematic as the amount of the RAM is usually much bigger than
> the swap space.
>
> The problem can be worked around by checking mem_cgroup_swappiness==0
> and not considering swap at all in such a case.
>
> Signed-off-by: Michal Hocko <mho...@suse.cz>
> Acked-by: David Rientjes <rient...@google.com>
> Cc: stable [3.5+]
I also don't think it's hackish, the limit depends very much on whether
reclaim can swap, so it's natural that swappiness shows up here.

Acked-by: Johannes Weiner <han...@cmpxchg.org>
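
For readers following the arithmetic, here is a rough userspace sketch of the effect described in the changelog. It is not the kernel code: model_totalpages() and model_badness() are hypothetical names, and the scoring formula (used pages plus adj scaled by totalpages/1000, clamped to a minimum of 1) is an assumption about how the 3.5-era oom_badness() normalizes adj. The -30 adj in the note below stands in for the bonus a CAP_SYS_ADMIN task is assumed to receive.

```c
#include <stdint.h>

/*
 * Hypothetical model of the limit computation discussed in the thread.
 * All quantities are in pages. With swappiness == 0 swap is ignored,
 * which is the behaviour the patch proposes; any other value adds swap
 * and caps the sum at the memsw limit.
 */
static uint64_t model_totalpages(uint64_t limit_pages, uint64_t swap_pages,
                                 uint64_t memsw_pages, int swappiness)
{
    uint64_t total = limit_pages;

    if (swappiness != 0) {
        total += swap_pages;
        if (total > memsw_pages)
            total = memsw_pages;
    }
    return total;
}

/*
 * Assumed scoring model: adj is expressed in thousandths of totalpages
 * and added to the task's usage; the result never drops below 1.
 */
static long model_badness(long used_pages, long adj, uint64_t totalpages)
{
    long points = used_pages + adj * (long)(totalpages / 1000);

    return points > 0 ? points : 1;
}
```

Take a memcg limit of 25600 pages, 2097152 pages of swap, no memsw limit (UINT64_MAX), and a task using 20000 pages with adj = -30. If swap is counted, totalpages is 2122752 and the model score collapses to 1, even though the task holds most of the group's memory. With swap left out (swappiness == 0 after the patch), totalpages is 25600 and the score is 19250, so the task is ranked by what it actually uses.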