Michal Hocko wrote:
> Hmm, so you are creating a separate process (from the signal point of
> view) and I suspect it is one of those that holds the last reference to
> the mm_struct which is released here and it has tsk_oom_victim = F

Right.

> So we need a more robust test for the oom victim. Your suggestion is
> basically what I came up with originally [1] and which was deemed
> ineffective because we took the mmap_sem even for regular paths and
> Kirill was afraid this adds some unnecessary cycles to the exit path
> which is quite hot.
> 
> So I guess we have to do something else instead. We have to store the
> oom flag to the mm struct as well. Something like the patch below.

Yes, adding a new flag for this purpose will work.

Also, setting MMF_UNSTABLE flag between after sending SIGKILL and before
victim->mm becomes NULL and testing MMF_UNSTABLE at exit_mm() should work.

But I prefer simple revert + mmget()/mmput_async() approach at
http://lkml.kernel.org/r/201712062037.daf90168.svfqojfmoot...@i-love.sakura.ne.jp
 , for
my approach not only saves lines but also fixes unexpected change for nommu at
http://lkml.kernel.org/r/201711091949.bdb73475.oshfomqtlfo...@i-love.sakura.ne.jp
 .
Also, if we replace asynchronous OOM reaping by the OOM reaper kernel thread 
with
synchronous OOM reaping by the OOM killer, we can close MMF_OOM_SKIP race window
because it is guaranteed that __oom_reap_task_mm() is called before __mmput() is
called.

Reply via email to